udbj/xtreme-multilabel-tweets

xtreme-multilabel-tweets

Classification of tweets by 10,000 users

  • The dataset can be downloaded from https://www.dropbox.com/s/kn2dmuczse0ysek/train_tweets.txt.zip?dl=0
  • Like most extreme multiclass/multilabel classification problems, the dataset has a heavily skewed label distribution, and the final validation accuracy is low
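
The skew can be checked before training by counting tweets per user. The following is a minimal sketch, not code from this repository. It assumes each line of train_tweets.txt has the form `user_id<TAB>tweet text`, which is an assumption about the file layout, and `label_skew` is a hypothetical helper name.

```python
from collections import Counter

def label_skew(lines, top_k=10):
    """Return (number of labels, share of examples held by the top_k most frequent labels)."""
    counts = Counter()
    for line in lines:
        # Assumed layout: "<user_id>\t<tweet text>"; lines without a tab are skipped.
        user_id, sep, _text = line.rstrip("\n").partition("\t")
        if sep:
            counts[user_id] += 1
    total = sum(counts.values())
    if total == 0:
        return 0, 0.0
    top = sum(n for _, n in counts.most_common(top_k))
    return len(counts), top / total

# Tiny in-memory sample standing in for the real file.
sample = ["u1\thello", "u1\tworld", "u1\tagain", "u2\thi"]
print(label_skew(sample, top_k=1))
```

On the real data the same function can be called with an open file handle, since it only iterates over lines. A top-k share far above k divided by the number of labels indicates a heavy head.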

FastText

  • Uses the fasttext library by Facebook for classification of BERT-encoded tweets
  • Achieves 16.33% accuracy on the validation set
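
The fastText supervised mode reads one example per line, with each label prefixed by `__label__`. The sketch below shows one way to prepare such lines and train a model. It is an illustration under assumptions, not the pipeline used here: how the tweets are BERT-encoded is not described, so the sketch assumes they are available as whitespace-separated tokens, and the helper names and hyperparameters are hypothetical.

```python
def to_fasttext_line(user_id, text):
    """Format one example in fastText's supervised input format."""
    # fastText expects one example per line, so inner newlines and runs of spaces are collapsed.
    clean = " ".join(text.split())
    return f"__label__{user_id} {clean}"

def train_and_evaluate(train_path, valid_path):
    """Train a supervised model and return precision@1 on the validation file."""
    import fasttext  # imported lazily so the formatting helper works without the library
    # Hyperparameters are illustrative, not the values used in this repository.
    model = fasttext.train_supervised(input=train_path, epoch=25, lr=0.5, wordNgrams=2)
    n_examples, precision_at_1, recall_at_1 = model.test(valid_path)
    return precision_at_1

print(to_fasttext_line("8746", "just landed\nin Melbourne"))
```

When every tweet has exactly one author label, the precision@1 returned by `model.test` coincides with plain accuracy.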

FastXML

  • Uses the Refefer/fastxml library to run the PFastreXML text classifier on BERT-encoded tweets
  • Achieves 22.00% accuracy on the validation set
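
Tree-based extreme classifiers typically return a ranked set of label scores per example, so accuracy here can be read as top-1 accuracy. The sketch below shows that computation on plain dictionaries. It is a hedged illustration: the `top1_accuracy` helper and the score format are assumptions, not the output format of the fastxml library.

```python
def top1_accuracy(scores, true_labels):
    """Fraction of examples whose highest-scoring label equals the true label."""
    if not true_labels:
        return 0.0
    correct = 0
    for label_scores, truth in zip(scores, true_labels):
        # label_scores maps label -> score; an empty dict counts as a miss.
        if label_scores and max(label_scores, key=label_scores.get) == truth:
            correct += 1
    return correct / len(true_labels)

# Three hypothetical validation examples: one hit, one miss, one with no prediction.
scores = [{"u1": 0.7, "u2": 0.2}, {"u1": 0.4, "u3": 0.5}, {}]
print(top1_accuracy(scores, ["u1", "u1", "u2"]))
```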
