xtreme-multilabel-tweets

Classification of tweets by 10,000 users

Dataset can be downloaded from https://www.dropbox.com/s/kn2dmuczse0ysek/train_tweets.txt.zip?dl=0
Like most extreme multiclass/multilabel classification problems, the dataset has a heavily skewed label distribution and the final validation accuracy is low
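The skew mentioned above can be quantified before any model is trained. The following is a minimal sketch, assuming each example carries exactly one label (for instance a user ID); the function name `label_skew` and the toy labels are hypothetical and not taken from this repository.

```python
from collections import Counter


def label_skew(labels):
    """Summarize how unevenly examples are spread across labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    top_label, top_count = counts.most_common(1)[0]
    return {
        "num_labels": len(counts),
        "majority_label": top_label,
        # Share of examples owned by the most frequent label; this is also
        # the accuracy of a baseline that always predicts that label.
        "majority_share": top_count / total,
        # Labels seen only once are effectively unlearnable from this split.
        "singleton_labels": sum(1 for c in counts.values() if c == 1),
    }


# Toy labels standing in for user IDs (hypothetical data, not from the dataset).
toy_labels = ["u1"] * 6 + ["u2"] * 2 + ["u3", "u4"]
summary = label_skew(toy_labels)
print(summary)
```

Comparing `majority_share` with a model's validation accuracy gives a quick sense of how much the model improves on the always-predict-the-majority baseline.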

Uses Facebook's fastText library to classify BERT-encoded tweets
Achieves an accuracy of 16.33% on the validation set
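As a rough illustration of the fastText side, the sketch below writes examples in the `__label__` format that fastText's supervised mode reads, then trains and scores a model. It is not this repository's pipeline: it assumes raw tweet text and one label per tweet, and it does not show how the BERT encodings are fed in. The helper names and file paths are hypothetical.

```python
def to_fasttext_line(label, text):
    """Format one example as a fastText supervised-training line."""
    # fastText reads one example per line, so collapse tabs and newlines
    # inside the tweet into single spaces.
    clean = " ".join(text.split())
    return f"__label__{label} {clean}"


def write_fasttext_file(examples, path):
    """Write (label, text) pairs to `path`, one example per line."""
    with open(path, "w", encoding="utf-8") as handle:
        for label, text in examples:
            handle.write(to_fasttext_line(label, text) + "\n")


def train_and_score(train_path, valid_path):
    """Train a supervised model and return precision@1 on the validation file."""
    import fasttext  # imported lazily so the helpers above work without it

    model = fasttext.train_supervised(input=train_path)
    # model.test returns (number of examples, precision@1, recall@1).
    _, precision_at_1, _ = model.test(valid_path)
    return precision_at_1
```

With exactly one label per example, precision@1 is the same quantity as top-1 accuracy, so the value returned by `train_and_score` is comparable to an accuracy figure.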

Uses the Refefer/fastxml library to run the PFastreXML text classifier on BERT-encoded tweets
Achieves an accuracy of 22.00% on the validation set
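The accuracy figures reported here can be read as top-1 accuracy: the fraction of validation tweets whose highest-ranked predicted label matches the true label. That reading is an assumption, as is the helper below; it is a minimal sketch with hypothetical names and toy data, not the evaluation code used in this repository.

```python
def top1_accuracy(true_labels, predicted_labels):
    """Fraction of examples whose predicted label equals the true label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true and predicted label lists differ in length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty validation set")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy validation set (hypothetical): one of four predictions is correct.
y_true = ["u1", "u2", "u3", "u4"]
y_pred = ["u1", "u1", "u1", "u1"]
accuracy = top1_accuracy(y_true, y_pred)
print(f"{accuracy:.2%}")
```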
