Datasets

Nils Feldhus edited this page Jul 1, 2021 · 7 revisions

SST-2

SST-2 is a sentiment analysis dataset with 2 classes and is part of the GLUE benchmark. No labels are available for the test set.

| Name | 🤗 | lgxa | lig | lime | occ | svs |
|---|---|---|---|---|---|---|
| ALBERT (albert) | textattack/albert-base-v2-SST-2 | | | | | |
| BERT (bert) | textattack/bert-base-uncased-SST-2 | | | | | |
| ELECTRA (electra) | howey/electra-base-sst2 | | | | | |
| RoBERTa (roberta) | textattack/roberta-base-SST-2 | | | | | |
| XLNet (xlnet) | textattack/xlnet-base-cased-SST-2 | | | | | |
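
The 🤗 column holds Hugging Face Hub identifiers, so a checkpoint can be resolved from its short name and loaded with the `transformers` library. The following is a minimal sketch, not this project's own loading code: the helper names `resolve_checkpoint` and `predict_class` are hypothetical, and it assumes `transformers` and `torch` are installed and that the checkpoints are still published under these identifiers.

```python
# Short names and Hub identifiers copied from the SST-2 table.
SST2_CHECKPOINTS = {
    "albert": "textattack/albert-base-v2-SST-2",
    "bert": "textattack/bert-base-uncased-SST-2",
    "electra": "howey/electra-base-sst2",
    "roberta": "textattack/roberta-base-SST-2",
    "xlnet": "textattack/xlnet-base-cased-SST-2",
}


def resolve_checkpoint(short_name: str) -> str:
    """Return the Hub identifier for a short model name (hypothetical helper)."""
    try:
        return SST2_CHECKPOINTS[short_name.lower()]
    except KeyError:
        known = ", ".join(sorted(SST2_CHECKPOINTS))
        raise ValueError(
            f"unknown model {short_name!r}; expected one of: {known}"
        ) from None


def predict_class(short_name: str, text: str) -> int:
    """Download the checkpoint and return the predicted class index for `text`."""
    # Imported lazily so the lookup above works without transformers/torch.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    model_id = resolve_checkpoint(short_name)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Index of the highest-scoring class; its meaning depends on the
    # checkpoint's own label mapping (see model.config.id2label).
    return int(logits.argmax(dim=-1).item())
```

Calling `predict_class("bert", "A charming and often affecting journey.")` downloads the BERT checkpoint on first use and returns a class index. Check the checkpoint's `config.id2label` before interpreting that index as positive or negative.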

QQP

QQP is a paraphrase identification dataset with 2 classes and 390965 examples. It is part of the GLUE benchmark.

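| Name | 🤗 | lgxa | lig | lime | occ | svs |
|---|---|---|---|---|---|---|
| ALBERT | textattack/albert-base-v2-QQP | | | | | |
| BERT | textattack/bert-base-uncased-QQP | | | | | |
| ELECTRA | howey/electra-base-qqp | | | | | |
| XLNet | textattack/xlnet-base-cased-QQP | | | | | |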
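
Because QQP is a paraphrase task, each input is a pair of questions rather than a single sentence. The sketch below shows one way to encode such a pair for the checkpoints listed for QQP. The function names `encode_question_pair` and `predict_pair_class` are hypothetical, and it assumes `transformers` and `torch` are installed.

```python
# Hub identifiers copied from the QQP table.
QQP_CHECKPOINTS = {
    "ALBERT": "textattack/albert-base-v2-QQP",
    "BERT": "textattack/bert-base-uncased-QQP",
    "ELECTRA": "howey/electra-base-qqp",
    "XLNet": "textattack/xlnet-base-cased-QQP",
}


def encode_question_pair(tokenizer, question1: str, question2: str):
    """Encode two questions as a single sentence-pair input."""
    # Passing two texts lets a Hugging Face tokenizer build one sequence
    # with the special separator tokens its model expects.
    return tokenizer(question1, question2, truncation=True, return_tensors="pt")


def predict_pair_class(model_id: str, question1: str, question2: str) -> int:
    """Download `model_id` and return the predicted class index for the pair."""
    # Imported lazily so the helpers above work without transformers/torch.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = encode_question_pair(tokenizer, question1, question2)
    with torch.no_grad():
        logits = model(**inputs).logits
    return int(logits.argmax(dim=-1).item())
```

For example, `predict_pair_class(QQP_CHECKPOINTS["BERT"], "How do I learn Python?", "What is the best way to learn Python?")` returns a class index. Which index means "duplicate" depends on the checkpoint's label mapping.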

TREC

TREC is a question classification dataset with 6 classes.

| Name | 🤗 | Tested |
|---|---|---|
| BERT | aychang/bert-base-cased-trec-coarse | |
| DistilBERT | aychang/distilbert-base-cased-trec-coarse | |