I tried replicating the zero-shot learning results on CLS, but my results don't match those from the paper. Since the script for predicting labels with LASER doesn't seem to be part of the Multifit repository, I trained LASER on the CLS dataset (only the en and de books for now) by adapting the MLDoc script from the LASER repo to CLS. My fork of LASER with these adjustments is [here](https://github.com/blazejdolicki/LASER). For the time being I have only tested on German books. After some hyperparameter tuning on the English training set, my best setup reaches 82.25% accuracy, compared to 84.15% from the Multifit paper. My hyperparams are:
```
n_epochs=200
lr=0.001
wd=0.0
nhid="10 8"
drop=0.2
seed=1
bsize=12
```
and I'm using the last 10% of the test set as validation.
When I changed them to be closer to Multifit's (n_epochs=8, wd=0.001, bsize=18), accuracy dropped to around 60%.
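For reference, here is a minimal sketch of the classifier head those hyperparameters describe (nhid="10 8", drop=0.2, bsize=12), assuming 1024-dimensional LASER sentence embeddings as input and binary CLS sentiment labels; the function name and the stand-in batch are mine, not from either repo:

```python
import torch
import torch.nn as nn

def build_classifier(emb_dim=1024, nhid=(10, 8), drop=0.2, n_classes=2):
    """Small MLP head on top of precomputed LASER embeddings.

    emb_dim=1024 matches LASER sentence embeddings; n_classes=2 assumes
    the binary sentiment labels of the CLS books task.
    """
    layers = []
    prev = emb_dim
    for h in nhid:
        layers += [nn.Linear(prev, h), nn.ReLU(), nn.Dropout(drop)]
        prev = h
    layers.append(nn.Linear(prev, n_classes))
    return nn.Sequential(*layers)

model = build_classifier()
# lr=0.001, wd=0.0 from the hyperparameters above
opt = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.0)

# Stand-in batch of 12 LASER embeddings (bsize=12)
x = torch.randn(12, 1024)
logits = model(x)
```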
Afterwards, I used the best LASER classifier (82.25% accuracy, trained on the English training set) to predict labels for the German books. I then copied the test, training and unsupervised sets in the Multifit repo from the de-books folder into de-books-laser and replaced the ground-truth labels in the training set with the pseudolabels. Finally, I trained the Multifit classifier on those pseudolabels; while my validation accuracy isn't great, it is at least similar, but my test-set accuracy is as low as 70% (compared to 89.60% from the paper and here), as you can see in the attached logs.
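The copy-and-relabel step I describe above can be sketched as follows. This is my own illustration, not a Multifit script; the CSV layout (label in the first column, text after it, no header) and the file name `train.csv` are assumptions to adapt to the actual data format:

```python
import csv
import shutil
from pathlib import Path

def write_pseudolabels(src_dir, dst_dir, pseudolabels, train_file="train.csv"):
    """Copy a CLS language folder (e.g. de-books -> de-books-laser) and
    overwrite the training labels with LASER-predicted pseudolabels.

    Test and unsupervised sets are copied unchanged; only the first
    column of the training CSV is replaced.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    shutil.copytree(src_dir, dst_dir)  # dst_dir must not exist yet
    rows = list(csv.reader((dst_dir / train_file).open()))
    assert len(rows) == len(pseudolabels), "one pseudolabel per training row"
    rows = [[str(lab)] + row[1:] for row, lab in zip(rows, pseudolabels)]
    with (dst_dir / train_file).open("w", newline="") as f:
        csv.writer(f).writerows(rows)
```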
Multifit CLS zero shot terrible results 15.04.2020.txt
I did expect some drop due to the issue explained in #63, but such a big difference shows that the unsupervised set size can't be the only factor degrading the results. Other possible reasons for the drop in performance that come to mind are:
- Did I use different hyperparameters for training LASER than you did when predicting the pseudolabels?
- Did I use a different train-dev split for training LASER than you did when predicting the pseudolabels?
- Was your script loading the LASER model with the fastai library and training the classifier with it, instead of PyTorch?
My fork of multifit is here; I'm using the ulmfit-original-scripts branch.
I would really appreciate a reply :)