A movie review dataset is used, which can be downloaded from http://ai.stanford.edu/~amaas/data/sentiment/
- Training set: Balanced set of 10,000 positive reviews and 10,000 negative reviews
- Validation set: Balanced set of 2,500 positive reviews and 2,500 negative reviews
- GloVe pre-trained word embeddings were used as features; they can be downloaded from https://nlp.stanford.edu/projects/glove/
  After downloading the 'glove.6B.zip' file, place 'glove.6B.300d.txt' in the same directory as the IPython notebook to reproduce the results shown there.
- An LSTM with 60 neurons was used. To reduce overfitting, a dropout layer was added.
- Test set: Balanced set of 12,500 positive reviews and 12,500 negative reviews
- Test Set Accuracy: 0.85476
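
As a rough illustration of how the GloVe file can be turned into features, the sketch below parses the GloVe text format (one token per line, followed by its vector components) and builds an embedding matrix indexed by word id. It is not taken from the notebook: the helper names `load_glove` and `build_embedding_matrix`, the `word_index` mapping, and the convention that row 0 is reserved for padding are all assumptions made for this example. A tiny 3-dimensional in-memory sample stands in for 'glove.6B.300d.txt' so the snippet runs without the download.

```python
import io


def load_glove(handle, dim=300):
    """Parse GloVe text format: a token followed by `dim` space-separated floats per line."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(word_index, vectors, dim=300):
    """Return a list of rows; row i is the vector for the word with id i.

    Row 0 (assumed to be the padding id) and words missing from GloVe stay all-zero.
    """
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None:
            matrix[idx] = vec
    return matrix


# Toy stand-in for the real file. With the actual download one would use:
#   with open("glove.6B.300d.txt", encoding="utf-8") as f:
#       vectors = load_glove(f, dim=300)
sample = io.StringIO("good 0.1 0.2 0.3\nbad -0.1 -0.2 -0.3\n")
vectors = load_glove(sample, dim=3)

# Hypothetical vocabulary mapping words to integer ids (starting at 1).
word_index = {"good": 1, "bad": 2, "unseen": 3}
matrix = build_embedding_matrix(word_index, vectors, dim=3)
```

A matrix built this way would typically be used to initialise the embedding layer that feeds the LSTM; whether those weights are then frozen or fine-tuned is a design choice this README does not specify.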