The goal of this code is to classify Amazon reviews as positive or negative.
The outline is as follows:
- some basic descriptive statistics
- data cleaning: deleting punctuation and stopwords
- generation of word clouds
- tokenization of the reviews for sentiment analysis
- application of three different models and comparison of their scores:
  - Naive Bayes model
  - logistic regression model
  - gradient boosting model, with hyperparameter optimization via grid search
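
As a rough illustration of the cleaning and tokenization steps listed above, here is a minimal sketch using only the Python standard library. The names `STOPWORDS`, `clean_review`, and `tokenize` are hypothetical and chosen for this example, and the stop-word set is deliberately tiny. The actual notebook may rely on a library-provided stop-word list and tokenizer instead.

```python
import string
from typing import List

# Hypothetical, deliberately tiny stop-word set (an assumption for this sketch);
# a real run would normally use a fuller list from an NLP library.
STOPWORDS = {"a", "an", "and", "the", "is", "it", "this", "to", "of", "i", "was"}

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_review(text: str) -> str:
    """Lowercase the text, strip ASCII punctuation, and drop stop words."""
    no_punct = text.lower().translate(_PUNCT_TABLE)
    kept = [word for word in no_punct.split() if word not in STOPWORDS]
    return " ".join(kept)


def tokenize(text: str) -> List[str]:
    """Clean a review, then split it into whitespace-delimited tokens."""
    return clean_review(text).split()


print(tokenize("This is a great product, and it works!"))
# → ['great', 'product', 'works']
```

The resulting token lists are the kind of input that a bag-of-words vectorizer would turn into features for the three classifiers.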