Himanshu-Pandey-04/Twitter_Sentiment_Analysis

Sentiment Analysis on Twitter Tweets



Sentiment analysis is performed on several datasets after exploratory data analysis and data preprocessing, with each of a variety of machine learning techniques applied separately.

15 implemented ML Algorithms :

  1. Logistic Regression
    • Newton CG
    • SAG
    • SAGA
    • LBFGS
  2. Decision Tree Classifier
  3. Support Vector Machines
    • Linear
    • Poly
    • RBF
    • Sigmoid
  4. Majority Voting Ensemble
  5. Extreme Learning Machines
    • Tanh
    • SinSQ
    • Tribas
    • Hardlim
  6. Artificial Neural Network (Multi-Layer Perceptron, trained with Gradient Descent)
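
As an illustration of item 4, here is a minimal sketch of hard (majority) voting over the label predictions of several already-trained classifiers. It is not taken from this repository: the function name `majority_vote` and the tie-breaking rule (smallest label wins) are assumptions made for the example. In scikit-learn the same idea is available as `VotingClassifier(voting="hard")`.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label predictions by hard majority voting.

    `predictions` is a list with one entry per model; each entry is a list
    of predicted labels (for example -1, 0, 1), one per sample.
    Hypothetical helper, not the repository's own implementation.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("every model must predict the same number of samples")

    voted = []
    for labels in zip(*predictions):  # labels given to one sample by all models
        counts = Counter(labels)
        top = max(counts.values())
        # Assumed tie-break: pick the smallest label among the most frequent ones.
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


model_outputs = [
    [1, 0, -1],   # e.g. a logistic regression model
    [1, -1, -1],  # e.g. an SVM
    [0, -1, 1],   # e.g. a decision tree
]
combined = majority_vote(model_outputs)
```

With an odd number of models and three classes, ties are still possible (one vote per class), which is why the sketch fixes a deterministic tie-break.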



Part 1 : Data Preprocessing

  1. Exploratory Data Analysis
  2. Data Preprocessing
  3. Cleaning
  4. Lemmatization

>>> Run Text_Preprocessing_MP_Hybrid.py to preprocess a dataset




Flow of Control

  1. Sentence Segmentation
  2. Word Tokenization
  3. Runs of the same consecutive character capped at 2 repetitions
  4. Spelling Corrections
  5. Removal of #Hashtags, @Mentions, http:// URLs, etc (Noise 1)
  6. Removal of Special Unicode Characters (Noise 2)
  7. Chat Abbreviations conversions (Noise 3)
  8. Removal of Punctuations except `'` (Noise 4)
  9. Stop Words Removal (Noise 5)
  10. Parts of Speech Tagging
  11. Stemming & Lemmatization
  12. Whitespace Removal
  13. Chunking



Part 2 : Machine Learning Models Training


Datasets

Value Counts

  • -1 : 15236
  • 0 : 9465
  • 1 : 15299

Value Counts

  • -1 : 1117
  • 0 : 1570
  • 1 : 1186

Value Counts

  • 0 : 29720
  • 1 : 2242

Value Counts

  • -1 : 35510
  • 0 : 55213
  • 1 : 72250
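
The value counts above can be turned into totals and class shares with a few lines of arithmetic. This is only a sketch: the dataset names did not survive in this listing, so the keys `dataset_1` to `dataset_4` are placeholders in listing order, and `class_shares` is a hypothetical helper.

```python
# Value counts copied from the four listings above, in order.
# Keys are placeholders; the actual dataset names are not given here.
value_counts = {
    "dataset_1": {-1: 15236, 0: 9465, 1: 15299},
    "dataset_2": {-1: 1117, 0: 1570, 1: 1186},
    "dataset_3": {0: 29720, 1: 2242},
    "dataset_4": {-1: 35510, 0: 55213, 1: 72250},
}


def class_shares(counts):
    """Return the fraction of samples carrying each label."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


totals = {name: sum(counts.values()) for name, counts in value_counts.items()}
shares = {name: class_shares(counts) for name, counts in value_counts.items()}

for name in value_counts:
    summary = ", ".join(
        f"{label}: {share:.1%}" for label, share in shares[name].items()
    )
    print(f"{name} (n={totals[name]}): {summary}")
```

The third listing is the only two-class one and is heavily skewed towards label 0, which is worth keeping in mind when reading accuracy figures for it.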
