- TF-IDF for Reuters Articles in XML
  - Extracted titles and paragraphs from Reuters articles in XML format using ElementTree
  - Tokenized and stemmed the text with NLTK, and computed TF-IDF for the most common words using TfidfVectorizer
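
The extraction and weighting steps for this project could look roughly like the sketch below. It is not the repository's code: the element names (`newsitem`, `title`, `p`) are an assumed layout, stemming is left out, and TF-IDF is computed by hand with the standard library instead of TfidfVectorizer. The idf formula follows scikit-learn's documented smoothed default, without the L2 normalization that TfidfVectorizer also applies.

```python
import math
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical article layout; the real Reuters schema may differ.
SAMPLE_XML = """<newsitem>
  <title>Oil prices rise</title>
  <text>
    <p>Oil prices rose on Monday.</p>
    <p>Traders expect prices to rise further.</p>
  </text>
</newsitem>"""


def extract_text(xml_string):
    """Return the title and all <p> paragraphs joined into one string."""
    root = ET.fromstring(xml_string)
    title = root.findtext("title", default="")
    paragraphs = [p.text or "" for p in root.iter("p")]
    return " ".join([title] + paragraphs)


def tokenize(text):
    """Lowercase and keep runs of letters (stemming omitted in this sketch)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(docs):
    """Raw term count times smoothed idf = ln((1 + n) / (1 + df)) + 1."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return scores


docs = [extract_text(SAMPLE_XML), "Stocks fell as oil prices rose."]
weights = tfidf(docs)
# Three highest-weighted terms of the first document.
top_terms = sorted(weights[0].items(), key=lambda kv: -kv[1])[:3]
print(top_terms)
```

With a real corpus, `extract_text` would be applied to each article file, and a stemmer such as NLTK's `PorterStemmer` could be applied to the output of `tokenize`.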
- Sentiment Analysis with Naive Bayes using PySpark
  - Cleaned and transformed the data, and estimated term frequency (TF) using PySpark RDDs
  - Built a Naive Bayes model for sentiment analysis that achieved an accuracy of 82.5%
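
As an illustration of how a Naive Bayes classifier can be built from term frequencies, here is a minimal pure-Python sketch of a multinomial model with Laplace smoothing. It is a stand-in, not the project's PySpark code: in PySpark the counting step would typically be expressed with RDD operations such as `flatMap` and `reduceByKey`. The toy data, function names, and the smoothing parameter `alpha` are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def train_nb(labeled_docs, alpha=1.0):
    """Fit a multinomial Naive Bayes model from (tokens, label) pairs."""
    class_counts = Counter()
    term_counts = defaultdict(Counter)  # label -> term frequencies
    vocab = set()
    for tokens, label in labeled_docs:
        class_counts[label] += 1
        term_counts[label].update(tokens)
        vocab.update(tokens)

    total_docs = sum(class_counts.values())
    model = {}
    for label in class_counts:
        # Laplace-smoothed denominator shared by every term of this class.
        denom = sum(term_counts[label].values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(class_counts[label] / total_docs),
            "log_likelihood": {
                t: math.log((term_counts[label][t] + alpha) / denom) for t in vocab
            },
        }
    return model


def predict(model, tokens):
    """Return the label with the highest log posterior; unseen terms are skipped."""
    def score(label):
        params = model[label]
        return params["log_prior"] + sum(
            params["log_likelihood"][t]
            for t in tokens
            if t in params["log_likelihood"]
        )
    return max(model, key=score)


# Toy training set, invented for illustration only.
train = [
    (["great", "movie"], "pos"),
    (["loved", "it"], "pos"),
    (["terrible", "movie"], "neg"),
    (["hated", "it"], "neg"),
]
model = train_nb(train)
print(predict(model, ["great", "film"]))
print(predict(model, ["terrible"]))
```

Accuracy on a held-out set would then be the fraction of documents for which `predict` returns the true label.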
- Sentiment Analysis of Tweets
  - Parsed and stemmed tweet text, and computed term frequency (TF) for the most common words
  - Classified tweet sentiment using regularized logistic regression, LDA, and KNN
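
To make the first of these classifiers concrete, the sketch below trains an L2-regularized logistic regression on term-frequency vectors with plain gradient descent. It uses only the standard library and is not the repository's implementation; the toy tweets, the whitespace tokenizer, and the hyperparameters (`lam`, `lr`, `epochs`) are all assumptions chosen for illustration. LDA and KNN are not shown.

```python
import math
from collections import Counter


def vectorize(text, vocab):
    """Map a tweet to term counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[t] for t in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, lam=0.01, lr=0.5, epochs=200):
    """Batch gradient descent on the mean log loss plus (lam / 2) * ||w||^2."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # The L2 penalty adds lam * w to the weight gradient; the bias is not penalized.
        w = [wj - lr * (gj + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return 1 (positive) if the predicted probability is at least 0.5."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0


# Toy tweets, invented for illustration only (1 = positive, 0 = negative).
tweets = ["love this phone", "great day", "hate this phone", "awful day"]
labels = [1, 1, 0, 0]
vocab = sorted({t for tweet in tweets for t in tweet.lower().split()})
X = [vectorize(t, vocab) for t in tweets]
w, b = train_logreg(X, labels)
print([predict(w, b, x) for x in X])
```

A larger `lam` shrinks the weights more strongly, which trades training fit for less sensitivity to rare words.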
zzhangusf/NLP-projects