NLP-Project-1-Twitter-Sentiment-Analysis-WebApp is a natural language processing (NLP) project that analyzes the sentiment of tweets in real time. The web app classifies tweets as positive, negative, or neutral, providing insights into public opinion or feedback regarding specific topics or hashtags. The project uses machine learning techniques and popular NLP libraries to perform text preprocessing, vectorization, and sentiment classification.
- Project Overview
- Features
- Technologies Used
- Dataset
- Installation
- Usage
- Modeling Approach
- Web App Interface
- Future Enhancements
- License
- Real-time sentiment analysis of tweets.
- Classifies tweets into positive, negative, or neutral categories.
- Users can search tweets by hashtag or keyword.
- Interactive web interface built using Streamlit.
- Visualizes results with charts showing the sentiment distribution.
- Python: Core programming language used.
- Natural Language Processing (NLP): For text analysis and sentiment classification.
- Streamlit: For building an interactive web app.
- Scikit-learn: Machine learning library used for model building.
- Tweepy: For fetching real-time tweets using the Twitter API.
- Numpy & Pandas: For data manipulation and preprocessing.
- Matplotlib & Seaborn: For data visualization.
The sentiment analysis model is trained on publicly available Twitter sentiment datasets, in which each tweet is labeled as positive, negative, or neutral.
You can use your own dataset or fetch real-time tweets via the Twitter API using Tweepy.
- Clone this repository:
    git clone https://github.com/your-username/NLP-Project-1-Twitter-Sentiment-Analysis-WebApp.git
    cd NLP-Project-1-Twitter-Sentiment-Analysis-WebApp
- Install the required dependencies:
    pip install -r requirements.txt
- Set up your Twitter Developer Account and generate API keys for Tweepy.
- Create a .env file in the project directory with your Twitter API credentials:
    TWITTER_API_KEY=your_api_key
    TWITTER_API_SECRET_KEY=your_api_secret_key
    TWITTER_ACCESS_TOKEN=your_access_token
    TWITTER_ACCESS_TOKEN_SECRET=your_access_token_secret
- Run the app:
    streamlit run app.py
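As a rough illustration of how the credentials from the .env file might be read at startup, the sketch below checks that all four variables are present before any Twitter API call is made. The helper name `load_twitter_credentials` and the constant `REQUIRED_KEYS` are hypothetical and not taken from the project's code. The sketch also assumes the .env file has already been loaded into the process environment, for example with python-dotenv's `load_dotenv()`.

```python
import os
from typing import Mapping

# Variable names as listed in the .env example above.
REQUIRED_KEYS = (
    "TWITTER_API_KEY",
    "TWITTER_API_SECRET_KEY",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_credentials(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Return the four Twitter credentials, or raise if any is missing or empty.

    Hypothetical helper: assumes the .env file was already loaded into the
    environment (for example by python-dotenv's load_dotenv()).
    """
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("Missing Twitter credentials: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

Failing early with the names of the missing variables tends to be easier to debug than an authentication error raised later by the API client.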
- Open the web app in your browser (default address is http://localhost:8501).
- Enter a keyword or hashtag to analyze the sentiment of recent tweets.
- The app will fetch the latest tweets related to the entered keyword, preprocess them, and classify them into positive, negative, or neutral sentiments.
- View the sentiment distribution graph for a better understanding of public opinion.
- Text Preprocessing: Tweets are cleaned by removing URLs, special characters, and stopwords. Tokenization is applied to split the text into individual words.
- Vectorization: We use TF-IDF Vectorizer to convert text data into numerical format.
- Model: A Logistic Regression model is used to classify the sentiment of the tweets. Other models such as Naive Bayes and Support Vector Machine (SVM) can also be tested for comparison.
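The three steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `clean_tweet` and `build_pipeline` are hypothetical, the cleaning rules are ASCII-only (an assumption suited to English tweets), and tokenization and stopword removal are delegated to scikit-learn's `TfidfVectorizer` with its built-in English stopword list.

```python
import re

# Matches http(s) links and bare www. links.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Anything that is not a lowercase ASCII letter, digit, or whitespace.
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")


def clean_tweet(text: str) -> str:
    """Lowercase a tweet, drop URLs and special characters, collapse whitespace."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = NON_ALNUM_RE.sub(" ", text)
    return " ".join(text.split())


def build_pipeline():
    """TF-IDF features followed by Logistic Regression (requires scikit-learn)."""
    # Imported here so clean_tweet stays usable without scikit-learn installed.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import Pipeline

    return Pipeline([
        # The vectorizer tokenizes the cleaned text and drops English stopwords.
        ("tfidf", TfidfVectorizer(preprocessor=clean_tweet, stop_words="english")),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
```

Under these assumptions, calling `build_pipeline().fit(texts, labels)` on labeled tweets and then `.predict(new_texts)` would return one sentiment label per tweet. Swapping the `"clf"` step for a Naive Bayes or SVM estimator is one way to run the comparison mentioned above.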
The web app includes:
- An input field for entering a keyword or hashtag.
- A button to fetch and analyze tweets.
- A bar chart showing the sentiment distribution across positive, negative, and neutral tweets.
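The data behind the bar chart is a count of predicted labels per sentiment. A minimal sketch of that aggregation is shown below. The function name `sentiment_distribution` is hypothetical, and the sketch assumes the classifier returns the strings "positive", "negative", and "neutral".

```python
from collections import Counter
from typing import Iterable

# Fixed order so the chart always shows all three bars, even when a count is zero.
SENTIMENTS = ("positive", "negative", "neutral")


def sentiment_distribution(labels: Iterable[str]) -> dict[str, int]:
    """Count predicted labels per sentiment; labels outside SENTIMENTS are ignored."""
    counts = Counter(labels)
    return {sentiment: counts.get(sentiment, 0) for sentiment in SENTIMENTS}
```

The resulting dictionary could then be handed to a charting call, for instance Streamlit's `st.bar_chart` or a Matplotlib bar plot, depending on how the app renders its charts.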
- Implement advanced NLP models like BERT or LSTM for improved sentiment classification.
- Add support for multilingual sentiment analysis.
- Include historical sentiment tracking for a topic over time.
- Implement user authentication for a customized experience.
This project is licensed under the MIT License. See the LICENSE file for more details.