This project performs sentiment analysis on Twitter data. It covers data cleaning, preprocessing, and training a machine learning model that classifies each tweet's sentiment as positive or negative.
The following tools and libraries are required to run this project:
- Python 3.x
- pandas
- numpy
- scikit-learn
- nltk

- Clone the repository:
  git clone https://github.com/YourUsername/sentiment_analysis_on_twitter_data.git
- Navigate to the project directory:
  cd sentiment_analysis_on_twitter_data
- Install the required Python packages:
  pip install -r requirements.txt
- Download the necessary NLTK resources (stopwords, punkt) from a Python session:
  import nltk
  nltk.download('stopwords')
  nltk.download('punkt')
- Prepare your dataset:
  - Place your dataset in the project directory and rename it to twitter_data.csv.
- Run the preprocessing script:
  - Clean and preprocess the text data for model training.
- Split the data:
  - Split the data into training and testing sets.
- Train the model:
  - Build and train a machine learning model using Logistic Regression or any other classifier.
- Evaluate the model:
  - Test the model on unseen data and calculate accuracy metrics.
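
The preprocessing, splitting, training, and evaluation steps above can be sketched end to end roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual scripts: the `clean_tweet` helper, the toy tweets and labels, the TF-IDF features, and the 75/25 split are all hypothetical choices made for the example. It uses only regular expressions for cleaning, so it runs without the NLTK downloads; stopword removal and tokenization with NLTK would be added in the cleaning step.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, and non-letter characters.

    Hypothetical helper for illustration; the real preprocessing may differ.
    """
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # drop digits, punctuation, '#'
    return " ".join(text.split())              # collapse repeated whitespace


# Toy data standing in for twitter_data.csv (assumed labels: 1 = positive, 0 = negative).
tweets = [
    "I love this phone, it is amazing!",
    "What a great day :) https://example.com/pic",
    "@friend thanks, this made me so happy",
    "Best service ever, really wonderful",
    "I hate waiting, this is terrible",
    "Worst experience of my life #angry",
    "@support your app is awful and broken",
    "So sad and disappointed today",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Step 1: clean and preprocess the text.
cleaned = [clean_tweet(t) for t in tweets]

# Step 2: split into training and testing sets (stratified to keep both classes).
X_train, X_test, y_train, y_test = train_test_split(
    cleaned, labels, test_size=0.25, random_state=42, stratify=labels
)

# Step 3: build and train a TF-IDF + Logistic Regression pipeline.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Step 4: evaluate on the held-out tweets.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Test accuracy: {accuracy:.2f}")
```

With a real dataset, the toy lists would be replaced by the text and label columns read from twitter_data.csv (for example with pandas); the column names depend on the dataset and are not specified here. Wrapping the vectorizer and classifier in one pipeline means the vocabulary is learned from the training split only, which keeps the test set genuinely unseen.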
This project is licensed under the MIT License - see the LICENSE file for details.