This project is a multi-platform sentiment analysis and hate speech detection system that collects, analyzes, and visualizes data from Reddit and YouTube. It consists of three parts: a crawler, an analysis pipeline, and a web-based visualization tool, which together extract insights from public discussions.
- Python – Core implementation language.
- Hadoop & MapReduce – Large-scale data processing.
- Flask – Web framework for visualization.
- PostgreSQL – Database for storing processed data.
- Matplotlib & Pandas – Data handling and visualization.
- Bootstrap & HTML/CSS – Frontend UI.
✅ Crawls Data from Reddit & YouTube.
✅ Sentiment Analysis using NLP.
✅ Hate Speech Detection API Integration.
✅ Data Storage in PostgreSQL for Processing.
✅ Web Dashboard for Insights & Visualization.
📁 Main-Project/
│── 📁 Crawler/ # Scrapes data from Reddit & YouTube
│ │── Reddit_updated.py
│ │── Youtube_final_updated.py
│ │── subreddits.csv
│ │── Youtube_key.csv
│ └── README.md
│
│── 📁 Analysis/ # Processes collected data
│ │── an_all.py
│ │── Analysis.yt.py
│ │── plot.ipynb
│ └── README.md
│
│── 📁 Web/ # Visualizes results
│ │── app.py
│ │── templates/
│ │── static/
│ └── README.md
│
│── README.md # Project documentation (this file)
- Scrapes comments from Reddit & YouTube based on specified topics.
- Stores raw text data in PostgreSQL.
- Cleans & processes the text data.
- Performs sentiment analysis & hate speech detection.
- Stores structured results for further use.
- Retrieves processed sentiment & hate speech data.
- Provides interactive charts & date-based filtering.
- Enables easy exploration of online discussions.
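
As a rough illustration of the "cleans & processes" and "sentiment analysis" steps listed above, here is a minimal sketch using only the Python standard library. It is not the code in `an_all.py` or `Analysis.yt.py`: the function names `clean_text` and `score_sentiment` and the tiny word lists are assumptions made up for this example, and a toy word-count rule stands in for whatever NLP model the project actually uses.

```python
import re

# Toy word lists, purely illustrative (an assumption, not the project's lexicon).
POSITIVE = {"good", "great", "love", "helpful", "excellent"}
NEGATIVE = {"bad", "awful", "hate", "terrible", "useless"}

URL_RE = re.compile(r"https?://\S+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")


def clean_text(raw: str) -> str:
    """Lowercase, drop URLs and non-letter characters, collapse whitespace."""
    text = URL_RE.sub(" ", raw.lower())
    text = NON_ALPHA_RE.sub(" ", text)
    return " ".join(text.split())


def score_sentiment(cleaned: str) -> str:
    """Label text by comparing counts of positive and negative words."""
    words = cleaned.split()
    balance = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if balance > 0:
        return "positive"
    if balance < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    comment = "Great video, I LOVE it! https://example.com"
    cleaned = clean_text(comment)
    print(cleaned, "->", score_sentiment(cleaned))
```

Cleaning before scoring keeps URLs, punctuation, and casing from hiding words the scorer would otherwise match. A real pipeline would swap the word-count rule for a trained model or an external API.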
python -m venv venv
source venv/bin/activate # macOS/Linux
venv\Scripts\activate # Windows
pip install -r requirements.txt
cd Crawler
python Reddit_updated.py
python Youtube_final_updated.py
cd Analysis # from the project root
python an_all.py
python Analysis.yt.py
cd Web # from the project root
python app.py
Open in a web browser:
http://127.0.0.1:5000/
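
The dashboard's date-based filtering could work along the lines of the sketch below. This is a hedged example and not the logic in `app.py`: the record layout (a `created` date and a `label` string), the helper names `filter_by_date` and `count_by_label`, and the sample rows are all invented for illustration. In the real app the rows would come from PostgreSQL.

```python
from datetime import date


def filter_by_date(records, start, end):
    """Return records whose 'created' date falls within [start, end], inclusive."""
    return [r for r in records if start <= r["created"] <= end]


def count_by_label(records):
    """Count how many records carry each sentiment label."""
    counts = {}
    for r in records:
        counts[r["label"]] = counts.get(r["label"], 0) + 1
    return counts


# Hypothetical sample rows, standing in for processed results from the database.
SAMPLE = [
    {"created": date(2024, 1, 5), "label": "positive"},
    {"created": date(2024, 1, 20), "label": "negative"},
    {"created": date(2024, 2, 2), "label": "positive"},
]

if __name__ == "__main__":
    january = filter_by_date(SAMPLE, date(2024, 1, 1), date(2024, 1, 31))
    print(count_by_label(january))
```

The per-label counts for the selected range are the kind of summary a chart on the dashboard could plot.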
🔹 Expand to More Platforms (Twitter, News, Forums).
🔹 Deploy Web App to AWS, GCP, or Heroku.
🔹 Real-Time Streaming of Sentiment Analysis.
🔹 Train a Custom AI Model for Hate Speech Detection.
Author: Siddartha Reddy Boreddy
📍 SUNY Binghamton
✉️ Email: sboreddy@binghamton.edu