Skip to content

This Python script uses the PRAW API to search Reddit for posts related to "AI in education" across specified subreddits. It collects post metadata, formats the data with timestamps and full text, and saves it as a structured CSV file.

Notifications You must be signed in to change notification settings

epsilon003/reddit-sentiment-analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

📊 Reddit Sentiment Analysis: AI in Education

This project collects Reddit posts related to "AI in Education" using the PRAW API and prepares them for sentiment and time series analysis. It supports multiple subreddits and stores structured data in CSV format.


🧾 Project Structure

reddit-sentiment-ai-education/
│
├── data/
│   ├── raw/
│   └── cleaned/
├── src/
│   ├── collect.py
│   ├── cleaning.py
│   └── sentiment.py
├── visuals/
|   └── visualise.py
├── README.md
├── requirements.txt
└── .gitignore

🔐 .env Template

Copy this to a file named .env in the root directory and fill in your Reddit app credentials:

REDDIT_CLIENT_ID=your_client_id_here
REDDIT_CLIENT_SECRET=your_client_secret_here
REDDIT_USER_AGENT=script by u/your_reddit_username

👉 Never share your .env file or push it to GitHub. It's excluded by .gitignore.


⚙️ Setup Instructions

  1. Clone this repository:
git clone https://github.com/your-username/reddit-sentiment-analysis.git
cd reddit-data-analysis
  1. Install dependencies:
pip install -r requirements.txt
  1. Set up your environment:
  • Create a .env file using the template above.

🚀 Run the Script

Run the main script to fetch Reddit posts:

python collect.py

This will:

  • Search Reddit for posts matching "AI in education"
  • Across subreddits: education, artificial, technology
  • Save the output to data/raw/reddit_posts.csv

🧪 Output Example

Sample columns in the CSV:

  • title
  • selftext
  • created_utc
  • score
  • subreddit
  • url
  • full_text

You can now use this data for:

  • Sentiment analysis using VADER
  • Time series analysis with Matplotlib or Seaborn

📎 License

MIT License


🙋‍♂️ Questions or Contributions?

Feel free to open an issue or pull request. Collaboration is welcome!

About

This Python script uses the PRAW API to search Reddit for posts related to "AI in education" across specified subreddits. It collects post metadata, formats the data with timestamps and full text, and saves it as a structured CSV file.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages