This project analyzes Yelp business data to uncover patterns in restaurant ratings across different cities, price levels, and restaurant types.
- Identify the top cities by average Yelp rating
- Explore the relationship between price level and rating
- Highlight the most common restaurant types
- Visualize regional rating patterns using heatmaps
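
The first two goals can be illustrated with a minimal pandas sketch. It is not the notebook's actual code: the toy rows, the column names (`city`, `stars`, `price`), the helper `top_cities`, and the `min_businesses` threshold are all assumptions made for illustration.

```python
import pandas as pd

# Toy stand-in for the cleaned dataset. The column names are assumptions,
# and the rows are made up purely for illustration.
df = pd.DataFrame({
    "city": ["Phoenix", "Phoenix", "Las Vegas", "Las Vegas", "Charlotte", "Toronto"],
    "stars": [4.0, 3.0, 4.5, 4.5, 3.5, 5.0],
    "price": [1, 2, 2, 3, 1, 2],
})


def top_cities(frame, n=3, min_businesses=2):
    """Rank cities by mean rating, ignoring cities with too few businesses."""
    stats = frame.groupby("city")["stars"].agg(["mean", "count"])
    stats = stats[stats["count"] >= min_businesses]
    return stats.sort_values("mean", ascending=False).head(n)


# Goal 1: top cities by average rating.
ranking = top_cities(df)

# Goal 2: average rating at each price level.
price_vs_rating = df.groupby("price")["stars"].mean()

print(ranking)
print(price_vs_rating)
```

Filtering on a minimum business count is one possible design choice: it keeps a city with a single five-star listing from topping the ranking.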
This project uses the Yelp Open Dataset, provided by Yelp as part of their Academic Dataset program.
The dataset primarily contains business, review, and user data from cities in the United States (e.g., Phoenix, Las Vegas, Charlotte). A few entries may also come from Canada (e.g., Toronto) or the United Kingdom (e.g., Edinburgh), but these are limited and may vary by dataset version.
All data is used for academic research purposes only, in accordance with Yelp's data sharing policy.
| File | Description |
|---|---|
| `analysis.ipynb` | Main notebook with all visualizations |
| `clean_business.py` | Script to preprocess Yelp data |
| `yelp_academic_dataset_business.json.zip` | Raw dataset (compressed) |
| `yelp_cleaned.csv.zip` | Cleaned dataset (compressed) |
| `images/` | Folder containing all visualization images |
- Python (pandas, seaborn, matplotlib, numpy)
- Jupyter Notebook
- Clone the repository
- Unzip datasets
- Run `clean_business.py` to generate the cleaned dataset.
- Open `analysis.ipynb` in Jupyter Notebook and run all cells.
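
For orientation, the preprocessing step might look roughly like the sketch below. This is not the contents of `clean_business.py`. It assumes the raw business file holds one JSON object per line with fields such as `categories` and `attributes.RestaurantsPriceRange2`. The function `clean_business`, the output column `price`, and the two sample records are hypothetical.

```python
import io
import json

import pandas as pd

# Two made-up records in the assumed line-delimited JSON layout.
sample = io.StringIO(
    '{"business_id": "a1", "name": "Taco Spot", "city": "Phoenix", "stars": 4.0, '
    '"categories": "Mexican, Restaurants", '
    '"attributes": {"RestaurantsPriceRange2": "2"}}\n'
    '{"business_id": "b2", "name": "Quick Lube", "city": "Phoenix", "stars": 3.0, '
    '"categories": "Automotive, Oil Change Stations", "attributes": null}\n'
)


def clean_business(source):
    """Keep restaurant rows and pull the price level into a numeric column."""
    records = [json.loads(line) for line in source if line.strip()]
    raw = pd.DataFrame(records)

    # Categories are assumed to be a comma-separated string; may be missing.
    is_restaurant = raw["categories"].fillna("").str.contains("Restaurants", regex=False)
    columns = ["business_id", "name", "city", "stars", "categories"]
    out = raw.loc[is_restaurant, columns].copy()

    # The attributes field is assumed to be a dict (or null) per business.
    price = raw.loc[is_restaurant, "attributes"].apply(
        lambda a: a.get("RestaurantsPriceRange2") if isinstance(a, dict) else None
    )
    out["price"] = pd.to_numeric(price, errors="coerce")
    return out.reset_index(drop=True)


cleaned = clean_business(sample)
print(cleaned)
```

With the real data, the same function could be given an open file handle for the unzipped JSON file, and the result written out with `cleaned.to_csv(...)`.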
- Python 3.10+
- pandas, seaborn, matplotlib, folium