NormanLo4319/Restaurant-Health-Score-vs.-Yelp-Customer-Based-Scores-ETL-Project

ETL Project

ETL (Extract, Transform, and Load) is a standard workflow in data analytics and data science. This project demonstrates how to draw on different data sources and combine them to extract meaningful insights.

Objective

How much do you know about your favorite restaurant in San Francisco? In this project, we compare Yelp customer-based ratings with San Francisco's public restaurant health (cleanliness) scores. The project follows the ETL workflow, which involves EXTRACTING, TRANSFORMING, and LOADING data. We first queried the Yelp API, received 4,050 restaurant rating scores, and stored them in a SQL database. We then downloaded the public restaurant scores from the San Francisco Department of Public Health and stored them in the same database. Once the database was created, we used the Python SQLAlchemy library to merge the two relational datasets, and the Python Matplotlib and Plotly libraries to visualize the data. Based on the results, we did not find strong evidence of a correlation between customer-based ratings and restaurant health scores.
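The load, merge, and compare steps described above can be sketched roughly as follows. This is only an illustration, not the project's actual code: it uses Python's built-in `sqlite3` module in place of SQLAlchemy so that it runs without extra dependencies. The table names (`yelp_ratings`, `health_scores`), the column names, the name-based join key, and the sample rows are all hypothetical placeholders rather than values from the real datasets.

```python
import sqlite3
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def load_and_join(yelp_rows, health_rows):
    """Load both sources into one database and join them on restaurant name."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the real project may use different tables and keys.
    conn.execute("CREATE TABLE yelp_ratings (name TEXT, rating REAL)")
    conn.execute("CREATE TABLE health_scores (name TEXT, score INTEGER)")
    conn.executemany("INSERT INTO yelp_ratings VALUES (?, ?)", yelp_rows)
    conn.executemany("INSERT INTO health_scores VALUES (?, ?)", health_rows)
    # Normalise names (case, surrounding whitespace) before matching.
    joined = conn.execute(
        "SELECT y.name, y.rating, h.score "
        "FROM yelp_ratings AS y "
        "JOIN health_scores AS h "
        "ON LOWER(TRIM(y.name)) = LOWER(TRIM(h.name)) "
        "ORDER BY y.name"
    ).fetchall()
    conn.close()
    return joined


# Made-up placeholder rows, NOT real Yelp or Department of Public Health data.
yelp_rows = [("Cafe A", 4.5), ("Diner B", 3.0), ("Grill C", 4.0), ("Bistro D", 3.5)]
health_rows = [("cafe a", 90), ("Diner B ", 96), ("Grill C", 88), ("Taqueria E", 100)]

joined = load_and_join(yelp_rows, health_rows)
ratings = [row[1] for row in joined]
scores = [row[2] for row in joined]
r = pearson(ratings, scores)
print(f"{len(joined)} matched restaurants, Pearson r = {r:.3f}")
```

An inner join keeps only restaurants that appear in both sources, so unmatched names drop out of the comparison. In practice, matching on name alone is fragile, and a combination of name and address would likely be more reliable.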

Copyright © 2020 Norman Lo
