The Yelp dataset is too large to host on GitHub.
- Download the archive from the Yelp Open Dataset, then unzip and untar it.
- Create a data/ folder inside your project directory.
- Place the following JSON files into yelp-loader/data/:
business.json
review.json
tip.json
user.json
You can do this by running the following commands from inside the extracted Yelp JSON folder. Adjust the destination paths if your project is not at ~/412-project:
mv yelp_academic_dataset_business.json ~/412-project/yelp-loader/data/business.json
mv yelp_academic_dataset_user.json ~/412-project/yelp-loader/data/user.json
mv yelp_academic_dataset_review.json ~/412-project/yelp-loader/data/review.json
mv yelp_academic_dataset_tip.json ~/412-project/yelp-loader/data/tip.json
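The same move-and-rename step can also be sketched in Python, which may help if you want a clear error when a file is missing instead of a partial move. This is only an illustration: the function name stage_yelp_files and the RENAMES table are hypothetical helpers, not part of the project, and the source and destination folders are whatever you pass in.

```python
import shutil
from pathlib import Path

# Names inside the Yelp archive mapped to the short names used in yelp-loader/data/.
RENAMES = {
    "yelp_academic_dataset_business.json": "business.json",
    "yelp_academic_dataset_user.json": "user.json",
    "yelp_academic_dataset_review.json": "review.json",
    "yelp_academic_dataset_tip.json": "tip.json",
}


def stage_yelp_files(src_dir, dest_dir):
    """Move the four dataset files from src_dir into dest_dir under their short names.

    Hypothetical helper: it checks that every file exists before moving
    anything, so a missing file never leaves the data folder half-filled.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    missing = [name for name in RENAMES if not (src / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {src}: {', '.join(missing)}")

    dest.mkdir(parents=True, exist_ok=True)  # creates data/ if it does not exist yet
    moved = []
    for old_name, new_name in RENAMES.items():
        target = dest / new_name
        shutil.move(str(src / old_name), str(target))
        moved.append(target)
    return moved
```

For example, calling stage_yelp_files(".", Path.home() / "412-project" / "yelp-loader" / "data") from inside the extracted folder would mirror the four mv commands above, assuming the project lives at that path.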
Make sure Python 3 and pip are installed:
python3 --version
pip --version
apt install python3.12-venv
Make sure you are in the project folder (412-project/yelp-loader) before running the commands below:
python3 -m venv venv
source venv/bin/activate
To leave the virtual environment (only do this after the last command in this guide):
deactivate
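Before installing packages, it can be worth confirming that the virtual environment is actually active. A minimal sketch, assuming a standard venv: inside one, sys.prefix points at the venv folder while sys.base_prefix still points at the system Python. The function name in_virtualenv is a hypothetical helper for illustration.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when the interpreter appears to be running inside a venv.

    The arguments default to the live interpreter values; they are only
    parameters so the check can be exercised with made-up paths.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


print("virtual environment active:", in_virtualenv())
```

Run it with the same python3 you used to create the environment.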
pip install psycopg2-binary --break-system-packages
export PATH=$PATH:/lib/postgresql/16/bin
export PGPORT=8888
export PGHOST=/tmp
initdb $HOME/dbProject
pg_ctl -D $HOME/dbProject -o '-k /tmp' start
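Once the server is running, a short connection check can confirm that PGHOST and PGPORT are set the way the loader will see them. The sketch below is an assumption-laden illustration, not project code: connection_params and server_version are hypothetical helpers, the fallback values mirror the exports above, and it connects to the default postgres database that initdb creates, as the OS user who ran initdb.

```python
import getpass
import os


def connection_params(env=None):
    """Build psycopg2 connection arguments from PG* environment variables.

    Falls back to the values exported above (/tmp and 8888) when a
    variable is unset. The helper itself is hypothetical.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("PGHOST", "/tmp"),           # Unix socket directory
        "port": int(env.get("PGPORT", "8888")),
        "dbname": "postgres",                        # default database created by initdb
        "user": env.get("PGUSER") or getpass.getuser(),
    }


def server_version(params=None):
    """Connect once and return the server's version string."""
    import psycopg2  # imported here so the helper above works without the driver

    conn = psycopg2.connect(**(params or connection_params()))
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT version()")
            return cur.fetchone()[0]
    finally:
        conn.close()
```

Calling print(server_version()) from the activated virtual environment should print the PostgreSQL version if the server started correctly; a connection error usually means PGHOST or PGPORT does not match the pg_ctl options.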
Replace USERNAME with your system username (run whoami in bash to print it).
make full