AlePam109/412

Yelp Loader – CS412 Project

Add the Data Folder

The Yelp dataset is too large to host on GitHub.

  1. Download the archive from Yelp Open Dataset, then unzip and untar it.
  2. Create a data/ folder inside your project directory (yelp-loader/).
  3. Place the following JSON files into yelp-loader/data/:
  • business.json
  • review.json
  • tip.json
  • user.json

You can do this by running the commands below from inside the extracted Yelp JSON folder. Adjust the destination paths if your project lives somewhere else:

mv yelp_academic_dataset_business.json ~/412-project/yelp-loader/data/business.json
mv yelp_academic_dataset_user.json ~/412-project/yelp-loader/data/user.json
mv yelp_academic_dataset_review.json ~/412-project/yelp-loader/data/review.json
mv yelp_academic_dataset_tip.json ~/412-project/yelp-loader/data/tip.json
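
The same move-and-rename step can also be scripted. The sketch below is only an illustration, not part of the project: the function name move_yelp_files is hypothetical, and the source and destination folders are whatever you pass in. It creates the data folder if needed and reports any file it could not find instead of failing halfway.

```python
import shutil
from pathlib import Path

# Names inside the Yelp archive mapped to the names this README expects.
RENAMES = {
    "yelp_academic_dataset_business.json": "business.json",
    "yelp_academic_dataset_user.json": "user.json",
    "yelp_academic_dataset_review.json": "review.json",
    "yelp_academic_dataset_tip.json": "tip.json",
}


def move_yelp_files(source_dir, data_dir):
    """Move and rename the dataset files; return the source names not found."""
    source_dir, data_dir = Path(source_dir), Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)  # create data/ if it is missing
    missing = []
    for old_name, new_name in RENAMES.items():
        src = source_dir / old_name
        if src.is_file():
            shutil.move(str(src), str(data_dir / new_name))
        else:
            missing.append(old_name)
    return missing
```

Called from the extracted folder, for example as move_yelp_files(".", Path.home() / "412-project/yelp-loader/data"), it should leave the four renamed files in data/ and return an empty list.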

Installations

Make sure Python 3 and pip are installed:

python3 --version
pip --version

Install the venv package that matches your Python version. For Python 3.12, for example (you may need sudo):

apt install python3.12-venv

Create a managed Python environment. Make sure you are in the project folder (412-project/yelp-loader) before running this command:

python3 -m venv venv

Activate the environment:

source venv/bin/activate

To leave the environment later:

deactivate

Install psycopg2, the tool for scripting PostgreSQL from Python. Only run this after the commands above:

pip install psycopg2-binary --break-system-packages
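
To confirm the installation worked, a short check like the one below can help. It is a minimal sketch with hypothetical helper names (in_virtualenv, is_installed). It relies on the psycopg2-binary package exposing a module named psycopg2, and on sys.prefix differing from sys.base_prefix inside an environment created with python3 -m venv.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv made with `python3 -m venv`."""
    return sys.prefix != sys.base_prefix


def is_installed(module_name):
    """True when the module can be located, without actually importing it."""
    return importlib.util.find_spec(module_name) is not None


print("virtual environment active:", in_virtualenv())
print("psycopg2 importable:", is_installed("psycopg2"))
```

Run it with the same python3 you plan to use for the loader, since each interpreter sees its own set of installed packages.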

Database Initialization

Create the PostgreSQL database cluster and launch the server locally:
export PATH=$PATH:/lib/postgresql/16/bin
export PGPORT=8888
export PGHOST=/tmp
initdb $HOME/dbProject
pg_ctl -D $HOME/dbProject -o '-k /tmp' start
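
Once the server is running, a Python script can reach it through the same socket directory and port. The sketch below is an assumption about how such a connection could look, not a copy of populate_db.py: the helper names are hypothetical, the database name postgres is the default database that initdb creates (the project may use a different one), and the user defaults to your system username, which initdb makes the superuser.

```python
import getpass
import os


def connection_settings(dbname="postgres", user=None):
    """Build keyword arguments for psycopg2.connect from PGHOST and PGPORT."""
    return {
        "dbname": dbname,  # assumption: default database created by initdb
        "user": user or getpass.getuser(),  # same value that whoami prints
        "host": os.environ.get("PGHOST", "/tmp"),  # socket directory (-k /tmp)
        "port": int(os.environ.get("PGPORT", "8888")),
    }


def server_version(settings):
    """Connect and return the server version string (needs a running server)."""
    import psycopg2  # imported here so connection_settings works without it

    conn = psycopg2.connect(**settings)
    try:
        with conn.cursor() as cur:
            cur.execute("SELECT version()")
            return cur.fetchone()[0]
    finally:
        conn.close()
```

Calling server_version(connection_settings()) in the shell where PGPORT and PGHOST were exported is a quick way to see whether the server accepts connections.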

Set up and run the makefile

Edit 412-project/yelp-loader/populate_db.py and replace USERNAME with your system username (run whoami in bash to find it).

If the JSON files were added correctly, the command below will work (it takes some time):
make full
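
Because the load takes a while, it may be worth checking the data folder before running it. The sketch below is a hypothetical pre-flight check, not something the Makefile provides. It relies on the Yelp files storing one JSON object per line, so parsing only the first line of each file is enough for a cheap sanity test.

```python
import json
from pathlib import Path

EXPECTED = ("business.json", "review.json", "tip.json", "user.json")


def check_data_dir(data_dir):
    """Return a list of problems; an empty list means the files look usable."""
    problems = []
    for name in EXPECTED:
        path = Path(data_dir) / name
        if not path.is_file():
            problems.append(f"{name}: missing")
            continue
        # Read just the first line to avoid loading a multi-gigabyte file.
        with path.open(encoding="utf-8") as handle:
            first_line = handle.readline()
        try:
            json.loads(first_line)
        except json.JSONDecodeError:
            problems.append(f"{name}: first line is not valid JSON")
    return problems
```

From the project folder, check_data_dir("data") returning an empty list suggests the files are in place and named as expected.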
