This project simulates data analysis for Arbor Foods Trading Co., a fictional wholesale food distributor. The goal is to showcase SQL-based analytical workflows using PostgreSQL, including:
- Common Table Expressions (CTEs)
- Window functions (`RANK`, `LAG`, `AVG ... OVER`)
- Sales and customer analysis
- Employee performance metrics
The project includes a full PostgreSQL setup, a Jupyter notebook with analysis, and a lightweight SQLite-powered Binder demo for live viewing.
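As a small taste of the techniques listed above, here is a self-contained sketch (not taken from the notebook) that runs the same kinds of window functions against a throwaway SQLite table; the table and column names are invented for the example:

```python
# Illustrative window-function queries (RANK, LAG, AVG OVER) on sample data.
# Requires SQLite >= 3.25 for window-function support (standard in modern Python).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE monthly_sales (employee TEXT, month TEXT, amount REAL);
INSERT INTO monthly_sales VALUES
  ('Ana', '2024-01', 100), ('Ana', '2024-02', 150),
  ('Ben', '2024-01', 120), ('Ben', '2024-02', 90);
""")

rows = conn.execute("""
SELECT
  employee,
  month,
  amount,
  RANK() OVER (PARTITION BY month ORDER BY amount DESC) AS month_rank,
  LAG(amount) OVER (PARTITION BY employee ORDER BY month) AS prev_amount,
  AVG(amount) OVER (PARTITION BY employee) AS avg_amount
FROM monthly_sales
ORDER BY employee, month
""").fetchall()

for row in rows:
    print(row)
# ('Ana', '2024-01', 100.0, 2, None, 125.0) -- ranked 2nd in Jan, no prior month
# ('Ana', '2024-02', 150.0, 1, 100.0, 125.0)
# ...
```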
## Table of Contents

- Project Structure
- Scripts
- Live Demo (via Binder)
- Running the Full PostgreSQL Version
- License
- Acknowledgments
## Project Structure

```text
postgres-arborfoods/
├── binder/                        # Binder demo environment (SQLite-based)
│   ├── binder-demo.ipynb          # SQLite version of the notebook
│   ├── arborfoods.db              # Auto-generated SQLite database
│   ├── data/                      # CSVs generated from PostgreSQL
│   │   ├── suppliers.csv
│   │   ├── customers.csv
│   │   └── ...
│   ├── postBuild                  # Script to build arborfoods.db from CSVs
│   ├── requirements.txt           # Python dependencies for Binder
│   ├── runtime.txt                # Python version for Binder
│   └── setup_sqlite.py            # Loads CSVs into SQLite (arborfoods.db)
│
├── data/                          # PostgreSQL database dumps
│   ├── arborfoods_dump.sql        # Optional: SQL dump (for psql restore)
│   └── arborfoods_dump.tar        # Binary dump (for pg_restore)
│
├── docker/                        # Docker Compose setup for PostgreSQL
│   ├── docker-compose.yml
│   └── postgres.env.example
│
├── notebook/
│   └── postgresql-arborfoods.ipynb  # Full PostgreSQL notebook
│
├── scripts/                       # Utility scripts
│   └── export_postgres_tables.py  # Exports all PostgreSQL tables to CSV
│
├── LICENSE                        # MIT license for code/scripts
├── pyproject.toml                 # Project dependencies (managed with uv)
├── uv.lock                        # Locked dependency versions
├── README.md                      # Project overview and instructions
└── .gitignore                     # Standard ignore file
```
## Scripts

This project includes two utility scripts that support reproducibility and Binder compatibility:

### `scripts/export_postgres_tables.py`

Exports all tables from the PostgreSQL `arborfoods_db` database into individual CSV files using `pandas` and `SQLAlchemy`.

- Output location: `binder/data/`
- Run from the project root:

```bash
python scripts/export_postgres_tables.py
```
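The export step can be sketched roughly as follows. This is an illustrative assumption, not the script's actual code: the function name, connection URL handling, and output layout are invented for the example.

```python
# Sketch: export every table in a database to one CSV file each,
# using pandas + SQLAlchemy (the libraries the project's script uses).
from pathlib import Path

import pandas as pd
from sqlalchemy import create_engine, inspect


def export_tables(db_url: str, out_dir: str) -> list:
    """Write each table in db_url to <out_dir>/<table>.csv; return table names."""
    engine = create_engine(db_url)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    exported = []
    for table in inspect(engine).get_table_names():
        pd.read_sql_table(table, engine).to_csv(out / f"{table}.csv", index=False)
        exported.append(table)
    return exported
```

For the real project the URL would point at the PostgreSQL instance (e.g. a `postgresql://...` URL built from the `postgres.env` values), and `out_dir` would be `binder/data/`.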
### `binder/setup_sqlite.py`

Loads the exported CSV files into a local SQLite database (`arborfoods.db`) for use in the Binder demo notebook.

To prepare the SQLite database manually (for local testing or rebuilding the demo), run:

```bash
python binder/setup_sqlite.py
```
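The loading step can be sketched like this, assuming one CSV per table in a data directory; the function name and path handling are illustrative, not the script's actual code.

```python
# Sketch: load every CSV in a directory into a SQLite database,
# one table per file, named after the file's stem.
from pathlib import Path
import sqlite3

import pandas as pd


def build_sqlite_db(csv_dir: str, db_path: str) -> list:
    """Load <csv_dir>/*.csv into db_path; return the table names created."""
    loaded = []
    with sqlite3.connect(db_path) as conn:
        for csv_file in sorted(Path(csv_dir).glob("*.csv")):
            table = csv_file.stem  # e.g. customers.csv -> customers
            pd.read_csv(csv_file).to_sql(table, conn, if_exists="replace", index=False)
            loaded.append(table)
    return loaded
```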
## Live Demo (via Binder)

You can explore a demo version of this project using SQLite, directly in your browser, with no setup required.
## Running the Full PostgreSQL Version

This project includes a `docker-compose.yml` file and a `.env` file to spin up a PostgreSQL database in a containerized environment with minimal configuration.
Note: The Docker Compose file is configured to:

- Create a container named `postgresql_arborfoods`
- Create a PostgreSQL database using values defined in `docker/postgres.env`
- Expose PostgreSQL on port `5432`
- Persist data using a named Docker volume
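A minimal sketch of such a compose file, matching the behavior described in the Note above; the image tag and volume name here are assumptions, and the actual `docker/docker-compose.yml` may differ:

```yaml
# Sketch only: a compose file with the container name, env file,
# port mapping, and named volume the Note describes.
services:
  postgres:
    image: postgres:16          # assumed image tag
    container_name: postgresql_arborfoods
    env_file:
      - ./postgres.env
    ports:
      - "5432:5432"
    volumes:
      - arborfoods_data:/var/lib/postgresql/data

volumes:
  arborfoods_data:              # assumed volume name
```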
Edit the `docker/postgres.env` file before launching:

```env
POSTGRES_USER=userNameHere
POSTGRES_PASSWORD=userPasswordHere
POSTGRES_DB=arborfoods_db
```
| Variable | Description |
|---|---|
| `POSTGRES_USER` | Username to access the PostgreSQL database |
| `POSTGRES_PASSWORD` | Password for the database user |
| `POSTGRES_DB` | Name of the database to be created |
Clone the repository:

```bash
git clone https://github.com/xjwllmsx/postgres-arborfoods.git
cd postgres-arborfoods
```
Copy the example `.env` file and update the values:

```bash
cp docker/postgres.env.example docker/postgres.env
```

Then edit `docker/postgres.env` to set your desired credentials:

```env
POSTGRES_USER=your_username
POSTGRES_PASSWORD=your_password
POSTGRES_DB=arborfoods_db
```
From the `docker/` directory, start the container using your environment file:

```bash
cd docker
docker compose --env-file ./postgres.env up -d
```

This will launch a PostgreSQL container named `postgresql_arborfoods` using the credentials you defined in `postgres.env`.
Return to the project root and run:

```bash
# Copy the .tar file into the container
docker cp ./data/arborfoods_dump.tar postgresql_arborfoods:/arborfoods.tar

# Restore the binary dump inside the container
docker exec -it postgresql_arborfoods pg_restore -U your_username -d your_database_name /arborfoods.tar
```

Note: the `.tar` file is a binary dump, so it is restored with `pg_restore` rather than `psql` (use `psql -f` only with the plain-SQL `arborfoods_dump.sql`).

Replace:

- `your_username` with the value of `POSTGRES_USER`
- `your_database_name` with the value of `POSTGRES_DB`
Open the notebook at:

```text
notebook/postgresql-arborfoods.ipynb
```

Inside the notebook, connect to your PostgreSQL database using:

```python
%sql postgresql://your_username:your_password@localhost:5432/your_database_name
```
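Outside of the `%sql` magic, the same connection can be made with SQLAlchemy directly. This is a sketch with placeholder credentials, not code from the notebook:

```python
# Sketch: run a query against the database via SQLAlchemy + pandas,
# without relying on the %sql notebook magic.
import pandas as pd
from sqlalchemy import create_engine


def run_query(db_url: str, sql: str) -> pd.DataFrame:
    """Execute sql against db_url and return the result as a DataFrame."""
    engine = create_engine(db_url)
    with engine.connect() as conn:
        return pd.read_sql(sql, conn)


# Example (URL is illustrative; substitute your own credentials):
# df = run_query(
#     "postgresql://your_username:your_password@localhost:5432/your_database_name",
#     "SELECT COUNT(*) FROM customers",
# )
```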
## License

This project is dual-licensed:

- Code, SQL scripts, and setup files are licensed under the MIT License.
- Notebook markdown, analysis commentary, and educational content are licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
## Acknowledgments

The dataset used in this project is based on the Northwind PostgreSQL dataset, adapted for demonstration purposes.