This is a full-stack data engineering and analytics project with:
- Fake order generator using Python + Faker
- PySpark ETL pipeline to clean and enrich raw data
- Flask dashboard with Bootstrap for styling
- Dynamic charts using Chart.js
- Tabular view of the data sources
- Fully Dockerized

| Component | Tech Used |
|---|---|
| Ingestion | Python + Faker |
| ETL | PySpark |
| Analytics Engine | Pandas, Spark SQL |
| Backend | Flask |
| Charts | Chart.js |
| Styling | Bootstrap |
| Deployment | Docker + Render |