Portfolio project demonstrating modern data engineering: Medallion architecture and ETL.
This is my implementation of a Data Warehouse & Analytics solution, built by following and recreating a GitHub project as a hands-on learning exercise.
It demonstrates how to design a modern data warehouse, build ETL pipelines, create analytical data models, and generate insights, all following data engineering best practices.
- Data Architecture: Implemented the Medallion Architecture (Bronze, Silver, Gold layers).
- ETL Pipelines: Extracted, transformed, and loaded CSV datasets (ERP + CRM) into SQL Server.
- Data Modeling: Designed fact and dimension tables in a star schema format.
- Analytics & Reporting: Wrote SQL queries and created dashboards for insights.
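
The Bronze, Silver, and Gold layering listed above can be sketched in a few lines. This is a minimal illustration rather than the project's actual scripts: it assumes Python's built-in `sqlite3` module as a stand-in for SQL Server so that it runs anywhere, and the table names (`bronze_crm_customers`, `silver_crm_customers`, `gold_dim_customers`) and sample rows are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw CRM extract: stray whitespace, mixed case, and a duplicate row.
RAW_CSV = """customer_id,customer_name,country
1,  Alice ,us
2,Bob,DE
2,Bob,DE
"""


def build_layers(conn: sqlite3.Connection, raw_csv: str) -> None:
    """Load raw CSV into a bronze table, clean it into silver, expose it as gold."""
    cur = conn.cursor()

    # Bronze: land the data exactly as received, every column as text.
    cur.execute(
        "CREATE TABLE bronze_crm_customers "
        "(customer_id TEXT, customer_name TEXT, country TEXT)"
    )
    rows = list(csv.DictReader(io.StringIO(raw_csv)))
    cur.executemany(
        "INSERT INTO bronze_crm_customers "
        "VALUES (:customer_id, :customer_name, :country)",
        rows,
    )

    # Silver: cast types, trim whitespace, standardize case, drop duplicates.
    cur.execute(
        """
        CREATE TABLE silver_crm_customers AS
        SELECT DISTINCT
            CAST(customer_id AS INTEGER) AS customer_id,
            TRIM(customer_name)          AS customer_name,
            UPPER(TRIM(country))         AS country
        FROM bronze_crm_customers
        """
    )

    # Gold: a business-facing view that reporting queries read from.
    cur.execute(
        """
        CREATE VIEW gold_dim_customers AS
        SELECT customer_id, customer_name, country
        FROM silver_crm_customers
        """
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
build_layers(conn, RAW_CSV)
gold_rows = conn.execute(
    "SELECT * FROM gold_dim_customers ORDER BY customer_id"
).fetchall()
print(gold_rows)
```

Keeping the bronze table untyped and untouched means a bad transformation can be fixed and re-run without re-extracting the source files.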
- Clone this repo
  ```bash
  git clone https://github.com//sql_data-warehouse-project.git
  cd data-warehouse-project
  ```
- Set up SQL Server
  - Create a new database SalesDW
  - Run the scripts in /scripts/bronze to load raw data
  - Apply transformations from /scripts/silver
  - Build fact & dimension tables from /scripts/gold
- Check Documentation
  - Open /docs/ for the naming conventions, data catalog, and data flow and data model diagrams.
- Run Analytics Queries
  - Example queries are provided in /scripts/gold/ for reporting.
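
As a rough idea of what a reporting query against the gold layer looks like, the sketch below joins a fact table to a dimension table and aggregates revenue by country. It assumes Python's built-in `sqlite3` module in place of SQL Server, and the tables `fact_sales` and `dim_customers`, their columns, and the sample rows are hypothetical rather than taken from /scripts/gold/.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical star schema: one dimension table and one fact table that
# references it through a surrogate key.
conn.executescript(
    """
    CREATE TABLE dim_customers (
        customer_key  INTEGER PRIMARY KEY,
        customer_name TEXT,
        country       TEXT
    );
    CREATE TABLE fact_sales (
        order_number TEXT,
        customer_key INTEGER REFERENCES dim_customers (customer_key),
        sales_amount REAL
    );
    INSERT INTO dim_customers VALUES
        (1, 'Alice', 'US'), (2, 'Bob', 'DE'), (3, 'Chen', 'US');
    INSERT INTO fact_sales VALUES
        ('SO1', 1, 100.0), ('SO2', 2, 50.0), ('SO3', 3, 25.0), ('SO4', 1, 75.0);
    """
)

# Facts hold the measures, dimensions hold the attributes to group by.
REVENUE_BY_COUNTRY = """
    SELECT d.country,
           COUNT(*)            AS orders,
           SUM(f.sales_amount) AS revenue
    FROM fact_sales AS f
    JOIN dim_customers AS d
      ON d.customer_key = f.customer_key
    GROUP BY d.country
    ORDER BY revenue DESC
"""

report = conn.execute(REVENUE_BY_COUNTRY).fetchall()
for country, orders, revenue in report:
    print(country, orders, revenue)
```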
Repository Structure

    data-warehouse-project/
    │
    ├── datasets/    # ERP & CRM sample data (CSV format)
    ├── docs/        # Documentation & diagrams
    ├── scripts/     # SQL ETL scripts (bronze, silver, gold layers)
    ├── tests/       # Data validation scripts
    └── README.md    # Project overview
Screenshots & Diagrams
What I Learned
- Building a layered data architecture for structured pipelines.
- The importance of data cleaning and validation.
- Designing star schema data models for efficient analytics.
- Documenting with data catalogs, ERDs, and naming conventions.
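
To make the data cleaning and validation point concrete, here is a minimal sketch of two common checks: keys that are NULL or duplicated, and text values with leading or trailing whitespace. It assumes Python's built-in `sqlite3` module instead of SQL Server, and the helper names (`find_key_issues`, `find_untrimmed`), the `silver_customers` table, and its rows are hypothetical. An empty result means the check passed.

```python
import sqlite3


def find_key_issues(conn: sqlite3.Connection, table: str, key: str) -> list:
    """Return (key, count) pairs for keys that are NULL or occur more than once."""
    # Table and column names cannot be bound as parameters,
    # so only pass trusted identifiers here.
    query = f"""
        SELECT {key}, COUNT(*) AS n
        FROM {table}
        GROUP BY {key}
        HAVING COUNT(*) > 1 OR {key} IS NULL
    """
    return conn.execute(query).fetchall()


def find_untrimmed(conn: sqlite3.Connection, table: str, column: str) -> list:
    """Return values that differ from their trimmed form."""
    query = f"SELECT {column} FROM {table} WHERE {column} != TRIM({column})"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE silver_customers (customer_id INTEGER, customer_name TEXT)")
conn.executemany(
    "INSERT INTO silver_customers VALUES (?, ?)",
    [(1, "Alice"), (2, " Bob"), (2, "Bob"), (None, "Eve")],
)

key_issues = find_key_issues(conn, "silver_customers", "customer_id")
untrimmed = find_untrimmed(conn, "silver_customers", "customer_name")
print(key_issues)
print(untrimmed)
```

Writing each check so that "no rows" means "no problems" keeps the expected result the same for every table it is pointed at.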
Next Steps
- Automate ETL with Python or Airflow
- Deploy on AWS/Azure cloud
- Add interactive Power BI / Tableau dashboards
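
One possible starting point for the Python automation is a small runner that executes the SQL scripts layer by layer. The sketch below is an assumption about how that could look, not something the project already contains: `LAYER_ORDER`, `ordered_scripts`, `run_pipeline`, and the demo script names are hypothetical, and `sqlite3` stands in for SQL Server so the example is runnable as-is.

```python
import sqlite3
import tempfile
from pathlib import Path

# Layers must run in this order because each one reads from the previous one.
LAYER_ORDER = ("bronze", "silver", "gold")


def ordered_scripts(scripts_dir: Path) -> list[Path]:
    """Collect *.sql files layer by layer, sorted by file name within a layer."""
    ordered = []
    for layer in LAYER_ORDER:
        ordered.extend(sorted((scripts_dir / layer).glob("*.sql")))
    return ordered


def run_pipeline(conn: sqlite3.Connection, scripts_dir: Path) -> list[str]:
    """Execute every script in layer order and return what was run."""
    executed = []
    for script in ordered_scripts(scripts_dir):
        conn.executescript(script.read_text(encoding="utf-8"))
        executed.append(f"{script.parent.name}/{script.name}")
    return executed


# Demo with throwaway scripts, deliberately written out of order.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    demo_scripts = {
        "gold/01_view.sql": "CREATE VIEW gold_orders AS SELECT id, amount FROM silver_orders;",
        "silver/01_clean.sql": "CREATE TABLE silver_orders AS SELECT DISTINCT id, amount FROM bronze_orders;",
        "bronze/01_load.sql": (
            "CREATE TABLE bronze_orders (id INTEGER, amount REAL);"
            "INSERT INTO bronze_orders VALUES (1, 9.5), (1, 9.5), (2, 3.0);"
        ),
    }
    for relative_path, sql in demo_scripts.items():
        path = root / relative_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(sql, encoding="utf-8")

    conn = sqlite3.connect(":memory:")
    executed = run_pipeline(conn, root)
    gold_count = conn.execute("SELECT COUNT(*) FROM gold_orders").fetchone()[0]

print(executed, gold_count)
```

Against SQL Server the same loop would need a database driver, and scripts that use the `GO` batch separator would have to be split into batches or run through a SQL Server client, since `GO` is a client-side command rather than part of the SQL language.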
Acknowledgements
This project was inspired by a GitHub data engineering project that I followed and rebuilt for learning purposes.