Skip to content

AkuaDankwaa/sql_data_warehouse_project

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

22 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

sql_data_warehouse_project

Portfolio project demonstrating modern data engineering: Medallion architecture, and ETL.

๐Ÿ—„๏ธ Data Warehouse & Analytics Project

SQL
ETL
Data-Modeling
Status

This is my implementation of a Data Warehouse & Analytics solution, built by following and recreating a GitHub project as a hands-on learning exercise.

It demonstrates how to design a modern data warehouse, build ETL pipelines, create analytical data models, and generate insights โ€” all following data engineering best practices.


๐Ÿ“– Project Overview

  • Data Architecture: Implemented the Medallion Architecture (Bronze, Silver, Gold layers).
  • ETL Pipelines: Extracted, transformed, and loaded CSV datasets (ERP + CRM) into SQL Server.
  • Data Modeling: Designed fact and dimension tables in a star schema format.
  • Analytics & Reporting: Wrote SQL queries and created dashboards for insights.

๐Ÿš€ How to Run

  1. Clone this repo
    ``bash git clone https://github.com//sql_data-warehouse-project.git cd data-warehouse-project

  2. Set up SQL Server

  • Create a new database SalesDW

  • Run the scripts in /scripts/bronze to load raw data

  • Apply transformations from /scripts/silver

  • Build fact & dimension tables from /scripts/gold

  1. Check Documentation
  • Open /docs/ for naming convention, data catalog, data flow and data models diagrams.
  1. Run Analytics Queries
  • Example queries are provided in /scripts/gold/ for reporting.

๐Ÿ“‚ Repository Structure data-warehouse-project/ โ”‚ โ”œโ”€โ”€ datasets/ # ERP & CRM sample data (CSV format) โ”œโ”€โ”€ docs/ # Documentation & diagrams โ”œโ”€โ”€ scripts/ # SQL ETL scripts (bronze, silver, gold layers) โ”œโ”€โ”€ tests/ # Data validation scripts โ””โ”€โ”€ README.md # Project overview

๐Ÿ“Š Screenshots & Diagrams

data_model drawio

๐Ÿ“ What I Learned

Building a layered data architecture for structured pipelines.

  • The importance of data cleaning and validation.

  • Designing star schema data models for efficient analytics.

  • Documenting with data catalogs, ERDs, and naming conventions.

๐Ÿ”ฎ Next Steps

  • Automate ETL with Python or Airflow

  • Deploy on AWS/Azure cloud

  • Add interactive Power BI / Tableau dashboards

๐Ÿ™Œ Acknowledgements

This project was inspired by a GitHub data engineering project that I followed and rebuilt for learning purposes.

About

End-to-end data warehouse solution on SQL Server: ETL, data modelling, and analytics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages