XXUOM Pipeline Simulation

This project simulates a data pipeline that ingests CRM task data from a CSV file (representing an Oracle CRM source table UOM_GAPS.XXUOM_TASKS) into a Bronze Lakehouse layer, performs transformations, and outputs a curated dataset to the Gold Lakehouse layer. The pipeline is built with Python and Pandas, and follows DevOps-aligned principles including CI/CD and automated System Integration Testing (SIT).

🚀 Features

📥 Copy Data activity simulation (Bronze Lakehouse)
🔄 Transform data with business logic (Gold Lakehouse)
✅ Automated validation and data profiling (SIT)
🔁 GitHub Actions CI/CD pipeline
🧪 Testable, modular Python codebase

🛠️ How to Run Locally

Clone the repository

```
git clone https://github.com/profdean3matrix/xxuom_pipeline_simulation.git
cd xxuom_pipeline_simulation
```

Create and activate a virtual environment

```
python -m venv .venv
source .venv/Scripts/activate  # Windows (Git Bash); use .venv/bin/activate on macOS/Linux
```

Install required packages
```
pip install pandas pyarrow
```

Run simulation scripts

Copy to Bronze:
```
python copy_to_bronze.py
```
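For orientation, a copy step of this kind can be sketched as below. This is a minimal illustration and not the actual contents of copy_to_bronze.py: the function names `read_source` and `copy_to_bronze` are hypothetical, and reading every column as text is an assumption made here so that the Bronze layer keeps the raw values unchanged.

```python
from pathlib import Path

import pandas as pd


def read_source(csv_path):
    """Read the source CSV with every column as text (assumed choice for a raw Bronze copy)."""
    # keep_default_na=False leaves blank cells as empty strings instead of NaN.
    return pd.read_csv(csv_path, dtype=str, keep_default_na=False)


def copy_to_bronze(csv_path, bronze_path):
    """Write the raw rows to a Parquet file and return the number of rows copied."""
    df = read_source(csv_path)
    bronze_path = Path(bronze_path)
    bronze_path.parent.mkdir(parents=True, exist_ok=True)
    # to_parquet needs a Parquet engine such as pyarrow, which is installed above.
    df.to_parquet(bronze_path, index=False)
    return len(df)
```

With the file names from this repository, the call would be `copy_to_bronze("UOM_GAPS_XXUOM_TASKS.csv", "bronze_lakehouse/CRM_UOM_TASKS.parquet")`. Drop `keep_default_na=False` if blank cells should be stored as nulls instead.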
Transform to Gold:
```
python transform_to_gold.py
```
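The business logic itself is not described in this README, so the following is only a sketch of what a Bronze to Gold transform could look like. The column names `TASK_ID` and `TASK_TYPE`, and the rule "keep complaint tasks only" (guessed from the Gold file name CRM_TASKS_COMPLAINTS.parquet), are assumptions rather than the real rules in transform_to_gold.py.

```python
import pandas as pd

# Hypothetical column names; the real XXUOM_TASKS schema is not shown in this README.
ID_COL = "TASK_ID"
TYPE_COL = "TASK_TYPE"


def transform_to_gold(bronze: pd.DataFrame) -> pd.DataFrame:
    """Return a curated copy: trimmed text, complaint rows only, one row per task ID."""
    gold = bronze.copy()
    # Strip stray whitespace from the text columns used below.
    for col in (ID_COL, TYPE_COL):
        gold[col] = gold[col].str.strip()
    # Assumed business rule: keep only complaint tasks (case-insensitive match).
    gold = gold[gold[TYPE_COL].str.lower() == "complaint"]
    # Keep the first occurrence of each task ID.
    gold = gold.drop_duplicates(subset=ID_COL, keep="first")
    return gold.reset_index(drop=True)
```

Working on a copy keeps the Bronze frame untouched, so the same input can be re-transformed while the rules are still being adjusted.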
Run SIT Validation:
```
python sit_validate_gold_output.py
```
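The data profiling part of the SIT step could be as simple as a per-column summary. The sketch below assumes nothing about the Gold schema; the function name `profile` is hypothetical and is not taken from sit_validate_gold_output.py.

```python
import pandas as pd


def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise each column: dtype, null count, and number of distinct non-null values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "nulls": df.isna().sum(),
        "distinct": df.nunique(),
    })
```

To profile the Gold output, pass it the result of `pd.read_parquet("gold_lakehouse/CRM_TASKS_COMPLAINTS.parquet")` and print the returned frame.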

📂 Project Structure

```
├── bronze_lakehouse/
│   └── CRM_UOM_TASKS.parquet
├── gold_lakehouse/
│   └── CRM_TASKS_COMPLAINTS.parquet
├── UOM_GAPS_XXUOM_TASKS.csv          # Source data
├── copy_to_bronze.py                 # Ingest from CSV to Bronze
├── transform_to_gold.py              # Transform Bronze → Gold
├── sit_validate_gold_output.py       # Validate Gold layer (SIT)
├── .github/workflows/                # CI/CD pipeline
│   └── python-sit-validation.yml
└── README.md
```

🤖 DevOps & CI/CD

The GitHub Actions pipeline automatically runs sit_validate_gold_output.py on every push or pull request to main.
Validation checks cover schema compliance, duplicates, nulls, and record quality.
A status badge is embedded at the top of this file.
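The checks listed above can be illustrated with a short sketch. It is not the actual validation script: the expected column list, the key column, and the function name `validate_gold` are all hypothetical, since the real Gold schema is not listed in this README.

```python
import pandas as pd

# Hypothetical expectations; the real Gold schema is not listed in this README.
EXPECTED_COLUMNS = ["TASK_ID", "TASK_TYPE", "STATUS"]
KEY_COLUMN = "TASK_ID"


def validate_gold(gold: pd.DataFrame) -> list[str]:
    """Return human-readable failures; an empty list means every check passed."""
    failures = []
    # Schema compliance: every expected column must be present.
    missing = [c for c in EXPECTED_COLUMNS if c not in gold.columns]
    if missing:
        failures.append(f"missing columns: {missing}")
        return failures  # the remaining checks need these columns
    # Duplicates: the key column should identify each row uniquely.
    dupes = int(gold[KEY_COLUMN].duplicated().sum())
    if dupes:
        failures.append(f"{KEY_COLUMN}: {dupes} duplicate value(s)")
    # Nulls: report each expected column that contains missing values.
    null_counts = gold[EXPECTED_COLUMNS].isna().sum()
    for col, n in null_counts[null_counts > 0].items():
        failures.append(f"{col}: {n} null value(s)")
    return failures
```

In a CI run, a script built this way would typically print the failures and exit with a non-zero status when the list is not empty, which is what causes a GitHub Actions step to be marked as failed.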

👤 Author

Dean @profdean3matrix
Multiverse Data Academy – Module 2: DevOps Integration in Microsoft Fabric
