This repository contains the completed project for the LinkedIn Learning course "End-to-End Data Engineering Project". The course is available from LinkedIn Learning.
Data engineering evolves constantly, with new tools and technologies emerging regularly. Building an effective analytics platform can be challenging, especially if you are unfamiliar with the available tools. This course, led by Thalia Barrera, guides data professionals through the implementation of an end-to-end data engineering project using modern open-source tools. It covers best practices in data modeling, testing, documentation, and version control, and demonstrates how to efficiently extract, load, and transform data into a unified, analytics-ready format. Through practical examples, Thalia shows how to construct a robust data pipeline for a fictional ecommerce company.
Ensure you have Python 3 installed. If not, download and install it from the official Python website.
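A quick way to confirm the prerequisite is to ask the interpreter for its version. The sketch below assumes a POSIX shell and that the interpreter is invoked as `python3` (on Windows it is usually `python`). The helper name `has_python3` is made up for illustration and is not part of this repository.

```shell
# Hypothetical helper: succeeds (exit status 0) only if `python3` on PATH
# is a Python 3 interpreter.
has_python3() {
    python3 -c 'import sys; sys.exit(0 if sys.version_info[0] == 3 else 1)'
}

if has_python3; then
    python3 --version   # prints the installed version
else
    echo "Python 3 not found; install it from the official Python website." >&2
fi
```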
- Fork the Repository:
  - Click the "Fork" button in the top-right corner of this repository.
- Clone the Repository:
  - git clone https://github.com/YOUR_USERNAME/end-to-end-data-engineering-project-4413618.git
  - Replace YOUR_USERNAME with your GitHub username.
- Navigate to the Directory:
  - cd end-to-end-data-engineering-project-4413618
- Set Up a Virtual Environment:
  - For Mac:
    python3 -m venv venv
    source venv/bin/activate
  - For Windows:
    python -m venv venv
    .\venv\Scripts\activate
- Install Dependencies:
  - pip install -e ".[dev]"
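After working through the steps above, it can help to confirm that the virtual environment is actually active before relying on the installed dependencies. The following is a minimal sketch, assuming a POSIX shell and a `python3` executable on PATH (use `python` on Windows). The function name `in_venv` is a hypothetical helper, not something this repository provides.

```shell
# Hypothetical helper: prints "True" when Python runs inside a virtual
# environment and "False" otherwise. Inside a venv, sys.prefix points at the
# environment while sys.base_prefix still points at the base installation.
in_venv() {
    python3 -c 'import sys; print(sys.prefix != sys.base_prefix)'
}

echo "Virtual environment active: $(in_venv)"
```

If this reports `False`, re-run the activation command for your platform in the current shell. Once the environment is active, `pip list --editable` should list the projects installed in editable mode.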
Thalia Barrera