This course teaches data professionals to build a robust data pipeline using modern open-source tools. It covers data modeling, testing, documentation, and version control, demonstrating best practices with a fictional e-commerce company.

pranav-k-jha/End-to-End-Data-Engineering-Project-BigQuery

End-to-End Data Engineering Project

This repository contains the completed project for the course "End-to-End Data Engineering Project", available from LinkedIn Learning.

Overview

Data engineering is constantly evolving, with new tools and technologies emerging regularly. Building an effective analytics platform can be challenging, especially if you are unfamiliar with the available tools. This course, led by Thalia Barrera, guides data professionals through an end-to-end data engineering project built with modern open-source tools. It covers best practices in data modeling, testing, documentation, and version control, and shows how to extract, load, and transform data into a unified, analytics-ready format. Through practical examples, Thalia constructs a robust data pipeline for a fictional e-commerce company.

Prerequisites

Ensure you have Python 3 installed. If not, download and install it from the official Python website.
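
As a quick way to confirm this prerequisite, the following minimal sketch reports whether the interpreter that runs it is Python 3. The function name `has_python3` is hypothetical and is not part of the course materials.

```python
import sys


def has_python3(version_info=sys.version_info):
    """Return True when the given version tuple is Python 3 or newer.

    `has_python3` is an illustrative helper, not part of the course code.
    """
    return version_info[0] >= 3


if __name__ == "__main__":
    # Print the running interpreter's version and whether it qualifies.
    status = "OK" if has_python3() else "too old"
    print("Python", sys.version.split()[0], status)
```

Save it to a file and run it with the same `python3` (or `python` on Windows) command you plan to use for the steps below.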

Installing

  1. Fork the Repository:

    • Click the "Fork" button on the top right corner of this repository.
  2. Clone the Repository:

    • git clone https://github.com/YOUR_USERNAME/end-to-end-data-engineering-project-4413618.git
    • Replace YOUR_USERNAME with your GitHub username.
  3. Navigate to the Directory:

    • cd end-to-end-data-engineering-project-4413618
  4. Set Up a Virtual Environment:

    • For macOS or Linux:
      • python3 -m venv venv
      • source venv/bin/activate
    • For Windows:
      • python -m venv venv
      • .\venv\Scripts\activate
  5. Install Dependencies:

    • pip install -e ".[dev]"
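
After completing the steps above, one way to check that the virtual environment is actually active is to compare the interpreter's `sys.prefix` with `sys.base_prefix`, which differ inside an environment created by `python -m venv`. The sketch below assumes that approach; the helper name `in_virtualenv` is hypothetical and not part of the course code.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when the interpreter runs inside a venv-created environment.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    `in_virtualenv` is an illustrative helper, not part of the course code.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print("Virtual environment active:", sys.prefix)
    else:
        print("No virtual environment detected; activate venv first.")
```

Running this before step 5 helps ensure the dependencies are installed into `venv` rather than into the system-wide Python.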

Instructor

Thalia Barrera
