joao-victor-campos/pyspark-data-aggregation-pipeline

PySpark Data Aggregation Pipeline

Python Version · Code style: black · Flake8 · Imports: isort · Checked with mypy · pytest coverage: 100%

Build Status:

Core Docker Image
Tests Docker Image

Introduction

Repository for our PySpark pipeline project, created to put our studies into practice: Python, PySpark, the ETL process, and software engineering skills in general, such as unit and integration testing, CI, Docker, and code quality.

Running the pipeline

Clone the project:

git clone git@github.com:joao-victor-campos/pyspark-data-aggregation-pipeline.git
cd pyspark-data-aggregation-pipeline

Build the Docker image:

docker build -t pyspark-data-aggregation-pipeline .

Run the app in Docker:

docker run pyspark-data-aggregation-pipeline

Expected output:

Pipeline execution finished!!!

Development

Install dependencies

make requirements

Code Style

Apply code style with black and isort

make apply-style

Perform all checks (flake8, black and mypy)

make checks

Testing and Coverage

Unit tests:

make unit-tests

Integration tests:

make integration-tests

All (unit + integration) tests with coverage report:

make tests-coverage
