| Core | Docker Image |
| --- | --- |
Repository for our PySpark data aggregation pipeline project. It was created to put our studies into practice: Python, PySpark, the ETL process, and general software engineering skills such as unit and integration testing, CI, Docker, and code quality.
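To give an idea of the kind of transformation such a pipeline performs, here is a minimal PySpark aggregation sketch. The schema, column names, and file paths are illustrative assumptions, not the repository's actual code.

```python
# Minimal sketch of a PySpark aggregation step (illustrative, not the repo's code).
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("data-aggregation").getOrCreate()


def aggregate_sales(df: DataFrame) -> DataFrame:
    """Group raw records by customer and compute per-customer aggregates."""
    return df.groupBy("customer_id").agg(
        F.sum("amount").alias("total_amount"),
        F.count("*").alias("n_orders"),
    )


# Hypothetical input/output paths used only for this example.
raw = spark.read.csv("data/raw.csv", header=True, inferSchema=True)
aggregate_sales(raw).write.mode("overwrite").parquet("data/aggregated")
```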
Clone the repository and enter the project directory:
git clone git@github.com:joao-victor-campos/pyspark-data-aggregation-pipeline.git
cd pyspark-data-aggregation-pipeline
Build the Docker image and run the pipeline in a container:
docker build -t pyspark-data-aggregation-pipeline .
docker run pyspark-data-aggregation-pipeline
Install the project requirements:
make requirements
Apply code style with black and isort
make apply-style
Perform all checks (flake8, black and mypy)
make checks
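As a reference for what these checks enforce, a hypothetical snippet like the one below would pass them: isort-ordered imports, black-compatible formatting, and full type annotations for mypy. The function itself is not from the repository.

```python
# Hypothetical example of code that satisfies the style and typing checks:
# isort-ordered imports, black formatting, flake8-clean, fully type-annotated.
from pyspark.sql import DataFrame
from pyspark.sql import functions as F


def with_total_column(df: DataFrame, price_col: str, qty_col: str) -> DataFrame:
    """Return the DataFrame with an additional row-level total column."""
    return df.withColumn("total", F.col(price_col) * F.col(qty_col))
```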
Unit tests:
make unit-tests
Integration tests:
make integration-tests
All (unit + integration) tests with coverage report:
make tests-coverage
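For reference, unit tests in a project like this typically build a small local SparkSession and assert on a transformation's output. The sketch below assumes pytest; the transformation and column names are illustrative and are defined inline here rather than imported from the repository's modules.

```python
# Sketch of a unit test for a Spark aggregation, assuming pytest and a local
# SparkSession. The transformation under test is illustrative only.
import pytest
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F


def aggregate_sales(df: DataFrame) -> DataFrame:
    """Illustrative transformation: total amount per customer."""
    return df.groupBy("customer_id").agg(F.sum("amount").alias("total_amount"))


@pytest.fixture(scope="session")
def spark() -> SparkSession:
    """Small local SparkSession shared across the test session."""
    return SparkSession.builder.master("local[1]").appName("unit-tests").getOrCreate()


def test_aggregate_sales_sums_per_customer(spark: SparkSession) -> None:
    df = spark.createDataFrame(
        [("a", 10.0), ("a", 5.0), ("b", 1.0)], ["customer_id", "amount"]
    )
    result = {
        row["customer_id"]: row["total_amount"]
        for row in aggregate_sales(df).collect()
    }
    assert result == {"a": 15.0, "b": 1.0}
```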