This project contains all code related to the pre-processing and ETL pipeline for getting data from eCRF systems to harmonized PRIME-ROSE variables and finally to the OMOP CDM.
This project is in the early stages of development. It uses uv for dependency management and includes automated formatting and linting with black and flake8 via pre-commit hooks.
git clone git@github.com:pcm-primerose/omop_etl.git
cd omop_etl

Use uv to run and manage project dependencies:
uv run <command>

For example:
uv run pytest # Run the tests
uv run pre-commit run --all-files # Run pre-commit hooks
uv run python3 some_file.py # Run a specific file

Follow these steps to contribute:
git checkout -b your-branch-name

Ensure your code works and is easy to read: ideally, write tests and keep the code clean, modular, and extensible. Always use type hints!
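As a rough illustration of the type-hint and testing guidance above, here is a minimal sketch of a fully annotated function together with a pytest-style test. The helper `parse_optional_int` is hypothetical and not part of omop_etl; it only shows the intended style.

```python
from typing import Optional


def parse_optional_int(raw: Optional[str]) -> Optional[int]:
    """Convert a raw text field to an int, or None if it is empty or missing.

    Hypothetical helper for illustration only; not part of omop_etl.
    """
    if raw is None:
        return None
    stripped = raw.strip()
    if not stripped:
        return None
    # int() raises ValueError on non-numeric text, which surfaces bad input early.
    return int(stripped)


def test_parse_optional_int() -> None:
    # pytest collects functions named test_* and runs their assertions.
    assert parse_optional_int(" 42 ") == 42
    assert parse_optional_int("") is None
    assert parse_optional_int(None) is None
```

Assuming the code is saved in a file such as `test_example.py` (an illustrative name), it could be run with `uv run pytest test_example.py`.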
Run pre-commit hooks locally to validate:
uv run pre-commit run --all-files

Submit a Pull Request:
Once your changes are complete, push your branch and open a pull request:
git push origin your-branch-name

This project is in its early stages; additional documentation will be added as development progresses.