DART-Pipeline

Python 3.11+ Tests Docker Documentation status Ruff

Data analysis pipeline for the Dengue Advanced Readiness Tools (DART) project.

The aim of this project is to develop a scalable and reproducible pipeline for the joint analysis of epidemiological, climate, and behavioural data to anticipate and predict dengue outbreaks.

Documentation | Contributing Guide

Setup

We use uv to set up and manage Python versions and dependencies. uv can be installed with the following command:

curl -LsSf https://astral.sh/uv/install.sh | sh

If you do not have curl installed, install it with brew install curl on macOS, sudo apt install curl on Debian/Ubuntu, or sudo dnf install curl on Fedora/RHEL.

If you have an existing virtual environment for DART-Pipeline, remove it, as uv manages the .venv folder itself. Once uv is installed, you can run dart-pipeline as follows:

git clone https://github.com/kraemer-lab/DART-Pipeline
cd DART-Pipeline
uv sync
uv run dart-pipeline

Previewing output

To generate a plot for a metric, run uv run dart-pipeline plot with the --format=png parameter:

uv run dart-pipeline plot --format=png --size 6,9 ~/.local/share/dart-pipeline/output/VNM/worldpop/VNM-2-2020-worldpop.pop_count.parquet

Generated files can be previewed in the file manager. By default, dart-pipeline stores outputs in ~/.local/share/dart-pipeline/output/ISO3, where ISO3 is the three-letter country code (VNM in the example above). The plot command also accepts wildcards to plot multiple data files at once.
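
To check which data files a wildcard would pick up before plotting, it can help to list them first. The following is a minimal sketch, not part of dart-pipeline: the function names and the DART_OUTPUT_ROOT override are hypothetical, and only the default base path comes from the text above.

```shell
# Hypothetical helper: print the default output directory for an ISO3 code.
# DART_OUTPUT_ROOT is an illustrative override for this sketch only; it is
# not a variable that dart-pipeline is known to read.
dart_output_dir() {
    printf '%s/%s\n' \
        "${DART_OUTPUT_ROOT:-$HOME/.local/share/dart-pipeline/output}" "$1"
}

# Hypothetical helper: list every parquet file for a country, at any depth,
# one path per line in sorted order.
dart_list_parquet() {
    find "$(dart_output_dir "$1")" -type f -name '*.parquet' | sort
}

# Example (run manually): dart_list_parquet VNM
```

Using find rather than a shell glob avoids depending on how a particular shell expands recursive patterns.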

Previewing plots in terminal [optional]

The supplied preview.sh script lets you preview previously generated plots in the terminal, using a fuzzy file search, by passing an ISO3 code: ./bin/preview.sh ISO3. The script requires fzf and a terminal image viewer such as chafa or imgcat:

brew install chafa fzf       # macOS
sudo apt install chafa fzf   # Debian/Ubuntu
sudo dnf install chafa fzf   # Fedora

# Windows (winget)
winget install -e --id junegunn.fzf
winget install -e --id hpjansson.Chafa

A sixel-capable terminal is also required to preview plots in the terminal. Some terminals integrated into common editors, such as Visual Studio Code, support sixel and the iTerm image protocol. Alternatively, you can use the file manager to preview plots.

# if you have a sixel compatible terminal, you can directly see the plot in the terminal
uv run dart-pipeline plot ~/.local/share/dart-pipeline/output/VNM/worldpop/VNM-2-2020-worldpop.pop_count.parquet
# otherwise, pass the --format=png parameter to save the plot as a PNG file:
uv run dart-pipeline plot --format=png ~/.local/share/dart-pipeline/output/VNM/worldpop/VNM-2-2020-worldpop.pop_count.parquet

# We recommend trying a few plot sizes (with the --size width,height option) on one file before plotting every file
uv run dart-pipeline plot --format=png --size 6,9 ~/.local/share/dart-pipeline/output/VNM/worldpop/VNM-2-2020-worldpop.pop_count.parquet

# Once the plot looks ok, we can generate all the plots
uv run dart-pipeline plot --format=png --size 6,9 ~/.local/share/dart-pipeline/output/VNM/**/*.parquet

# Preview generated plots
./bin/preview.sh VNM
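
For readers curious how a fuzzy preview of this kind can be wired together, here is a rough sketch. It is not the contents of bin/preview.sh; the function names are hypothetical, and it assumes the PNG files sit under the default output directory for the given ISO3 code, which the text above does not confirm.

```shell
# Hypothetical helper: print the first available terminal image viewer,
# or return non-zero if neither chafa nor imgcat is on PATH.
pick_viewer() {
    for viewer in chafa imgcat; do
        if command -v "$viewer" >/dev/null 2>&1; then
            printf '%s\n' "$viewer"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: fuzzy-search PNG plots for an ISO3 code and show the
# highlighted file in the fzf preview pane. The PNG location is an assumption.
preview_plots() {
    dir="$HOME/.local/share/dart-pipeline/output/$1"
    viewer=$(pick_viewer) || { echo "install chafa or imgcat" >&2; return 1; }
    command -v fzf >/dev/null 2>&1 || { echo "install fzf" >&2; return 1; }
    find "$dir" -type f -name '*.png' | fzf --preview "$viewer {}"
}

# Example (run manually): preview_plots VNM
```

How well the image renders inside the preview pane depends on the fzf version and the terminal's graphics support.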

Development

Development requires the dev packages to be installed:

uv sync --all-extras
uv run pytest

The project uses pre-commit hooks; run pre-commit install to install them.

Authors and Acknowledgments

  • OxRSE
    • John Brittain
    • Abhishek Dasgupta
    • Rowan Nicholls
  • Kraemer Group, Department of Biology
    • Moritz Kraemer
    • Prathyush Sambaturu
  • Oxford e-Research Centre, Engineering Science
    • Sarah Sparrow
