Data analysis pipeline for the Dengue Advanced Readiness Tools (DART) project.
The aim of this project is to develop a scalable and reproducible pipeline for the joint analysis of epidemiological, climate, and behavioural data to anticipate and predict dengue outbreaks.
We use uv to set up and manage Python versions and dependencies. uv can be installed with the following command:
curl -LsSf https://astral.sh/uv/install.sh | sh
If you do not have curl installed, install it with brew install curl on macOS, sudo apt install curl on Debian/Ubuntu, or sudo dnf install curl on Fedora/RHEL.
If you have an existing virtual environment for the DART pipeline, remove it, because uv manages the .venv folder itself. Once uv is installed, you can run dart-pipeline as follows:
git clone https://github.com/kraemer-lab/DART-Pipeline
cd DART-Pipeline
uv sync
uv run dart-pipeline
Plots can be generated for metrics by running the uv run dart-pipeline plot command with the --format=png parameter:
uv run dart-pipeline plot --format=png --size 6,9 ~/.local/share/dart-pipeline/output/VNM/worldpop/VNM-2-2020-worldpop.pop_count.parquet
The generated files can be previewed in the file manager. By default, dart-pipeline stores outputs in ~/.local/share/dart-pipeline/output/ISO3, where ISO3 is the three-letter country code (VNM in the examples here). The plot command can also take wildcards to plot multiple data files at once.
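To check which data files a wildcard would pick up before plotting, the output tree can be listed from Python. This is only an illustrative sketch and not part of dart-pipeline: the helper name find_outputs is hypothetical, and it assumes the default output location given above, with one directory per ISO3 code.

```python
from pathlib import Path

# Default output root as stated in this README; adjust if your setup differs.
DEFAULT_ROOT = Path.home() / ".local/share/dart-pipeline/output"


def find_outputs(iso3: str, root: Path = DEFAULT_ROOT) -> list[Path]:
    """Return all .parquet files stored under root/ISO3, sorted by path.

    Hypothetical helper for illustration; not part of dart-pipeline.
    """
    country_dir = root / iso3.upper()
    if not country_dir.is_dir():
        return []
    # rglob searches all subdirectories, similar to VNM/**/*.parquet
    return sorted(country_dir.rglob("*.parquet"))


if __name__ == "__main__":
    for path in find_outputs("VNM"):
        print(path)
```

Note that in bash the ** pattern only recurses when the globstar option is enabled (shopt -s globstar); zsh recurses by default.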
A preview.sh script is supplied that lets you preview previously generated plots in the terminal, with a fuzzy file search, by supplying an ISO3 code: ./bin/preview.sh ISO3. The preview script requires fzf for the fuzzy search and a terminal image viewer such as chafa or imgcat:
brew install chafa fzf # macOS
sudo apt install chafa fzf # Debian/Ubuntu
sudo dnf install chafa fzf # Fedora
# Windows (winget)
winget install -e --id junegunn.fzf
winget install -e --id hpjansson.Chafa
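Before running the preview script, it can help to confirm that the tools are on your PATH. The sketch below is a hypothetical check and not part of dart-pipeline; it assumes the script needs fzf plus at least one of chafa or imgcat, which is an interpretation of the requirement above rather than something read from preview.sh itself.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names in tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def preview_ready(which=shutil.which):
    """Assumed requirement: fzf and at least one terminal image viewer."""
    has_fzf = which("fzf") is not None
    has_viewer = any(which(v) is not None for v in ("chafa", "imgcat"))
    return has_fzf and has_viewer


if __name__ == "__main__":
    absent = missing_tools(["fzf", "chafa", "imgcat"])
    if absent:
        print("Not found on PATH:", ", ".join(absent))
    print("Preview prerequisites met:", preview_ready())
```

The which parameter exists only so that the lookup can be replaced in tests; in normal use the default shutil.which is sufficient.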
A sixel-capable terminal is also required to preview plots in the terminal. Some terminals integrated into common editors, such as Visual Studio Code, support sixel and the iTerm image protocol. Alternatively, you can use the file manager to preview plots.
# if you have a sixel compatible terminal, you can directly see the plot in the terminal
uv run dart-pipeline plot ~/.local/share/dart-pipeline/output/VNM/worldpop/VNM-2-2020-worldpop.pop_count.parquet
# otherwise, pass the --format=png parameter to save the plot as a PNG file:
uv run dart-pipeline plot --format=png ~/.local/share/dart-pipeline/output/VNM/worldpop/VNM-2-2020-worldpop.pop_count.parquet
# We recommend trying a few plot sizes (with the --size width,height option) before generating all the plots
uv run dart-pipeline plot --format=png --size 6,9 ~/.local/share/dart-pipeline/output/VNM/worldpop/VNM-2-2020-worldpop.pop_count.parquet
# Once the plot looks ok, we can generate all the plots
uv run dart-pipeline plot --format=png --size 6,9 ~/.local/share/dart-pipeline/output/VNM/**/*.parquet
# Preview generated plots
./bin/preview.sh VNM
Development requires the dev packages to be installed:
uv sync --all-extras
uv run pytest
The project uses pre-commit hooks. Run pre-commit install to install them.
- OxRSE
  - John Brittain
  - Abhishek Dasgupta
  - Rowan Nicholls
- Kraemer Group, Department of Biology
  - Moritz Kraemer
  - Prathyush Sambaturu
- Oxford e-Research Centre, Engineering Science
  - Sarah Sparrow