Curated collection of Docker images for reproducible bioinformatics workflows in the WILDS.
The WILDS Docker Library is a centralized repository of Docker images maintained by the Fred Hutch Data Science Lab (DaSL) for use in bioinformatics and computational workflows. This repository hosts Dockerfiles for a wide range of bioinformatics tools, ensuring consistent, reproducible, and secure environments for research workflows.
Each tool has its own directory with:
- Dockerfiles for specific versions
- Comprehensive documentation
- Security vulnerability reports
- Usage examples
The repository includes Docker images for popular bioinformatics tools including (but not limited to):
Tool | Description | Available Versions |
---|---|---|
AnnotSV | Structural variant annotation | 3.4.4, latest |
ANNOVAR | Functional annotation of genetic variants | hg19, hg38, latest |
ArcPy | Python package for spatial analysis | latest |
AWS CLI | Amazon Web Services command line interface | 2.27.49, latest |
BCFtools | VCF/BCF manipulation | 1.19, 1.11, latest |
BEDtools | Genome arithmetic toolkit | 2.31.1, latest |
biobambam2 | Tools for SAM/BAM processing | 2.0.185, latest |
BWA | Burrows-Wheeler Aligner | 0.7.17, latest |
Cell Ranger | 10x Genomics analysis | 6.0.2, latest |
CNVkit | Copy number variation detection | 0.9.10, latest |
combine-counts | Tool for combining count matrices | 0.1.0, latest |
consensus | Consensus sequence generation | 0.1.1, latest |
DELLY | Structural variant discovery | 1.2.9, latest |
DESeq2 | Differential gene expression analysis | 1.40.2, latest |
GATK | Genome analysis toolkit | 4.6.1.0, 4.3.0.0, latest |
gtf-smash | GTF file manipulation | v8, latest |
HISAT2 | Graph-based alignment | 2.2.1, latest |
HMMcopy | Copy number analysis | 1.0.0, latest |
ichorCNA | Copy number alteration detection | 0.2.0, latest |
Manta | Structural variant discovery | 1.6.0, latest |
Picard | Sequence data manipulation | 3.1.1, latest |
python-dl | Python deep learning environment | 1.0, latest |
RNA-SeQC | RNA-seq quality control | 2.4.2, latest |
RTorch | R interface to PyTorch | 0.13.0, latest |
Samtools | SAM/BAM manipulation | 1.19, 1.11, 1.10, latest |
Scanpy | Single-cell analysis | 1.10.2, latest |
scvi-tools | Deep learning for single-cell | 1.1.6, latest |
ShapeMapper | RNA structure mapping | 2.3, latest |
Smoove | SV calling and genotyping | 0.2.8, latest |
sourmash | k-mer analysis for genomic data | 4.8.2, latest |
SRA-tools | NCBI Sequence Read Archive toolkit | 3.1.1, latest |
STAR | RNA-seq aligner | 2.7.6a, 2.7.4a, latest |
Strelka | Small variant calling | 2.9.10, latest |
UMI-tools | UMI handling for NGS data | 1.1.6, latest |
VarScan | Variant detection in massively parallel sequencing | 2.4.6, latest |
For a full list of available tools and versions, browse the repository directories.
# Pull from Docker Hub
docker pull getwilds/TOOLNAME:VERSION
# Pull from GitHub Container Registry
docker pull ghcr.io/getwilds/TOOLNAME:VERSION
# Example: Run a STAR alignment
docker run --rm -v /path/to/data:/data getwilds/star:latest STAR --runThreadN 4 --genomeDir /data/genome --readFilesIn /data/reads_1.fq /data/reads_2.fq --outFileNamePrefix /data/output/
# Pull from Docker Hub
apptainer pull docker://getwilds/TOOLNAME:VERSION
# Pull from GitHub Container Registry
apptainer pull docker://ghcr.io/getwilds/TOOLNAME:VERSION
# Example: Run a STAR alignment
apptainer run --bind /path/to/data:/data docker://getwilds/star:latest STAR --runThreadN 4 --genomeDir /data/genome --readFilesIn /data/reads_1.fq /data/reads_2.fq --outFileNamePrefix /data/output/
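The pull commands above differ only in the registry prefix. As a minimal sketch, the hypothetical helper `wilds_image_ref` (not part of this repository) builds an image reference from a tool name, a version, and a registry. It assumes the image name is the lowercase tool name, as in the `getwilds/star` example.

```shell
# Hypothetical helper (not part of the WILDS repository): build an image
# reference for Docker Hub (the default) or the GitHub Container Registry.
wilds_image_ref() {
    tool=$1
    version=${2:-latest}
    registry=${3:-dockerhub}
    case "$registry" in
        dockerhub) printf '%s\n' "getwilds/${tool}:${version}" ;;
        ghcr)      printf '%s\n' "ghcr.io/getwilds/${tool}:${version}" ;;
        *)         echo "unknown registry: $registry" >&2; return 1 ;;
    esac
}

# Print both references for one tool and version.
wilds_image_ref star 2.7.6a
wilds_image_ref star 2.7.6a ghcr
```

Pass the result to `docker pull`, or prefix it with `docker://` for `apptainer pull`.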
All Docker images in this repository undergo regular security scanning:
- Build-time scanning: Each image is automatically scanned for vulnerabilities when built
- Monthly scanning: A scheduled workflow scans all images on the first day of each month
- On-demand scanning: Repository maintainers can trigger scans for specific images at any time
Each tool directory contains vulnerability reports (`CVEs_*.md`) detailing:
- Detected vulnerabilities with severity ratings
- Affected components and versions
- Available fixes or mitigations
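To find the reports for one tool in a local clone, a small sketch like the following may help. The function name `list_cve_reports` is hypothetical, and the code relies only on the `CVEs_*.md` naming pattern mentioned above.

```shell
# Hypothetical helper: list the vulnerability reports in one tool directory,
# relying only on the CVEs_*.md naming pattern.
list_cve_reports() {
    tool_dir=$1
    for report in "$tool_dir"/CVEs_*.md; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$report" ] || continue
        printf '%s\n' "$report"
    done
}

# Example, run from the root of a local clone (directory name is illustrative):
# list_cve_reports star
```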
If a particular tool's vulnerabilities concern you, please file an issue in the GitHub repository citing the portion of the vulnerability report you would like addressed.
The repository uses GitHub Actions to automate several processes:
Defined in `.github/workflows/docker-update.yml`, this workflow:
- Builds and publishes Docker images when Dockerfiles are modified
- Pushes images to both GitHub Container Registry and Docker Hub
- Updates repository descriptions based on README content
- Generates vulnerability reports for each newly built image
Defined in `.github/workflows/docker-scout.yml`, this workflow:
- Scans all Docker images monthly for security vulnerabilities using Docker Scout
- Generates comprehensive vulnerability reports in each tool's respective subdirectory
Defined in `.github/workflows/dockerfile-linting.yml`, this workflow:
- Automatically lints Dockerfiles for best practices and security issues
- Ensures consistent formatting and optimization across all images
We welcome contributions to improve and expand the WILDS Docker Library. Please review the following guidelines:
Each tool should have its own directory with:
- Dockerfile(s) following the naming convention `Dockerfile_VERSION`
- A comprehensive `README.md` describing the tool, versions, and usage examples
- Vulnerability reports generated by our workflows
- Start from minimal base images (e.g., `ubuntu`, `python`, `r-base`)
- Pin all software versions for reproducibility
- Include all required labels (see template below)
- Keep images small (ideally a few hundred MB, 2 GB at most)
- Focus on a single tool per image (1-2 tools max)
- Document all installed components
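The size guideline can be checked before opening a PR. The sketch below is one possible approach, not part of the repository's tooling: `image_size_ok` is a hypothetical helper, and reading the 2 GB ceiling as 2 GiB (2147483648 bytes) is an assumption.

```shell
# Upper bound from the guidelines, interpreted here as 2 GiB (an assumption).
MAX_IMAGE_BYTES=2147483648

# Hypothetical helper: succeed if a size in bytes is within the limit.
image_size_ok() {
    [ "$1" -le "$MAX_IMAGE_BYTES" ]
}

# With Docker available, the size of a locally built image can be read with:
#   docker image inspect --format '{{.Size}}' test-image
if image_size_ok 350000000; then
    echo "within limit"
fi
```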
Each Dockerfile must include these labels:
LABEL org.opencontainers.image.title="toolname"
LABEL org.opencontainers.image.description="Short description of the tool and its purpose"
LABEL org.opencontainers.image.version="1.0"
LABEL org.opencontainers.image.authors="youremail@fredhutch.org"
LABEL org.opencontainers.image.url=https://hutchdatascience.org/
LABEL org.opencontainers.image.documentation=https://getwilds.org/
LABEL org.opencontainers.image.source=https://github.com/getwilds/wilds-docker-library
LABEL org.opencontainers.image.licenses=MIT
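A quick way to confirm that a Dockerfile carries every label listed above is to search for each key. This is a minimal sketch with a hypothetical function name, `check_labels`. It assumes each label sits on its own `LABEL` line as in the template, so labels combined into a single `LABEL` instruction would not be detected.

```shell
# Hypothetical helper: report any required OCI label key that is missing
# from a Dockerfile. Returns non-zero if at least one is missing.
check_labels() {
    dockerfile=$1
    missing=0
    for key in title description version authors url documentation source licenses; do
        if ! grep -q "^LABEL org\.opencontainers\.image\.${key}=" "$dockerfile"; then
            echo "missing label: org.opencontainers.image.${key}"
            missing=1
        fi
    done
    return "$missing"
}

# Example (path follows the TOOLNAME/Dockerfile_VERSION convention):
# check_labels TOOLNAME/Dockerfile_VERSION
```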
Before submitting a PR:
- Test your Dockerfile locally: `docker build -t test-image -f TOOLNAME/Dockerfile_VERSION .`
- Verify functionality: `docker run --rm test-image [command-to-test-functionality]`
- Run a local security scan (if Docker Scout is available): `docker scout cves test-image`
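The three checks above can be generated for a specific tool and version. The sketch below only prints the commands and executes nothing. The name `pr_check_commands` is hypothetical, and using `star` as the directory name is an illustrative assumption.

```shell
# Hypothetical helper: print the local test commands for one tool/version,
# following the TOOLNAME/Dockerfile_VERSION layout. Nothing is executed.
pr_check_commands() {
    tool=$1
    version=$2
    dockerfile="${tool}/Dockerfile_${version}"
    echo "docker build -t test-image -f ${dockerfile} ."
    echo "docker run --rm test-image [command-to-test-functionality]"
    echo "docker scout cves test-image"
}

pr_check_commands star 2.7.6a
```

Replace the bracketed placeholder with a real command for the tool before running the second line.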
- Fork the repository and create a feature branch
- Make changes following our guidelines
- Submit a pull request against the main branch
- Our CI/CD pipeline will automatically build test images and generate security reports
- Reviewers will provide feedback and approve changes
For questions, bug reports, or feature requests:
- Open an issue
- Email the Fred Hutch Data Science Lab at wilds@fredhutch.org
Distributed under the MIT License. See `LICENSE` for details.