WILDS Docker Library

Curated collection of Docker images for reproducible bioinformatics workflows in the WILDS.

Project Status: Prototype – Useable, some support, open to feedback, unstable API. License: MIT

Overview

The WILDS Docker Library is a centralized repository of Docker images maintained by the Fred Hutch Data Science Lab (DaSL) for use in bioinformatics and computational workflows. This repository hosts Dockerfiles for a wide range of bioinformatics tools, ensuring consistent, reproducible, and secure environments for research workflows.

Each tool has its own directory with:

  • Dockerfiles for specific versions
  • Comprehensive documentation
  • Security vulnerability reports
  • Usage examples

Available Tools

The repository provides Docker images for popular bioinformatics tools, including (but not limited to):

| Tool | Description | Available Versions |
|------|-------------|--------------------|
| AnnotSV | Structural variant annotation | 3.4.4, latest |
| ANNOVAR | Functional annotation of genetic variants | hg19, hg38, latest |
| ArcPy | Python package for spatial analysis | latest |
| AWS CLI | Amazon Web Services command line interface | 2.27.49, latest |
| BCFtools | VCF/BCF manipulation | 1.19, 1.11, latest |
| BEDtools | Genome arithmetic toolkit | 2.31.1, latest |
| biobambam2 | Tools for SAM/BAM processing | 2.0.185, latest |
| BWA | Burrows-Wheeler Aligner | 0.7.17, latest |
| Cell Ranger | 10x Genomics analysis | 6.0.2, latest |
| CNVkit | Copy number variation detection | 0.9.10, latest |
| combine-counts | Tool for combining count matrices | 0.1.0, latest |
| consensus | Consensus sequence generation | 0.1.1, latest |
| DELLY | Structural variant discovery | 1.2.9, latest |
| DESeq2 | Differential gene expression analysis | 1.40.2, latest |
| GATK | Genome analysis toolkit | 4.6.1.0, 4.3.0.0, latest |
| gtf-smash | GTF file manipulation | v8, latest |
| HISAT2 | Graph-based alignment | 2.2.1, latest |
| HMMcopy | Copy number analysis | 1.0.0, latest |
| ichorCNA | Copy number alteration detection | 0.2.0, latest |
| Manta | Structural variant discovery | 1.6.0, latest |
| Picard | Sequence data manipulation | 3.1.1, latest |
| python-dl | Python deep learning environment | 1.0, latest |
| RNA-SeQC | RNA-seq quality control | 2.4.2, latest |
| RTorch | R interface to PyTorch | 0.13.0, latest |
| Samtools | SAM/BAM manipulation | 1.19, 1.11, 1.10, latest |
| Scanpy | Single-cell analysis | 1.10.2, latest |
| scvi-tools | Deep learning for single-cell | 1.1.6, latest |
| ShapeMapper | RNA structure mapping | 2.3, latest |
| Smoove | SV calling and genotyping | 0.2.8, latest |
| sourmash | k-mer analysis for genomic data | 4.8.2, latest |
| SRA-tools | NCBI Sequence Read Archive toolkit | 3.1.1, latest |
| STAR | RNA-seq aligner | 2.7.6a, 2.7.4a, latest |
| Strelka | Small variant calling | 2.9.10, latest |
| UMI-tools | UMI handling for NGS data | 1.1.6, latest |
| VarScan | Variant detection in massively parallel sequencing | 2.4.6, latest |

For a full list of available tools and versions, browse the repository directories.

Usage

Docker

# Pull from Docker Hub
docker pull getwilds/TOOLNAME:VERSION

# Pull from GitHub Container Registry
docker pull ghcr.io/getwilds/TOOLNAME:VERSION

# Example: Run a STAR alignment
docker run --rm -v /path/to/data:/data getwilds/star:latest \
  STAR --runThreadN 4 \
  --genomeDir /data/genome \
  --readFilesIn /data/reads_1.fq /data/reads_2.fq \
  --outFileNamePrefix /data/output/
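
For reproducible runs it usually helps to pin a version tag rather than rely on latest. The helper below is a minimal sketch that builds an image reference from the TOOLNAME:VERSION pattern above. The function name wilds_image is hypothetical, and the mapping from tool name to repository name (lowercasing) is an assumption that may not hold for every tool in the table.

```shell
#!/usr/bin/env bash
# wilds_image TOOL [VERSION] [REGISTRY]
# Prints an image reference following the getwilds/TOOLNAME:VERSION pattern.
# Assumption: the repository name is the lowercase tool name (as in getwilds/star).
wilds_image() {
  local tool version registry
  tool=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  version="${2:-latest}"
  registry="${3:-}"   # empty for Docker Hub, "ghcr.io" for GitHub Container Registry
  if [ -n "$registry" ]; then
    printf '%s/getwilds/%s:%s\n' "$registry" "$tool" "$version"
  else
    printf 'getwilds/%s:%s\n' "$tool" "$version"
  fi
}

# Usage (requires Docker, so it is left commented out here):
#   docker pull "$(wilds_image samtools 1.19)"
#   docker run --rm "$(wilds_image samtools 1.19)" samtools --version
```

Printing the tool's version from inside the container is a quick way to confirm that the tag you pulled is the one you expected.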

Singularity/Apptainer

# Pull from Docker Hub
apptainer pull docker://getwilds/TOOLNAME:VERSION

# Pull from GitHub Container Registry
apptainer pull docker://ghcr.io/getwilds/TOOLNAME:VERSION

# Example: Run a STAR alignment
apptainer run --bind /path/to/data:/data docker://getwilds/star:latest \
  STAR --runThreadN 4 \
  --genomeDir /data/genome \
  --readFilesIn /data/reads_1.fq /data/reads_2.fq \
  --outFileNamePrefix /data/output/
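
On shared clusters it can be convenient to convert the image once into a local SIF file and reuse it, instead of resolving the docker:// URI on every run. The following is a sketch under that assumption; sif_name and pull_and_exec are hypothetical helper names, and the file naming scheme is only a suggestion.

```shell
#!/usr/bin/env bash
# sif_name TOOL VERSION -> local file name for the converted image
sif_name() {
  printf '%s_%s.sif\n' "$1" "$2"
}

# pull_and_exec TOOL VERSION COMMAND [ARGS...]
# Pulls the image to a SIF file if it is not already present,
# then runs COMMAND inside it with `apptainer exec`.
pull_and_exec() {
  local tool="$1" version="$2" sif
  shift 2
  sif=$(sif_name "$tool" "$version")
  [ -f "$sif" ] || apptainer pull "$sif" "docker://getwilds/${tool}:${version}" || return 1
  apptainer exec --bind /path/to/data:/data "$sif" "$@"
}

# Usage (requires Apptainer, so it is left commented out here):
#   pull_and_exec star 2.7.6a STAR --version
```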

Security and Vulnerability Monitoring

Automated Security Scanning

All Docker images in this repository undergo regular security scanning:

  1. Build-time scanning: Each image is automatically scanned for vulnerabilities when built
  2. Monthly scanning: A scheduled workflow scans all images on the first day of each month
  3. On-demand scanning: Repository maintainers can trigger scans for specific images at any time

Vulnerability Reports

Each tool directory contains vulnerability reports (CVEs_*.md) detailing:

  • Detected vulnerabilities with severity ratings
  • Affected components and versions
  • Available fixes or mitigations
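
To review these reports from a local clone, something like the sketch below may help. It relies only on the CVEs_*.md naming stated above. The assumption that reports mention identifiers in the standard CVE-YYYY-NNNN form is ours, and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# list_reports [ROOT]
# Lists every vulnerability report under ROOT (default: current directory).
list_reports() {
  find "${1:-.}" -type f -name 'CVEs_*.md' | sort
}

# cve_ids FILE...
# Prints the unique CVE identifiers mentioned in the given reports.
# Assumption: identifiers appear in the standard CVE-YYYY-NNNN form.
cve_ids() {
  grep -ohE 'CVE-[0-9]{4}-[0-9]{4,}' "$@" | sort -u
}

# Usage from the repository root:
#   list_reports star
#   cve_ids star/CVEs_*.md
```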

If a specific tool's vulnerabilities are of particular concern, please file an issue in the GitHub repo citing the portion of the vulnerability report you would like addressed.

Automated Workflows

The repository uses GitHub Actions to automate several processes:

Docker Build and Publishing Workflow

Defined in .github/workflows/docker-update.yml, this workflow:

  • Builds and publishes Docker images when Dockerfiles are modified
  • Pushes images to both GitHub Container Registry and Docker Hub
  • Updates repository descriptions based on README content
  • Generates vulnerability reports for each newly built image
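
The link between a modified Dockerfile and the image it produces can be illustrated with a short sketch. This is not the logic in docker-update.yml, which is not shown here. It only assumes the TOOLNAME/Dockerfile_VERSION layout and the getwilds/TOOLNAME:VERSION tag pattern described elsewhere in this README, and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# image_tag_for PATH
# Maps TOOLNAME/Dockerfile_VERSION to getwilds/TOOLNAME:VERSION.
# Returns non-zero for paths that are not versioned Dockerfiles.
image_tag_for() {
  local path="$1" tool file
  tool=$(basename "$(dirname "$path")")
  file=$(basename "$path")
  case "$file" in
    Dockerfile_*) printf 'getwilds/%s:%s\n' "$tool" "${file#Dockerfile_}" ;;
    *) return 1 ;;
  esac
}

# changed_dockerfiles [BASE]
# Lists Dockerfiles changed on the current branch relative to BASE
# (default: origin/main). Requires a git checkout.
changed_dockerfiles() {
  git diff --name-only "${1:-origin/main}...HEAD" | grep -E '(^|/)Dockerfile_[^/]+$'
}

# Usage:
#   changed_dockerfiles | while IFS= read -r f; do image_tag_for "$f"; done
```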

Security Monitoring Workflow

Defined in .github/workflows/docker-scout.yml, this workflow:

  • Scans all Docker images monthly for security vulnerabilities using Docker Scout
  • Generates comprehensive vulnerability reports in each tool's respective subdirectory
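
A comparable report can be produced locally with Docker Scout's markdown output. The sketch below is not the workflow itself. The exact report file name (CVEs_VERSION.md) is an assumption based on the CVEs_*.md pattern mentioned in this README, and the function names are hypothetical. Setting DOCKER=echo prints the command arguments instead of running them.

```shell
#!/usr/bin/env bash
# report_path TOOL VERSION -> TOOL/CVEs_VERSION.md
# Assumption: one report per version, named after the version tag.
report_path() {
  printf '%s/CVEs_%s.md\n' "$1" "$2"
}

# scan_to_report TOOL VERSION
# Scans getwilds/TOOL:VERSION and writes a markdown report
# into the tool's subdirectory.
scan_to_report() {
  local tool="$1" version="$2"
  "${DOCKER:-docker}" scout cves --format markdown \
    --output "$(report_path "$tool" "$version")" \
    "getwilds/${tool}:${version}"
}

# Usage from the repository root (requires Docker Scout):
#   scan_to_report star latest
```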

Dockerfile Linting Workflow

Defined in .github/workflows/dockerfile-linting.yml, this workflow:

  • Automatically lints Dockerfiles for best practices and security issues
  • Ensures consistent formatting and optimization across all images
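
Contributors who want similar feedback before pushing can lint locally. The README does not name the linter the workflow uses, so hadolint here is an assumption, as is the function name lint_all. The LINTER variable lets you substitute a different tool.

```shell
#!/usr/bin/env bash
# lint_all [ROOT]
# Runs a Dockerfile linter on every TOOLNAME/Dockerfile_VERSION under ROOT.
# Assumption: hadolint is installed; override with LINTER=/path/to/linter.
lint_all() {
  local root="${1:-.}" status=0 f
  for f in "$root"/*/Dockerfile_*; do
    [ -f "$f" ] || continue              # the glob matched nothing
    "${LINTER:-hadolint}" "$f" || status=1
  done
  return "$status"
}

# Usage from the repository root:
#   lint_all .
```

The function keeps going after a failure and returns non-zero at the end, so a single run reports problems in all Dockerfiles.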

Contributing

We welcome contributions to improve and expand the WILDS Docker Library. Please review the following guidelines:

Directory Structure

Each tool should have its own directory with:

  • Dockerfile(s) following the naming convention Dockerfile_VERSION
  • A comprehensive README.md describing the tool, versions, and usage examples
  • Vulnerability reports generated by our workflows
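
A quick pre-submission check of this layout might look like the sketch below. It only verifies the two items a contributor provides by hand (README.md and at least one Dockerfile_VERSION), since the vulnerability reports are generated by the workflows. The function name check_tool_dir is hypothetical.

```shell
#!/usr/bin/env bash
# check_tool_dir DIR
# Prints one line per missing item and returns non-zero if anything is missing.
check_tool_dir() {
  local dir="$1" status=0
  if [ ! -f "$dir/README.md" ]; then
    echo "missing: $dir/README.md"
    status=1
  fi
  # Expand the glob into the function's positional parameters;
  # if nothing matches, $1 stays a literal pattern and fails the -f test.
  set -- "$dir"/Dockerfile_*
  if [ ! -f "$1" ]; then
    echo "missing: $dir/Dockerfile_VERSION"
    status=1
  fi
  return "$status"
}

# Usage from the repository root:
#   check_tool_dir star
```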

Dockerfile Guidelines

  • Start from minimal base images (e.g., ubuntu, python, r-base)
  • Pin all software versions for reproducibility
  • Include all required labels (see template below)
  • Keep images small (ideally a few hundred MB, 2 GB at most)
  • Focus each image on a single tool (two at most)
  • Document all installed components
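
The size guideline can be checked after a local build. The sketch below reads the image size in bytes from docker image inspect and compares it with a limit. Interpreting "2 GB" as 2 GiB (2 × 1024³ bytes) is an assumption, and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# image_size_bytes IMAGE
# Prints the size of a local image in bytes (requires Docker).
image_size_bytes() {
  docker image inspect --format '{{.Size}}' "$1"
}

# within_size_limit BYTES [LIMIT_GIB]
# Succeeds if BYTES does not exceed LIMIT_GIB gibibytes (default: 2).
within_size_limit() {
  local bytes="$1" limit_gib="${2:-2}"
  [ "$bytes" -le $((limit_gib * 1024 * 1024 * 1024)) ]
}

# Usage (requires a locally built image):
#   within_size_limit "$(image_size_bytes test-image)" || echo "image exceeds 2 GiB"
```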

Required Labels

Each Dockerfile must include these labels:

LABEL org.opencontainers.image.title="toolname"
LABEL org.opencontainers.image.description="Short description of the tool and its purpose"
LABEL org.opencontainers.image.version="1.0"
LABEL org.opencontainers.image.authors="youremail@fredhutch.org"
LABEL org.opencontainers.image.url="https://hutchdatascience.org/"
LABEL org.opencontainers.image.documentation="https://getwilds.org/"
LABEL org.opencontainers.image.source="https://github.com/getwilds/wilds-docker-library"
LABEL org.opencontainers.image.licenses="MIT"
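
Before opening a PR, you can check a Dockerfile for these labels with a few lines of shell. This is a minimal sketch, not part of the repository's tooling. It assumes one LABEL instruction per line, as in the template above, and the function name missing_labels is hypothetical.

```shell
#!/usr/bin/env bash
# The eight label suffixes from the template above.
REQUIRED_LABELS="title description version authors url documentation source licenses"

# missing_labels DOCKERFILE
# Prints each required label that the Dockerfile does not set.
# Prints nothing when all labels are present.
missing_labels() {
  local name
  for name in $REQUIRED_LABELS; do
    grep -Eq "^[[:space:]]*LABEL[[:space:]]+org\.opencontainers\.image\.${name}=" "$1" \
      || printf 'org.opencontainers.image.%s\n' "$name"
  done
}

# Usage:
#   missing_labels TOOLNAME/Dockerfile_VERSION
```

A Dockerfile that sets several labels in a single multi-line LABEL instruction would be reported as incomplete by this check, so treat its output as a hint rather than a verdict.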

Testing Your Changes

Before submitting a PR:

  1. Test your Dockerfile locally:

    docker build -t test-image -f TOOLNAME/Dockerfile_VERSION .
  2. Verify functionality:

    docker run --rm test-image [command-to-test-functionality]
  3. Run a local security scan (if Docker Scout is available):

    docker scout cves test-image
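
The three steps above can be chained so that a failed build or functional test stops the run early. The wrapper below is a sketch: pre_pr_check is a hypothetical name, and the test command is whatever you would pass to docker run for your tool. Setting DOCKER=echo prints the command arguments instead of executing them.

```shell
#!/usr/bin/env bash
# pre_pr_check TOOLNAME VERSION COMMAND [ARGS...]
# Builds the image, runs COMMAND inside it, then attempts a Scout scan.
# Run from the repository root.
pre_pr_check() {
  local tool="$1" version="$2" docker_cmd="${DOCKER:-docker}"
  shift 2
  "$docker_cmd" build -t test-image -f "$tool/Dockerfile_$version" . || return 1
  "$docker_cmd" run --rm test-image "$@" || return 1
  # Docker Scout is optional, so a failure here only produces a warning.
  "$docker_cmd" scout cves test-image \
    || echo "warning: Docker Scout scan skipped or failed" >&2
}

# Usage (requires Docker):
#   pre_pr_check star 2.7.6a STAR --version
```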

Pull Request Process

  1. Fork the repository and create a feature branch
  2. Make changes following our guidelines
  3. Submit a pull request against the main branch
  4. Our CI/CD pipeline will automatically build test images and generate security reports
  5. Reviewers will provide feedback and approve changes

Support

For questions, bug reports, or feature requests, please file an issue in the GitHub repo.

License

Distributed under the MIT License. See LICENSE for details.
