EpiGenomicsCode/ProteinDesign-Containers

Singularity Container for RFdiffusion

This repository contains scripts to build a Singularity container for the RFdiffusion protein structure generation model. This container is optimized for x86_64 systems with NVIDIA GPUs and has been tested on such systems.

Key Features:

  • RFdiffusion Support: Provides a containerized environment for running RFdiffusion.
    • x86_64 architecture support.
    • Model weights are automatically downloaded during the container build process.
    • The container includes all necessary dependencies to run RFdiffusion.

Important Note on RFdiffusion Usage:

  • While the container provides the core RFdiffusion software and its model weights, users should refer to the official RFdiffusion GitHub repository for detailed instructions on various design protocols, input file formats, and advanced configurations.

Prerequisites

  • Access to a system with Singularity (or Apptainer) installed.
  • Access to the internet to download dependencies and clone repositories during the build.
  • Access to a SLURM-managed cluster (if using the provided build_container.slurm script).
  • NVIDIA drivers compatible with CUDA 11.6.2 on the host system if GPU support is needed.

Model Weights Download

The RFdiffusion container automatically downloads the required model weights during the build process using the scripts/download_models.sh script from the official RFdiffusion repository. These weights are then moved to the RFdiffusion Python package path within the container, making them directly accessible to the software. No separate manual download or handling of model weights is typically required to use the container.

For specific model versions or advanced use cases, refer to the RFdiffusion documentation.

Files

This repository is structured as follows:

  • build_container.slurm: SLURM batch script located in the root directory to build the Singularity container. Requires user modification for cluster settings.
  • rfdiffusion/: Directory containing files specific to RFdiffusion.
    • rfdiffusion_x86.def: Singularity definition file for x86_64 systems.
    • run_rfdiffusion_launcher.py: Python script for convenient execution of the RFdiffusion container.

Pre-built Containers (Sylabs Cloud)

Pre-built Singularity image files (.sif) based on this definition are available on Sylabs Cloud. You can pull the x86_64 container directly using the following command:

# x86_64 architecture  
singularity pull --arch amd64 library://rfdiffusion/repo/rfdiffusion:amd64

Note: To verify the authenticity of the containers, first import the signing public key (the image signature can then be checked with singularity verify):

singularity key import keys/mypublic.pem 

Building the Container

The build_container.slurm script (located in the repository root) can be used to build the RFdiffusion container on a SLURM-managed cluster.

Using the SLURM Script (build_container.slurm)

  1. Navigate to the Repository Root: Ensure your terminal is in the root directory of this repository where build_container.slurm is located.

  2. Configure SLURM Script: Before submitting, open build_container.slurm and ensure the following placeholder values are correctly set for your cluster environment:

    • #SBATCH --partition=YOUR_PARTITION: Set this to the appropriate SLURM partition/queue.
    • #SBATCH --account=YOUR_ACCOUNT: Set this to your SLURM allocation/account name.
    • (Optional) Review and adjust the other SBATCH directives. The script uses variables such as ${MODEL_DIR_NAME} and ${ARCH}, which take the values rfdiffusion and x86 for the build described here.
  3. Submit the SLURM Job: Use the sbatch command to submit the build job.

    sbatch build_container.slurm rfdiffusion x86 /path/to/your/build/directory

    Arguments:

    • rfdiffusion: The model's directory name.
    • x86: The target architecture.
    • /path/to/your/build/directory: The directory where the SIF file will be saved.

    Example:

    To build RFdiffusion for x86, outputting to /scratch/user/rfdiffusion_build:

    sbatch build_container.slurm rfdiffusion x86 /scratch/user/rfdiffusion_build
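Before submitting, it can help to confirm that the placeholders from step 2 have actually been replaced. The following is a minimal sketch, not part of this repository: the helper names find_unset_placeholders and build_sbatch_command are hypothetical, and the only facts taken from the text above are the two placeholder tokens and the order of the three sbatch arguments.

```python
# Placeholder tokens named in the "Configure SLURM Script" step.
PLACEHOLDERS = ("YOUR_PARTITION", "YOUR_ACCOUNT")


def find_unset_placeholders(script_text):
    """Return (line_number, line) pairs for #SBATCH lines still holding a placeholder."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#SBATCH") and any(p in stripped for p in PLACEHOLDERS):
            hits.append((number, stripped))
    return hits


def build_sbatch_command(build_dir, model="rfdiffusion", arch="x86",
                         script="build_container.slurm"):
    """Assemble the submission command as an argv list (hypothetical helper)."""
    return ["sbatch", script, model, arch, build_dir]


# Example with an in-memory script; in practice the text would be read
# from build_container.slurm in the repository root.
example = "#!/bin/bash\n#SBATCH --partition=YOUR_PARTITION\n#SBATCH --account=my_account\n"
for number, line in find_unset_placeholders(example):
    print(f"line {number} still has a placeholder: {line}")
```

The argv list returned by build_sbatch_command can be passed to subprocess.run once the placeholder check comes back empty.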

Troubleshooting

  • If you encounter CUDA errors, try updating your NVIDIA drivers on the host system to the latest version compatible with CUDA 11.6.x.
  • On some systems the command is apptainer rather than singularity.
  • If you continue experiencing CUDA version mismatches, rebuilding the container with a matching CUDA base image is recommended, though this may involve changing other dependencies like PyTorch and DGL versions in the %post section of the definition file.
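For scripts that need to work on both kinds of system, the choice between the two command names can be made at run time. This is a small sketch under stated assumptions: pick_container_runtime is a hypothetical helper, and preferring singularity over apptainer when both are installed is an arbitrary choice made here, not something this repository prescribes.

```python
import shutil


def pick_container_runtime(which=shutil.which):
    """Return the first container command found on PATH.

    `which` is injectable so the lookup can be exercised without
    either program being installed.
    """
    for name in ("singularity", "apptainer"):
        if which(name) is not None:
            return name
    raise FileNotFoundError("neither singularity nor apptainer was found on PATH")
```

Calling pick_container_runtime() with no arguments consults the real PATH and raises FileNotFoundError if neither command is available.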

Running the Container

Refer to the %runscript and %help (if added) sections within the rfdiffusion/rfdiffusion_x86.def file for low-level details. The runscript executes python3.9 scripts/run_inference.py "$@" from /app/RFdiffusion.

General Execution (Example):

singularity run --nv /path/to/your/rfdiffusion_x86.sif \
    inference.output_prefix=/outputs/test_design \
    inference.input_pdb=/inputs/my_protein.pdb \
    'contigmap.contigs=[100-100]' \
    inference.num_designs=1

Remember to bind mount your input and output directories (e.g., using -B /host/inputs:/inputs,/host/outputs:/outputs).
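The pieces described in this section (GPU flag, bind mounts, image path, inference arguments) can also be assembled programmatically. The sketch below is illustrative only: build_run_command is a hypothetical helper, and it uses the run subcommand because the runscript is what forwards its arguments to run_inference.py. The --nv and -B options are standard Singularity options; the paths are the example values used above.

```python
import shlex


def build_run_command(sif_path, overrides, binds=(), use_gpu=True,
                      runtime="singularity"):
    """Assemble an argv list that invokes the container's runscript."""
    cmd = [runtime, "run"]
    if use_gpu:
        cmd.append("--nv")  # expose the host NVIDIA driver inside the container
    if binds:
        cmd += ["-B", ",".join(binds)]  # comma-separated host:container pairs
    cmd.append(sif_path)
    cmd += list(overrides)  # forwarded unchanged to run_inference.py
    return cmd


cmd = build_run_command(
    "/path/to/your/rfdiffusion_x86.sif",
    [
        "inference.output_prefix=/outputs/test_design",
        "inference.input_pdb=/inputs/my_protein.pdb",
        "contigmap.contigs=[100-100]",
        "inference.num_designs=1",
    ],
    binds=["/host/inputs:/inputs", "/host/outputs:/outputs"],
)
# shlex.join quotes arguments such as the bracketed contig string for the shell.
print(shlex.join(cmd))
```

Building the command as a list and handing it to subprocess.run avoids shell quoting problems with the square brackets in contig strings.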

Running RFdiffusion Predictions (using Launcher Script)

The rfdiffusion/run_rfdiffusion_launcher.py script provides a convenient way to run predictions using the RFdiffusion Singularity container.

  1. Prerequisites:

    • Python 3.x
    • spython and absl-py Python libraries: pip install spython absl-py
    • A built RFdiffusion Singularity container (e.g., rfdiffusion_x86.sif).
  2. Configuration:

    • Ensure the launcher script can find your SIF file. You can:
      1. Set the RFDIFFUSION_SIF environment variable: export RFDIFFUSION_SIF=/path/to/your/rfdiffusion_x86.sif
      2. Modify the _RFDIFFUSION_SIF_PATH variable directly in rfdiffusion/run_rfdiffusion_launcher.py.
      3. Use the --sif_path command-line argument when running the launcher.
  3. Execution: The launcher script accepts common RFdiffusion parameters as flags and passes any unrecognized arguments directly to the run_inference.py script inside the container.

    python rfdiffusion/run_rfdiffusion_launcher.py \
        --input_pdb /path/to/input.pdb \
        --output_prefix /path/to/output_dir/design_prefix \
        --num_designs 5 \
        "contigmap.contigs=[50-50/A1-20/50-50]" \
        "ppi.hotspot_res=[A10,A12]"

    Replace the example paths and RFdiffusion arguments with your actual settings. Input and output paths will be automatically bind-mounted into the container.

    Key Launcher Flags:

    • --sif_path: Path to the RFdiffusion Singularity image SIF file (overrides environment variable and hardcoded path).
    • --input_pdb: Path to the input PDB file (e.g., for motif scaffolding, binder design). Corresponds to inference.input_pdb.
    • --output_prefix: (Required) Prefix for output files. Corresponds to inference.output_prefix.
    • --model_directory_path: Path to custom model weights if not using the container's default. Corresponds to inference.model_directory_path.
    • --num_designs: Number of designs to generate. Corresponds to inference.num_designs.
    • --use_gpu: Set to true (default) or false to enable/disable NVIDIA GPU runtime.
    • --gpu_devices: Comma-separated list of GPU devices for NVIDIA_VISIBLE_DEVICES (e.g., '0,1').
    • --bind_mounts: List of additional paths to bind mount (e.g., "/host/data:/data_in_container").

    Passing RFdiffusion-Specific Arguments: Any arguments not recognized by the launcher (such as "contigmap.contigs=[...]" and "ppi.hotspot_res=[...]" in the example above, or "potentials.guiding_potentials=[...]") are passed directly to the run_inference.py script within the container. Refer to the RFdiffusion documentation for the full list of available inference arguments.

    Run python rfdiffusion/run_rfdiffusion_launcher.py --helpfull to see all launcher-specific options and how arguments are handled.
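To make the configuration and flag-mapping rules above concrete, here is a rough sketch of the logic they describe. It is not the launcher's actual code: resolve_sif_path, to_overrides, and DEFAULT_SIF_PATH are hypothetical names, and the relative priority of the environment variable over the hardcoded path is an assumption (the text only states that --sif_path overrides both). A real launcher would also have to translate host paths into their bind-mounted locations, which this sketch leaves out.

```python
import os

# Stand-in for the hardcoded path inside the launcher (hypothetical value).
DEFAULT_SIF_PATH = "/path/to/your/rfdiffusion_x86.sif"

# Launcher flag name -> run_inference.py argument, as listed under Key Launcher Flags.
FLAG_TO_OVERRIDE = {
    "input_pdb": "inference.input_pdb",
    "output_prefix": "inference.output_prefix",
    "model_directory_path": "inference.model_directory_path",
    "num_designs": "inference.num_designs",
}


def resolve_sif_path(flag_value=None, environ=None, default=DEFAULT_SIF_PATH):
    """Pick the SIF path: flag first, then RFDIFFUSION_SIF, then the default.

    The env-before-default ordering is an assumption of this sketch.
    """
    environ = os.environ if environ is None else environ
    if flag_value:
        return flag_value
    return environ.get("RFDIFFUSION_SIF") or default


def to_overrides(flags, passthrough=()):
    """Translate recognized flags into key=value arguments and append the rest."""
    if not flags.get("output_prefix"):
        raise ValueError("--output_prefix is required")
    overrides = [
        f"{key}={flags[name]}"
        for name, key in FLAG_TO_OVERRIDE.items()
        if flags.get(name) is not None
    ]
    return overrides + list(passthrough)
```

For instance, to_overrides({"output_prefix": "/out/design", "num_designs": 5}, ["contigmap.contigs=[50-50/A1-20/50-50]"]) yields the two inference.* arguments followed by the contig string, unchanged.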

Acknowledgements

  • RFdiffusion by Baker Lab, University of Washington. RFdiffusion builds directly on the architecture and trained parameters of RoseTTAFold. We therefore thank Frank DiMaio and Minkyung Baek, who developed RoseTTAFold. For details, see the RFdiffusion paper and GitHub.
  • This research project is generously funded by the Cornell University BRC Epigenomics Core Facility (RRID:SCR_021287), the Penn State Institute for Computational and Data Sciences (RRID:SCR_025154), and the Penn State University Center for Applications of Artificial Intelligence and Machine Learning to Industry Core Facility (AIMI) (RRID:SCR_022867), and is supported by a gift to AIMI research from Dell Technologies.
  • Computational support was provided by NSF ACCESS to William KM Lai and Gretta Kellogg through BIO230041.
