This repository contains scripts to build a Singularity container for the RFdiffusion protein structure generation model. This container is optimized for x86_64 systems with NVIDIA GPUs and has been tested on such systems.
Key Features:
- RFdiffusion Support: Provides a containerized environment for running RFdiffusion.
- x86_64 architecture support.
- Model weights are automatically downloaded during the container build process.
- The container includes all necessary dependencies to run RFdiffusion.
Important Note on RFdiffusion Usage:
- While the container provides the core RFdiffusion software and its model weights, users should refer to the official RFdiffusion GitHub repository for detailed instructions on various design protocols, input file formats, and advanced configurations.
Prerequisites:
- Access to a system with Singularity (or Apptainer) installed.
- Access to the internet to download dependencies and clone repositories during the build.
- Access to a SLURM-managed cluster (if using the provided `build_container.slurm` script).
- NVIDIA drivers compatible with CUDA 11.6.2 on the host system if GPU support is needed.
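Before building, a quick sanity check can confirm that the container runtime and GPU driver are available. This is a minimal sketch using standard commands; which of `singularity` or `apptainer` is present depends on your system:

```bash
# Confirm the container runtime is on PATH (one of these should succeed)
singularity --version || apptainer --version

# Confirm the NVIDIA driver is loaded; the reported CUDA version should be 11.6 or newer
nvidia-smi
```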
The RFdiffusion container automatically downloads the required model weights during the build process using the `scripts/download_models.sh` script from the official RFdiffusion repository. These weights are then moved to the RFdiffusion Python package path within the container, making them directly accessible to the software. No separate manual download or handling of model weights is typically required to use the container.
For specific model versions or advanced use cases, refer to the RFdiffusion documentation.
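For reference, the weight handling during the build amounts to something like the following `%post` fragment. This is an illustrative sketch only, not the definition file verbatim: the `download_models.sh` invocation and the destination inside the installed package are assumptions.

```bash
# Sketch of the weight-download step in %post (paths and arguments are assumptions)
cd /app/RFdiffusion
bash scripts/download_models.sh ./models   # official download script; check its usage for the exact argument

# Relocate the weights next to the installed rfdiffusion package so inference
# finds them without an explicit inference.model_directory_path override
PKG_DIR=$(python3.9 -c "import rfdiffusion, os; print(os.path.dirname(rfdiffusion.__file__))")
mkdir -p "${PKG_DIR}/models"
mv ./models/* "${PKG_DIR}/models/"
```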
This repository is structured as follows:
- `build_container.slurm`: SLURM batch script located in the root directory to build the Singularity container. Requires user modification for cluster settings.
- `rfdiffusion/`: Directory containing files specific to RFdiffusion.
  - `rfdiffusion_x86.def`: Singularity definition file for x86_64 systems.
  - `run_rfdiffusion_launcher.py`: Python script for convenient execution of the RFdiffusion container.
Pre-built Singularity image files (`.sif`) based on this definition are available on Sylabs Cloud. You can pull the x86_64 container directly using the following command:

```bash
# x86_64 architecture
singularity pull --arch amd64 library://rfdiffusion/repo/rfdiffusion:amd64
```
Note: To verify the authenticity of the containers, first import the public key:

```bash
singularity key import keys/mypublic.pem
```
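After importing the key, you can check the signature on the pulled image with `singularity verify`; the filename below is the default produced by the pull command above:

```bash
singularity verify rfdiffusion_amd64.sif
```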
The `build_container.slurm` script (located in the repository root) can be used to build the RFdiffusion container on a SLURM-managed cluster.

1. Navigate to the Repository Root: Ensure your terminal is in the root directory of this repository, where `build_container.slurm` is located.
2. Configure SLURM Script: Before submitting, open `build_container.slurm` and ensure the following placeholder values are correctly set for your cluster environment (see the sketch after these steps):
   - `#SBATCH --partition=YOUR_PARTITION`: Set this to the appropriate SLURM partition/queue.
   - `#SBATCH --account=YOUR_ACCOUNT`: Set this to your SLURM allocation/account name.
   - (Optional) Review and adjust other SBATCH directives. The script uses variables such as `${MODEL_DIR_NAME}`, which would be `rfdiffusion`, and `${ARCH}`, which would be `x86`.
3. Submit the SLURM Job: Use the `sbatch` command to submit the build job.

   ```bash
   sbatch build_container.slurm rfdiffusion x86 /path/to/your/build/directory
   ```

   Arguments:
   - `rfdiffusion`: The model's directory name.
   - `x86`: The target architecture.
   - `/path/to/your/build/directory`: The directory where the SIF file will be saved.

   Example: To build RFdiffusion for x86, outputting to `/scratch/user/rfdiffusion_build`:

   ```bash
   sbatch build_container.slurm rfdiffusion x86 /scratch/user/rfdiffusion_build
   ```
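For orientation, the edited header of `build_container.slurm` might look roughly like the sketch below. The partition, account, and resource values are placeholders for your cluster, and the time/memory directives are assumptions, not requirements:

```bash
#!/bin/bash
#SBATCH --partition=gpu            # was YOUR_PARTITION; set to your cluster's partition/queue
#SBATCH --account=my_allocation    # was YOUR_ACCOUNT; set to your SLURM allocation/account
#SBATCH --time=04:00:00            # assumed build time limit; adjust as needed
#SBATCH --mem=32G                  # assumed memory request; adjust as needed

# The positional arguments map to variables such as:
#   MODEL_DIR_NAME=rfdiffusion   ARCH=x86
```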
- If you encounter CUDA errors, try updating the NVIDIA drivers on the host system to the latest version compatible with CUDA 11.6.x.
- Some systems may need to use the `apptainer` command instead of `singularity`.
- If you continue to experience CUDA version mismatches, rebuilding the container with a matching CUDA base image is recommended, though this may involve changing other dependencies, such as the PyTorch and DGL versions, in the `%post` section of the definition file (see the illustrative fragment below).
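For illustration only, these are the kinds of lines that must change together when switching the CUDA base image. The image tag and package versions below are assumptions, not the contents of `rfdiffusion_x86.def`:

```bash
# Definition-file header: choose a CUDA base image compatible with the host driver
#   Bootstrap: docker
#   From: nvidia/cuda:11.6.2-cudnn8-runtime-ubuntu20.04

# In %post: keep the PyTorch and DGL wheels in sync with that CUDA version
pip install torch==1.12.1+cu116 --extra-index-url https://download.pytorch.org/whl/cu116
pip install dgl -f https://data.dgl.ai/wheels/cu116/repo.html
```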
Refer to the `%runscript` and `%help` (if added) sections within the `rfdiffusion/rfdiffusion_x86.def` file for low-level details. The runscript executes `python3.9 scripts/run_inference.py "$@"` from `/app/RFdiffusion`.
General Execution (Example): Because the runscript already invokes `run_inference.py`, you can use `singularity run` and pass only the inference arguments:

```bash
singularity run --nv /path/to/your/rfdiffusion_x86.sif \
    inference.output_prefix=/outputs/test_design \
    inference.input_pdb=/inputs/my_protein.pdb \
    'contigmap.contigs=[100-100]' \
    inference.num_designs=1
```

Remember to bind mount your input and output directories (e.g., using `-B /host/inputs:/inputs,/host/outputs:/outputs`), as in the example below.
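Putting the two together, a full invocation with bind mounts might look like this; the host paths are placeholders, and the container-side `/inputs` and `/outputs` locations are your choice:

```bash
singularity run --nv \
    -B /host/inputs:/inputs,/host/outputs:/outputs \
    /path/to/your/rfdiffusion_x86.sif \
    inference.output_prefix=/outputs/test_design \
    inference.input_pdb=/inputs/my_protein.pdb \
    'contigmap.contigs=[100-100]' \
    inference.num_designs=1
```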
The `rfdiffusion/run_rfdiffusion_launcher.py` script provides a convenient way to run predictions using the RFdiffusion Singularity container.
1. Prerequisites:
   - Python 3.x
   - The `spython` and `absl-py` Python libraries: `pip install spython absl-py`
   - A built RFdiffusion Singularity container (e.g., `rfdiffusion_x86.sif`).
2. Configuration: Ensure the launcher script can find your SIF file. You can:
   - Set the `RFDIFFUSION_SIF` environment variable: `export RFDIFFUSION_SIF=/path/to/your/rfdiffusion_x86.sif`
   - Modify the `_RFDIFFUSION_SIF_PATH` variable directly in `rfdiffusion/run_rfdiffusion_launcher.py`.
   - Use the `--sif_path` command-line argument when running the launcher.
3. Execution: The launcher script accepts common RFdiffusion parameters as flags and passes any unrecognized arguments directly to the `run_inference.py` script inside the container.

   ```bash
   python rfdiffusion/run_rfdiffusion_launcher.py \
       --input_pdb /path/to/input.pdb \
       --output_prefix /path/to/output_dir/design_prefix \
       --num_designs 5 \
       "contigmap.contigs=[50-50/A1-20/50-50]" \
       "ppi.hotspot_res=[A10,A12]"
   ```

   Replace the example paths and RFdiffusion arguments with your actual settings. Input and output paths will be automatically bind-mounted into the container.
Key Launcher Flags:
- `--sif_path`: Path to the RFdiffusion Singularity image (SIF) file; overrides the environment variable and the hardcoded path.
- `--input_pdb`: Path to the input PDB file (e.g., for motif scaffolding or binder design). Corresponds to `inference.input_pdb`.
- `--output_prefix`: (Required) Prefix for output files. Corresponds to `inference.output_prefix`.
- `--model_directory_path`: Path to custom model weights if not using the container's default. Corresponds to `inference.model_directory_path`.
- `--num_designs`: Number of designs to generate. Corresponds to `inference.num_designs`.
- `--use_gpu`: Set to `true` (default) or `false` to enable/disable the NVIDIA GPU runtime.
- `--gpu_devices`: Comma-separated list of GPU devices for `NVIDIA_VISIBLE_DEVICES` (e.g., `0,1`).
- `--bind_mounts`: List of additional paths to bind mount (e.g., `"/host/data:/data_in_container"`).
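A sketch combining several of these flags (all paths and device indices are placeholders; run `--helpfull` to confirm the exact syntax for list-valued flags such as `--bind_mounts`):

```bash
python rfdiffusion/run_rfdiffusion_launcher.py \
    --sif_path /path/to/your/rfdiffusion_x86.sif \
    --input_pdb /path/to/input.pdb \
    --output_prefix /path/to/output_dir/design_prefix \
    --num_designs 3 \
    --gpu_devices 0 \
    --bind_mounts "/host/data:/data_in_container" \
    "contigmap.contigs=[50-50/A1-20/50-50]"
```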
Passing RFdiffusion-Specific Arguments: Any arguments not recognized by the launcher (such as `"contigmap.contigs=[...]"` and `"ppi.hotspot_res=[...]"` in the example above, or `"potentials.guiding_potentials=[...]"`) are passed directly to the `run_inference.py` script within the container. Refer to the RFdiffusion documentation for the full list of available inference arguments. Run `python rfdiffusion/run_rfdiffusion_launcher.py --helpfull` to see all launcher-specific options and how arguments are handled.
- RFdiffusion by Baker Lab, University of Washington. RFdiffusion builds directly on the architecture and trained parameters of RoseTTAFold. We therefore thank Frank DiMaio and Minkyung Baek, who developed RoseTTAFold. For details, see the RFdiffusion paper and GitHub.
- The research project is generously funded by the Cornell University BRC Epigenomics Core Facility (RRID:SCR_021287), the Penn State Institute for Computational and Data Sciences (RRID:SCR_025154), and the Penn State University Center for Applications of Artificial Intelligence and Machine Learning to Industry Core Facility (AIMI) (RRID:SCR_022867), and is supported by a gift to AIMI research from Dell Technologies.
- Computational support was provided by NSF ACCESS to William KM Lai and Gretta Kellogg through BIO230041.