A Nextflow plugin for integrating NVIDIA NIMs (NVIDIA Inference Microservices) as custom executors for bioinformatics workflows.
This plugin provides a generic nim executor that can run NVIDIA NIM services for biological computing, specifically:
- RFDiffusion - Protein structure generation and design
Add the plugin to your nextflow.config:
plugins {
id 'nf-nim'
}
To pin a specific version, append it to the plugin ID:
plugins {
id 'nf-nim@0.1.0'
}
The plugin uses NVIDIA's hosted API endpoints by default. You will need to set up authentication with NVIDIA API keys as described in the NVIDIA NIM documentation.
Set your API key as an environment variable:
export NVCF_RUN_KEY="your-nvidia-api-key-here"
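A missing key usually only shows up once a task reaches the API, so a pre-flight check can save a failed run. The sketch below is one way to do that; the function name require_nvcf_key is hypothetical and not part of the plugin.

```shell
# Hypothetical helper (not part of nf-nim): fail fast when the API key is missing.
require_nvcf_key() {
    if [ -z "${NVCF_RUN_KEY:-}" ]; then
        echo "NVCF_RUN_KEY is not set; export it before running the pipeline" >&2
        return 1
    fi
    # Report only the length so the key itself never lands in logs.
    echo "NVCF_RUN_KEY is set (${#NVCF_RUN_KEY} characters)"
}
```

It can be chained in front of a run, for example `require_nvcf_key && nextflow run main.nf`, where main.nf stands in for your pipeline script.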
Default hosted endpoints:
- RFDiffusion: https://health.api.nvidia.com/v1/biology/ipd/rfdiffusion/generate
You can configure custom NIM endpoints in your nextflow.config:
nim {
rfdiffusion {
endpoint = 'http://your-nim-server:8080/biology/ipd/rfdiffusion/generate'
}
}
Use the nim executor in your processes and specify which NIM service to use with task.ext.nim:
process myNIMProcess {
executor 'nim'
input:
// your inputs
output:
// your outputs
script:
task.ext.nim = "rfdiffusion"
"""
# Your script here - the NIM executor handles the actual API calls
echo "Running ${task.ext.nim} analysis"
"""
}
process rfdiffusionDesign {
executor 'nim'
input:
path pdb_file
output:
path "output.pdb"
script:
task.ext.nim = "rfdiffusion"
"""
echo "Designing protein structure using RFDiffusion"
"""
}
Parameters for RFDiffusion can be set in params:
params.contigs = "A20-60/0 50-100"
params.hotspot_res = ["A50","A51","A52","A53","A54"]
params.diffusion_steps = 15
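The same values can also be kept outside the pipeline script and supplied at launch through Nextflow's -params-file option. The sketch below writes them to a JSON file; the helper name write_rfdiffusion_params and the file name params.json are illustrative assumptions.

```shell
# Hypothetical helper: write the RFDiffusion parameters to a JSON params file.
# Prints the path of the file it wrote.
write_rfdiffusion_params() {
    out="${1:-params.json}"
    cat > "$out" <<'EOF'
{
  "contigs": "A20-60/0 50-100",
  "hotspot_res": ["A50", "A51", "A52", "A53", "A54"],
  "diffusion_steps": 15
}
EOF
    echo "$out"
}

# Usage (main.nf is a placeholder for your pipeline script):
#   nextflow run main.nf -params-file "$(write_rfdiffusion_params)"
```

Keeping the list-valued hotspot_res in a file avoids having to quote a list on the command line.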
#!/usr/bin/env nextflow
params.pdb_file = "input.pdb"
workflow {
// Structure-based design with RFDiffusion
if (params.pdb_file) {
designProtein(file(params.pdb_file))
}
}
process designProtein {
executor 'nim'
input:
path pdb_file
output:
path "designed.pdb"
script:
task.ext.nim = "rfdiffusion"
"""
echo "Designing protein based on ${pdb_file}"
"""
}
- Input: PDB file containing protein structure
- Parameters:
  - params.contigs - Contigs specification (default: "A20-60/0 50-100")
  - params.hotspot_res - Hotspot residues (default: ["A50","A51","A52","A53","A54"])
  - params.diffusion_steps - Number of diffusion steps (default: 15)
Test NIM service availability:
# RFDiffusion
curl -v -H "Authorization: Bearer $NVCF_RUN_KEY" \
https://health.api.nvidia.com/v1/biology/ipd/rfdiffusion/generate
You can test the RFDiffusion endpoint with a sample PDB structure:
# Download a sample PDB file and test the API
curl -s https://files.rcsb.org/download/1R42.pdb | \
grep '^ATOM' | head -n 400 | \
awk '{printf "%s\\n", $0}' > sample_pdb.txt
# Test the API call
curl -X POST \
-H "Authorization: Bearer $NVCF_RUN_KEY" \
-H "Content-Type: application/json" \
-d '{
"input_pdb": "'"$(cat sample_pdb.txt)"'",
"contigs": "A20-60/0 50-100",
"hotspot_res": ["A50", "A51", "A52", "A53", "A54"],
"diffusion_steps": 15
}' \
https://health.api.nvidia.com/v1/biology/ipd/rfdiffusion/generate
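For repeated testing, the request body can be built by a small function instead of inline quoting. This is a minimal sketch, assuming ATOM records contain no double quotes or backslashes; the function name build_rfdiffusion_payload and its optional record limit are hypothetical.

```shell
# Hypothetical helper: build an RFDiffusion JSON request body from a PDB file.
# $1 = path to a PDB file, $2 = maximum number of ATOM records (default 400).
# Assumes ATOM records contain no double quotes or backslashes.
build_rfdiffusion_payload() {
    pdb_file="$1"
    max_atoms="${2:-400}"
    # Keep ATOM records only, cap the count, and join them with a literal \n
    # so the whole structure fits in a single JSON string.
    pdb_text="$(grep '^ATOM' "$pdb_file" | head -n "$max_atoms" | awk '{printf "%s\\n", $0}')"
    printf '{"input_pdb": "%s", "contigs": "%s", "hotspot_res": ["A50", "A51", "A52", "A53", "A54"], "diffusion_steps": %d}\n' \
        "$pdb_text" "A20-60/0 50-100" 15
}

# Usage (reads the body from stdin via -d @-):
#   build_rfdiffusion_payload 1R42.pdb | curl -X POST \
#     -H "Authorization: Bearer $NVCF_RUN_KEY" \
#     -H "Content-Type: application/json" -d @- \
#     https://health.api.nvidia.com/v1/biology/ipd/rfdiffusion/generate
```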
The NIM plugin consists of several key components:
- NIMExecutor - Main executor that manages NIM API endpoints and creates task handlers
- NIMTaskHandler - Handles individual task execution, including PDB processing and API communication
- NIMIntegrationTest - End-to-end integration tests that validate real API functionality
The NIMTaskHandler uses a modular approach that separates concerns:
- PDB Data Processing - Downloads and processes PDB files from RCSB
- API Communication - Makes HTTP requests to NVIDIA NIM endpoints
- Result Handling - Processes API responses and saves results
For programmatic use, you can:
// Set PDB data directly (recommended for testing)
handler.setPdbData(pdbData)
handler.submit()
// Or use legacy method (downloads PDB automatically)
handler.submit() // Downloads PDB from RCSB if no data is set
The test suite follows a 3-step integration testing pattern:
- Download PDB file - Using test utility methods
- Pass data to TaskHandler - Via the setPdbData() method
- Verify API completion - Check response and result files
This separation allows for:
- Unit testing - Mock components independently
- Integration testing - Test with real API endpoints
- Isolated testing - Test individual components without external dependencies
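The benefit of this separation can be illustrated outside the plugin. The sketch below is a shell analogue of the three concerns, not the plugin's own code; every name in it (process_pdb, call_api, save_result, run_design, API_CALLER, NIM_ENDPOINT) is hypothetical.

```shell
# Shell analogue of the three concerns. All names here are hypothetical
# and are not part of the nf-nim plugin.

# 1. PDB data processing: turn ATOM records into a JSON request body.
process_pdb() {
    pdb_text="$(grep '^ATOM' "$1" | awk '{printf "%s\\n", $0}')"
    printf '{"input_pdb": "%s", "contigs": "A20-60/0 50-100", "hotspot_res": ["A50", "A51", "A52", "A53", "A54"], "diffusion_steps": 15}\n' "$pdb_text"
}

# 2. API communication: POST the body read from stdin.
call_api() {
    curl -s -X POST \
        -H "Authorization: Bearer $NVCF_RUN_KEY" \
        -H "Content-Type: application/json" \
        -d @- \
        "${NIM_ENDPOINT:-https://health.api.nvidia.com/v1/biology/ipd/rfdiffusion/generate}"
}

# 3. Result handling: save whatever the API returned.
save_result() {
    cat > "$1"
}

# API_CALLER lets a test swap in a stub instead of the real HTTP call.
run_design() {
    process_pdb "$1" | "${API_CALLER:-call_api}" | save_result "$2"
}
```

Because the API step is just a named function, a test can define a stub that discards its input and prints a canned response, then run `API_CALLER=my_stub run_design input.pdb result.json` without network access or an API key.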
Build the plugin:
make assemble
The test suite includes multiple levels of testing:
# Run all tests
make test
# Run specific test classes
./gradlew test --tests "*NIMExecutorTest*"
./gradlew test --tests "*NIMTaskHandlerTest*"
./gradlew test --tests "*NIMIntegrationTest*"
- Unit tests - No external dependencies required
- Integration tests - Require the NVCF_RUN_KEY environment variable for real API testing
- NIMExecutorTest - Tests executor initialization and configuration
- NIMTaskHandlerTest - Tests task handler lifecycle and error handling
- NIMIntegrationTest - End-to-end tests with real NVIDIA API calls
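Since only the integration tests need a key, a wrapper can skip them when NVCF_RUN_KEY is absent. This is a sketch under the assumption that NIMExecutorTest and NIMTaskHandlerTest are the key-free unit tests; the function names are hypothetical.

```shell
# Hypothetical wrapper: pick Gradle test filters based on NVCF_RUN_KEY.
# Assumes the executor and task handler tests need no API key.
select_test_filters() {
    echo "*NIMExecutorTest*"
    echo "*NIMTaskHandlerTest*"
    if [ -n "${NVCF_RUN_KEY:-}" ]; then
        echo "*NIMIntegrationTest*"
    fi
}

# Run each selected filter in turn; stdin is redirected so Gradle
# cannot consume the list of remaining filters.
run_selected_tests() {
    select_test_filters | while IFS= read -r filter; do
        ./gradlew test --tests "$filter" </dev/null
    done
}
```

Calling `run_selected_tests` from the project root then runs two or three test classes depending on whether the key is exported.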
Install the plugin locally:
make install
When contributing to the plugin:
- Separate concerns - Keep PDB processing, API calls, and result handling modular
- Test thoroughly - Add unit tests for new functionality
- Document changes - Update README and inline documentation
- Follow patterns - Use the established 3-step testing pattern for integration tests
This project is licensed under the Apache License 2.0 - see the COPYING file for details.
Contributions are welcome! Please see the development guidelines in the source code for more information.