Skip to content

oxpig/AntiDIF

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AntiDIF

Code for AntiDIF Antibody-specific discrete Diffusion for Inverse Folding

From our paper Accurate and Diverse Antibody Specific Inverse Folding with Discrete Diffusion.

Paper link: https://www.biorxiv.org/content/10.1101/2025.07.12.664553v1

AntiDIF is built on top of RL-DIF https://github.com/flagshippioneering/pi-rldif

See tutorial.ipynb for a minimal example walkthrough

------- More detailed install tutorial coming soon -------

Quick start

Install

Clone the repo and create an environment (See requirements in RL-DIF, https://github.com/flagshippioneering/pi-rldif)

Inference on example data

In the pi-rldif-ft directory simply run:

python -m run.run

This will run inference for the example antibody 7yxu.pdb

Troubleshooting

Absolute paths to m_name and example.csv in config.yaml may need to be set. As well as setting the absolute path to 7yxu.pdb in example.csv.

Inference with custom inputs and config

config.yaml is where we can control what data to run inference on and more! For a full breakdown of the parameters in the config see the RL-DIF readme. Here we focus on the parameters needed for inference with custom inputs. Note, you should only have to change the first two parameters listed below. See tutorial.ipynb for more of a walkthrough.

  • custom_pdb_input (str): Path to CSV file that points to PDB files to run inference for. It is recommended to give absolute paths.

  • m_name (str): Path to AntiDIF model checkpoint, it is recommend to change this to the absolute path the ckpt is in on your machine.

  • load_pkl (str) Option to load data from pkl file (default null and data is loaded from CSV) Path to pkl file of RLDIFDataset, or null for CSV loading.

  • protein_mpnn (bool):
    Ensure set to False to run AntiDIF

  • rldif (bool):
    Ensure set to True, this loads the rldif architecture that the weights for AntiDIF are then loaded into

  • esmif (bool):
    Ensure set to False to run AntiDIF

  • dif_large (bool):
    Ensure set to False to run AntiDIF

  • inference (bool) Ensure this is set to True for inference

Note setting dif_large=True or not providing a m_name will load the base RLDIF weights.

Creating a csv for running inference with custom inputs

To run inference on custom inputs, a CSV file that gives the PDB paths is required. An example of what this CSV file looks like is given by data/raw_data/sab_test_hpc.csv. And data/raw_data/example.csv. Note that the PDB files the CSV points to only have the heavy and light chain of the antibodies thus, the 2nd row (that tells the model what chains to process) for the PDB IDs in the CSV is ‘all‘, indicating that the model will consider all chains in the PDB file. It is recommended that your PDB files be formatted in the same way. As an example, see data/raw_data/7yxu.pdb.

Once you have created your custom CSV set the path to it in config.yaml under custom_pdb_input. Then for inference, from the pi-rldif-ft folder run:

python -m run.run

Quick start option 2

Optionally, you can also more closely follow RL-DIF instructions but with AntiDIF weights. model.ckpt gives the model weights for AntiDIF, once this is downloaded the instructions from the RL-DIF github can be followed for inference by simply swapping the RL-DIF checkpoint for model.ckpt in run.py (in this file state_dict should load in the AntiDIF ckpt).

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published