PTM Prediction using Modified Casanovo

Adapts Casanovo for de novo sequencing with prediction of specific Post-Translational Modifications (PTMs).

Note: Identifies only known, predefined PTMs listed in config.yaml. Cannot perform open modification searches.

Key Modifications

Predicts PTMs defined in config.yaml using a fixed vocabulary (each AA+PTM is a unique token).
Includes input annotation standardization (standardize_sequence.py) for robust training.
Generates detailed mzTab output with PTM locations, scores, and total mass shifts.
Built upon the Casanovo (depthcharge) Transformer architecture.

Approach

PTMs are handled by defining all target AA+PTM combinations (e.g., C+57.02146) with their masses in the config.yaml residues section. This creates a fixed model vocabulary. The standardize_sequence.py utility maps input annotations to these exact tokens for training. The model predicts sequences using these tokens. The MztabWriter then parses these predicted tokens to generate mzTab PTM information, including calculated total mass shifts.

Limitations

No Open Modification Search: Only identifies PTMs explicitly defined in config.yaml.
Vocabulary Size: Large numbers of defined PTMs increase vocabulary size and potentially training difficulty.

Installation

Clone the repository:

git clone <your-repository-url>
cd <repository-directory>

Install dependencies: (Assuming you have a requirements.txt file)
```
pip install -r requirements.txt
```
Change the run.sh script for desired function:
```
 bash run.sh
```

Configuration

Modify the residues section to include all standard amino acids and the specific AA+PTM combinations you want the model to be able to predict. Ensure the masses are accurate.

You can copy the default configuration and modify it:

cp src/config.yaml my_custom_config.yaml
# Now edit my_custom_config.yaml

Specify your custom config file using the -c or --config option when running commands.

Usage

The main entry point is src/denovo_ptm/denovo.py.

# Show help message and commands
python -m src.denovo_ptm.denovo --help

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
.ipynb_checkpoints		.ipynb_checkpoints
src		src
.gitignore		.gitignore
README.md		README.md
calculate_total_ptm.py		calculate_total_ptm.py
checkpoint_check.py		checkpoint_check.py
eval_results.log		eval_results.log
main.py		main.py
run.sh		run.sh
visualize.py		visualize.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

PTM Prediction using Modified Casanovo

Key Modifications

Approach

Limitations

Installation

Configuration

Usage

About

Uh oh!

Releases

Packages

Uh oh!

Languages

yeti-teti/Microbe

Folders and files

Latest commit

History

Repository files navigation

PTM Prediction using Modified Casanovo

Key Modifications

Approach

Limitations

Installation

Configuration

Usage

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages