NPBDetect is a neural network model that can be trained and used to predict natural product bioactivity from biosynthetic gene clusters (BGCs). The tool requires output files from antiSMASH (version 7). NPBDetect can predict 8 bioactivities: antibacterial, antifungal, antitumor or cytotoxic, siderophore, antiprotozoal, inhibitor, antiviral, and surfactant.
NPBDetect is available via:
- Detect biosynthetic gene clusters (BGCs) with antiSMASH 7
The first step is to generate .gbk files using antiSMASH 7. NPBDetect uses these GBK files to predict bioactivities.
- Online Usage: You can use the antiSMASH web service here:
https://antismash.secondarymetabolites.org
- Local Installation:
- Download antiSMASH:
https://antismash.secondarymetabolites.org/#!/download
- Follow installation instructions:
http://docs.antismash.secondarymetabolites.org/install/
Example command to generate a .gbk file:
antismash --output_dir <OUTPUT_FOLDER> --minlength 500 --cb-general --cb-knownclusters --cb-subclusters --fullhmmer --asf --pfam2go --smcog-trees -c 32 --genefinding-tool prodigal <INPUT_FILE>
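When several genomes need to be processed, the example command can be assembled programmatically. The following is a minimal sketch, not part of NPBDetect or antiSMASH: the helper name `build_antismash_command`, the input file names, and the one-folder-per-genome layout are assumptions made for illustration. The flags are copied unchanged from the example command above.

```python
import shlex
from pathlib import Path

# Flags copied from the example command above.
ANTISMASH_FLAGS = [
    "--minlength", "500",
    "--cb-general", "--cb-knownclusters", "--cb-subclusters",
    "--fullhmmer", "--asf", "--pfam2go", "--smcog-trees",
    "--genefinding-tool", "prodigal",
]


def build_antismash_command(input_file, output_root, cpus=32):
    """Return the antiSMASH argument list for one input file (hypothetical helper)."""
    input_path = Path(input_file)
    # Assumption: one output folder per genome, named after the input file stem.
    output_dir = Path(output_root) / input_path.stem
    return [
        "antismash",
        "--output_dir", str(output_dir),
        *ANTISMASH_FLAGS,
        "-c", str(cpus),
        str(input_path),
    ]


if __name__ == "__main__":
    # Hypothetical input files; print the commands instead of running them.
    for genome in ["genomes/strain_A.fasta", "genomes/strain_B.fasta"]:
        print(shlex.join(build_antismash_command(genome, "antismash_out")))
```

Each printed line can be pasted into a shell, or the argument list can be passed to `subprocess.run(..., check=True)` once antiSMASH is installed.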
- Use NPBDetect
NPBDetect can be run in two ways:
- Online via Google Colab
- Locally via CLI (Conda environment) or Docker
- Set up a Conda environment
conda create -n npbdetect python=3.10
conda activate npbdetect
- Install required Python packages
pip install pandas
pip install scikit-learn
pip install biopython
pip install torch torchaudio torchvision torchtext torchdata
- Clone the NPBDetect repository
git clone https://github.com/cbl-nabi/NPBDetect
cd NPBDetect
- Validate installation and test sample prediction
python NPBDetect.py predict \
-v 1 \
--gbk test/BGC0000004.gbk \
--pred HC \
--out_dir outs/
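To run the same prediction over a folder of .gbk files, the test command can be wrapped in a small loop. This is a sketch under stated assumptions: `build_predict_command` and `predict_all` are hypothetical helpers, the script is assumed to be started from the cloned NPBDetect directory, and each cluster is given its own output subfolder because the naming of NPBDetect's output files is not documented here. Only the options shown in the test command are used.

```python
import subprocess
import sys
from pathlib import Path


def build_predict_command(gbk_file, out_root, pred="HC"):
    """Build the predict command for one .gbk file (hypothetical helper)."""
    gbk_path = Path(gbk_file)
    # Assumption: a separate output folder per cluster avoids overwriting results.
    out_dir = Path(out_root) / gbk_path.stem
    return [
        sys.executable, "NPBDetect.py", "predict",
        "-v", "1",
        "--gbk", str(gbk_path),
        "--pred", pred,
        "--out_dir", str(out_dir),
    ]


def predict_all(gbk_dir, out_root, pred="HC", dry_run=True):
    """Collect one command per .gbk file and run them unless dry_run is set."""
    commands = [
        build_predict_command(path, out_root, pred)
        for path in sorted(Path(gbk_dir).glob("*.gbk"))
    ]
    if not dry_run:
        for command in commands:
            # Create the per-cluster output folder before calling NPBDetect.
            Path(command[-1]).mkdir(parents=True, exist_ok=True)
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: only show what would be executed for the bundled test folder.
    for command in predict_all("test", "outs"):
        print(" ".join(command))
```

Call `predict_all("test", "outs", dry_run=False)` to execute the commands.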
NPBDetect is also available online through Google Colaboratory.
No installation required!
👉 Launch it here: NPBDetect@Google-Colab
You can also run NPBDetect using Docker. The image can be pulled with:
docker pull mantrilabnabi/npbdetect
Example command:
docker run --rm -it \
--volume <INPUT_DIR>:/data/input \
--volume <OUTPUT_DIR>:/data/output \
mantrilabnabi/npbdetect \
python NPBDetect.py predict \
-v 1 \
--gbk /data/input/BGC0000004.gbk \
--pred HC \
--out_dir /data/output/
Replace <INPUT_DIR> and <OUTPUT_DIR> with your local paths.
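The mapping between local paths and the container paths /data/input and /data/output can be made explicit with a short script. This is an illustrative sketch only: `build_docker_command` is a hypothetical helper, and resolving both paths to absolute form is a choice made here to keep the bind mounts unambiguous. The image name and options are taken from the example command above.

```python
from pathlib import Path

IMAGE = "mantrilabnabi/npbdetect"


def build_docker_command(gbk_file, output_dir, pred="HC"):
    """Build the docker run command for one local .gbk file (hypothetical helper)."""
    gbk_path = Path(gbk_file).resolve()
    out_path = Path(output_dir).resolve()
    return [
        "docker", "run", "--rm", "-it",
        # The folder holding the .gbk file is mounted as /data/input.
        "--volume", f"{gbk_path.parent}:/data/input",
        # The local output folder is mounted as /data/output.
        "--volume", f"{out_path}:/data/output",
        IMAGE,
        "python", "NPBDetect.py", "predict",
        "-v", "1",
        # Inside the container the file is addressed by its mounted path.
        "--gbk", f"/data/input/{gbk_path.name}",
        "--pred", pred,
        "--out_dir", "/data/output/",
    ]


if __name__ == "__main__":
    # Hypothetical local paths; the command is printed, not executed.
    print(" ".join(build_docker_command("clusters/BGC0000004.gbk", "results")))
```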
Note: to get predictions for all 8 bioactivity classes, change output_type (the --pred value in the commands above) from "HC" to "ORG".
The prediction output is written as a CSV file containing the probability and the binary prediction for each bioactivity class:
If the probability value ≥ 0.5 → prediction = 1
If the probability value < 0.5 → prediction = 0
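The thresholding rule can be written out as a few lines of Python. This is a sketch of the rule as stated above, not NPBDetect's own code: the function names are hypothetical, and the probabilities in the example are made-up values chosen to show the boundary case, not real model output.

```python
THRESHOLD = 0.5


def to_prediction(probability, threshold=THRESHOLD):
    """Return 1 when the probability reaches the threshold, otherwise 0."""
    return 1 if probability >= threshold else 0


def predicted_classes(probabilities, threshold=THRESHOLD):
    """List the class names whose prediction value is 1."""
    return [
        name
        for name, probability in probabilities.items()
        if to_prediction(probability, threshold) == 1
    ]


if __name__ == "__main__":
    # Illustrative values only; 0.50 sits exactly on the threshold and counts as 1.
    example = {"antibacterial": 0.91, "antifungal": 0.50, "antiviral": 0.12}
    print(predicted_classes(example))  # ['antibacterial', 'antifungal']
```

Because the classes are thresholded independently, a single cluster can receive a prediction of 1 for more than one bioactivity.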
The software is licensed under the MIT License.
If you use NPBDetect in your research, please cite:
Hemant Goyat, Dalwinder Singh, Sunaina Paliyal, Shrikant Mantri,
Predicting biological activity from biosynthetic gene clusters using neural networks, 2024.