RNAdvisor is a wrapper tool for the computation of RNA 3D structural quality assessment. It uses docker compose to run the RNAdvisor tool in a containerized environment.
from rnadvisor.rnadvisor_cli import RNAdvisorCLI
rnadvisor_cli = RNAdvisorCLI(
    pred_dir="data/example/PREDS",
    native_path="data/example/NATIVE/R1107.pdb",
    out_path="out.csv",
    scores=["rmsd", "inf", "mcq", "lddt", "tm-score", "gdt-ts", "ares", "pamnet"],
)
df_results, df_time = rnadvisor_cli.predict()
To install RNAdvisor v2 you need to have docker and docker-compose installed on your system. Then, you can install the package using pip:
pip install rnadvisor
Then you can compute the RNA 3D structural quality assessment using the command line interface (CLI) or the python API.
rnadvisor --pred_dir --scores [--native_path] [--out_path]
[--out_time_path] [--sort_by] [--params] [--tmp_dir]
[--verbose] [--z_score] [--normalise]
with:
--pred_dir Directory to .pdb files or path to a .pdb file of the predictions.
--native_path Path to a .pdb file of the native structure.
--scores List of the scores to use, separated by a comma.
If you want to use them all, use `all`. To use all the metrics, use `metrics`.
To use all the scoring functions, use `sf` (it does not include `rna-briq`, as it is very
slow to compute).
Choices: clash,pamnet,lociparse,3drnascore,tb-mcq,barnaba,cgrnasp,dfire,mcq,
lcs,cad-score,tm-score,lddt,rasp,rs-rnasp,rmsd,inf,p-value,di,gdt-ts,ares,rna-briq.
--out_path Path to a .csv file where to save the predictions.
--out_time_path Path to a .csv file where to save the time of the predictions for each score.
--sort_by Metric to sort the results by.
--verbose Level of verbosity. 0 for no output, 1 for basic output, 2 for detailed output.
--params Hyperparameters of the different methods. It can be used to set the threshold for LCS-TA
or the parameters of MCQ, e.g. `--params='{"mcq_threshold": 10, "mcq_mode": 2}'`. Values for `mcq_threshold` are 10, 15, 20 or 25 and values for
`mcq_mode` are 0 (relaxed), 1 (comparison without violations) or 2 (comparison of everything regardless of violations).
--z_score Compute the Z-score for the computed scores. It reverses all the decreasing scores.
--normalise Controls whether the .pdb files are normalised. Normalisation runs the `--rna-puzzles-ready` option from RNA-tools.
--sort_by Metric to sort the results by. Choices: RMSD,P-VALUE,INF-ALL,INF-WC,INF-NWC,INF-STACK,DI,MCQ,TM-SCORE,GDT-TS,GDT-TS@1,GDT-TS@2,GDT-TS@4,GDT-TS@8,CAD,lDDT,RASP,BARNABA,DFIRE,rsRNASP.
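To make the `--z_score` option more concrete, here is a minimal sketch of a Z-score with sign reversal. It is not RNAdvisor's implementation: it assumes that "reversing the decreasing scores" means flipping the sign for scores where lower is better (such as RMSD), so that a higher Z-score always indicates a better model, and it assumes the population standard deviation. The function name `z_scores` and the sample values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values, lower_is_better=False):
    """Standardise values; flip the sign when lower raw values mean better models.

    Assumption: population standard deviation; the tool may use another estimator.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so no model stands out.
        return [0.0 for _ in values]
    sign = -1.0 if lower_is_better else 1.0
    return [sign * (v - mu) / sigma for v in values]


# Made-up RMSD values for three predictions (lower RMSD is better).
rmsd_values = [1.0, 2.0, 3.0]
print(z_scores(rmsd_values, lower_is_better=True))
```

After the sign flip, the prediction with the lowest RMSD gets the highest Z-score, which makes Z-scores of increasing and decreasing scores directly comparable.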
This code wraps 18 existing repositories and adds a Python interface.
It takes as inputs a `.pdb` file of predicted 3D structures (or a folder of `.pdb` files) and a `.pdb` file of a native structure, and it returns a `.csv` file with the different metrics.
It uses the following repositories:
- RNA_Assessment: a Python repository that computes RMSD, P-VALUE, INF, and DI. I forked the project to make some modifications, so RNAdvisor uses the RNA_Assessment-forked implementation.
- MCQ4Structures: Java code that computes the MCQ score.
- Voronota: C++ code that computes the CAD score.
- Zhanglab: a complete website to compute multiple scores, such as GDT-TS or TM-score.
- BaRNAba: an implementation of the eRMSD and eSCORE. I created a forked version, BaRNAba-forked.
- DFIRE: an implementation of the DFIRE energy function.
- RASP: an implementation of the RASP energy function. I created a forked version, RASP-forked.
- rsRNASP: a Python implementation of the rsRNASP score. I created a forked version, rsRNASP-forked, with only the needed files.
- cgRNASP: a Python implementation of the cgRNASP score.
- OpenStructure: a C++ and Python implementation for structure analysis. It is used to compute the TM-score and lDDT metrics.
- CGRNASP: a C implementation for the computation of CG-RNASP potentials. I created a forked version, CGRNASP-forked.
- TB-MCQ: a Python implementation of the TB-MCQ score. It computes the MCQ score between torsional angles predicted by a language-based model and the angles inferred from a given structure.
- ARES: an implementation of ARES. I'm using a Docker container derived from `adamczykb/ares_qa` that I have reduced. I also added the `reduce` repository inside it to add hydrogens, which ARES needs to work.
- PAMNet: official implementation of PAMNet. I have reduced the Docker image to only keep the necessary files to run the scoring function.
- CLASH: a RESTful web service client developed in Java that enables the computation of the CLASH score.
- LociPARSE: official Python implementation of LociPARSE.
- RNA3DCNN: official Python implementation of RNA3DCNN. I have reduced it to a Docker image that only works with a GPU.
- 3dRNAScore: official C++ implementation of 3dRNAScore.
- RNA-BRiQ: official C++ implementation of RNA-BRiQ. I got code from Thomasz Zok to extract the necessary files to run the scoring function. Please note that the build will not work as is, since it requires downloading a data file from the website. We advise using the published Docker image.
Note that all these repositories implement many different functions. For the sake of this project, I only took what seemed most relevant for the scoring of 3D structures.
Each scoring function and metric is isolated in its own Docker container.
You can find each of them on Docker Hub as `sayby77/rnadvisor-<name>-slim` or `sayby77/rnadvisor-<name>`.
`<name>` can be one of the following:
Scoring Function | Metric |
---|---|
3drnascore | rmsd |
lociparse | inf |
tb-mcq | p-value |
escore | di |
pamnet | mcq |
cgrnasp | gdt-ts |
dfire | lddt |
rasp | tm-score |
rsRNASP | cad-score |
ares | clash |
rna-briq | lcs |
The `slim` version is a smaller version of the container that only contains the code necessary to run the scoring function (e.g. no bash, no other commands, etc.).
It corresponds to the original image reduced with `docker-slim`.
If you want to build the Docker images yourself, you can do so by running the following command in the root directory of the repository:
just build-<name>-full
with `<name>` being the name of the scoring function/metric you want to build.
To get the `slim` version, you can run the following command:
just build-<name>-slim
If you want to run one of the scoring functions directly, you can run the following command:
docker run -it --rm sayby77/rnadvisor-<name>-slim
It will run the evaluation on the different examples.
If you want to run the evaluation on your own data, you can mount your data in the container using the `-v` option.
For example, if you have a folder `data/tmp/input` with your predictions and native structure, and you want to save the output in `data/tmp/output/out.csv`, you can mount the `data/tmp` folder in the container using the following command:
docker run -it --rm -v ${PWD}/data/tmp/:/app/data/tmp/ sayby77/rnadvisor-<name>-slim --pred_dir data/tmp/input --native_path data/tmp/input/R1107.pdb --out_path data/tmp/output/out.csv
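When the same call has to be repeated for several scores, the command above can be assembled programmatically. The following is a minimal sketch, not part of RNAdvisor: the helper name `docker_run_command` and its parameters are hypothetical, and it only reproduces the flags, image naming, and `/app/` mount target shown in the command above.

```python
import os
import shlex


def docker_run_command(name, rel_dir, pred_dir, native_path, out_path,
                       slim=True, cwd=None):
    """Build the `docker run` argument list for one scoring function/metric.

    `rel_dir` is the folder (relative to the working directory) mounted under
    /app/ in the container, mirroring the README command.
    """
    cwd = cwd or os.getcwd()
    rel_dir = rel_dir.strip("/")
    image = f"sayby77/rnadvisor-{name}" + ("-slim" if slim else "")
    mount = f"{cwd}/{rel_dir}/:/app/{rel_dir}/"
    return [
        "docker", "run", "-it", "--rm", "-v", mount, image,
        "--pred_dir", pred_dir,
        "--native_path", native_path,
        "--out_path", out_path,
    ]


cmd = docker_run_command(
    "rmsd",
    "data/tmp",
    pred_dir="data/tmp/input",
    native_path="data/tmp/input/R1107.pdb",
    out_path="data/tmp/output/out.csv",
)
# Print the command so it can be inspected or pasted into a terminal.
print(shlex.join(cmd))
```

The sketch only prints the command. To execute it from a script, the list can be passed to `subprocess.run(cmd, check=True)`; note that the `-t` flag expects an attached terminal, so it may need to be dropped in non-interactive contexts.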
Please note that the `rnadvisor` command line handles the mounting of the data for you, so you don't need to do it manually.
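Because the `rnadvisor` command takes care of the mounting, scripting it reduces to assembling the flags documented in the CLI usage. The sketch below is an illustration under assumptions: the helper name `rnadvisor_command` is hypothetical, scores are joined with commas as the `--scores` description states, `--params` uses the `--params='{...}'` form shown in the option list, and the other flags are assumed to accept a space-separated value as in the `docker run` example.

```python
import json
import shlex


def rnadvisor_command(pred_dir, scores, native_path=None, out_path=None,
                      params=None):
    """Build the argument list for the `rnadvisor` CLI from Python values."""
    cmd = ["rnadvisor", "--pred_dir", pred_dir, "--scores", ",".join(scores)]
    if native_path:
        cmd += ["--native_path", native_path]
    if out_path:
        cmd += ["--out_path", out_path]
    if params:
        # Hyperparameters are passed as a JSON object, e.g. MCQ settings.
        cmd.append(f"--params={json.dumps(params)}")
    return cmd


cmd = rnadvisor_command(
    "data/example/PREDS",
    ["rmsd", "mcq"],
    native_path="data/example/NATIVE/R1107.pdb",
    out_path="out.csv",
    params={"mcq_threshold": 10, "mcq_mode": 2},
)
# shlex.join quotes the JSON so the printed line is safe to paste into a shell.
print(shlex.join(cmd))
```

Building a list rather than a single string avoids shell-quoting problems with the JSON passed to `--params`; the list can be handed to `subprocess.run(cmd, check=True)` as is.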
The structure of the repo is the following:
- `data`: examples of structures to be evaluated.
- `dockerfiles`: each dockerfile for each individual scoring function/metric.
- `img`: images used in the README file.
- `licenses`: licenses of the different repositories used in the project.
- `requirements`: requirements for the different docker images.
- `src/rnadvisor`: the source code of the RNAdvisor tool, as well as for the wrapper for each metric/scoring function.
- `tasks`: justfile tasks to build the docker images.
- `tests`: tests for the different scoring functions/metrics.
Clement Bernard, Guillaume Postic, Sahar Ghannay, Fariza Tahi,
RNAdvisor: a comprehensive benchmarking tool for the measure and prediction of RNA structural model quality,
Briefings in Bioinformatics, Volume 25, Issue 2, March 2024, bbae064,
https://doi.org/10.1093/bib/bbae064