
Commit 3ee97d1

Update README.md
1 parent fa8957b commit 3ee97d1


1 file changed

+31
-8
lines changed

README.md

Lines changed: 31 additions & 8 deletions
@@ -1,4 +1,4 @@
1-
# DeepMetaPSICOV 1.0
1+
# DeepMetaPSICOV 1.0.0
22
### Deep residual neural networks for protein contact prediction
33

44
Shaun M. Kandathil, Joe G. Greener and David T. Jones
@@ -15,8 +15,8 @@ Requirements:
1515

1616
- Third-party programs:
1717
- HH-suite v3.0+ and a recent UniClust30 database (for making alignments; skip if you will only use pre-made alignments)
18-
- CCMpred v0.1.0
19-
- FreeContact 1.0.21
18+
- CCMpred v0.1.0 (Need this exact version; available [here](http://bioinfadmin.cs.ucl.ac.uk/downloads/ccmpred-0.1.0/CCMpred-0.1.0.tar.gz))
19+
- FreeContact 1.0.21 (available [here](https://rostlab.org/owiki/index.php/FreeContact))
2020
- Legacy BLAST 2.2.26 (executables `blastpgp` and `makemat`) and a suitable non-redundant database, e.g. Uniref90, formatted using `formatdb` (needed to generate PSIPRED and SOLVPRED inputs)
2121

2222
All other required programs written by our group are now bundled in this repo and do not need to be installed separately.
@@ -39,26 +39,49 @@ Setup and testing:
3939
### Specify paths to external dependencies
4040
Edit `run_DMP.sh` to indicate the paths to the third-party programs listed above, as well as other variables such as the number of threads to use for various programs. User-editable variables are demarcated by comment lines. We do not recommend changing anything outside this region unless you know what you are doing.
4141

42-
Testing TODO
42+
### Testing
43+
`cd test; ./testDMP.sh`
44+
45+
The script will use the configuration you have provided in `run_DMP.sh` and run a test contact prediction.
46+
*NB:* the test script will not run PSI-BLAST or HHblits; it runs only the remaining parts of the DMP pipeline (using Option 4 below).
47+
48+
Different versions of OSs, compilers etc. can lead to differing contact scores (as well as outputs from the feature generation programs), so we only test the ranking of the top-L predicted contacts against a reference output.
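
As an illustration of what comparing "the ranking of the top-L predicted contacts" could look like, here is a minimal Python sketch. It is not the logic of `testDMP.sh`; the in-memory `(i, j, score)` layout and the function names are assumptions made for this example only.

```python
from typing import List, Tuple

# Hypothetical in-memory layout: (residue i, residue j, contact score).
Contact = Tuple[int, int, float]


def top_l_ranking(contacts: List[Contact], seq_len: int) -> List[Tuple[int, int]]:
    """Return the residue pairs of the seq_len highest-scoring contacts, best first."""
    # Sort by descending score; ties are broken by residue indices so the
    # result is deterministic.
    ranked = sorted(contacts, key=lambda c: (-c[2], c[0], c[1]))
    return [(i, j) for i, j, _ in ranked[:seq_len]]


def same_top_l_ranking(predicted: List[Contact],
                       reference: List[Contact],
                       seq_len: int) -> bool:
    """True if both lists rank the same pairs in the same order within the top L."""
    return top_l_ranking(predicted, seq_len) == top_l_ranking(reference, seq_len)
```

Comparing ranks rather than raw scores means that small platform-dependent differences in the scores do not cause a failure unless they change the order of the top-L pairs.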
4349

4450
Running:
4551
--------
4652
Run `/path/to/run_DMP.sh -h` to see the available options. DMP runs a number of programs to generate input features; their outputs are stored in a number of intermediate files. By default, DMP will attempt to reuse any files with the correct filenames (this is useful for debugging and allows you to 'continue' a failed run). You can force regeneration of intermediate files with the `--force` option.
4753

4854
At a minimum, you must provide the path to a FASTA-formatted target sequence in order to run DMP. There are a few different ways to run it:
4955

50-
### From sequence only (requires legacy BLAST and HHblits):
56+
### Option 1: From sequence only (requires legacy BLAST and HHblits):
5157
`/path/to/run_DMP.sh -i input.fasta`
5258

53-
### From sequence and pre-made alignment in PSICOV format (requires legacy BLAST only):
59+
### Option 2: From sequence and pre-made alignment in PSICOV format (requires legacy BLAST only):
5460
`/path/to/run_DMP.sh -i input.fasta -a input.aln`
5561

56-
### From sequence, and PSSM in legacy BLAST makemat format (requires HHblits only):
62+
### Option 3: From sequence, and PSSM in legacy BLAST makemat format (requires HHblits only):
5763
`/path/to/run_DMP.sh -i input.fasta -m input.mtx`
5864

59-
### From sequence, pre-made alignment, and PSSM in legacy BLAST makemat format (does not require BLAST or HHblits):
65+
### Option 4: From sequence, pre-made alignment, and PSSM in legacy BLAST makemat format (does not require BLAST or HHblits):
6066
`/path/to/run_DMP.sh -i input.fasta -a input.aln -m input.mtx`
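
The four options differ only in which optional inputs are supplied. The following Python sketch summarises that mapping; the helper names are hypothetical, and only the `-i`, `-a` and `-m` flags are taken from the commands above.

```python
from typing import List, Optional


def build_dmp_command(script: str,
                      fasta: str,
                      aln: Optional[str] = None,
                      mtx: Optional[str] = None) -> List[str]:
    """Assemble the run_DMP.sh argument list for Options 1-4."""
    cmd = [script, "-i", fasta]      # the FASTA target is always required
    if aln is not None:
        cmd += ["-a", aln]           # pre-made alignment in PSICOV format
    if mtx is not None:
        cmd += ["-m", mtx]           # PSSM in legacy BLAST makemat format
    return cmd


def required_programs(aln: Optional[str] = None,
                      mtx: Optional[str] = None) -> List[str]:
    """List the search programs DMP still needs for the given inputs."""
    needed = []
    if mtx is None:
        needed.append("legacy BLAST")  # needed to build the PSSM
    if aln is None:
        needed.append("HHblits")       # needed to build the alignment
    return needed
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to launch a run.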
6167

6268
Citing:
6369
-------
6470
If you find DeepMetaPSICOV useful, please cite our paper at bioRxiv: https://www.biorxiv.org/content/10.1101/586800v2
71+
72+
FAQs:
73+
-----
74+
### Do I have to use the exact versions of the programs that you mention?
75+
Yes. DMP was trained on data output by specific versions of the feature generation programs, and you need to use the same versions during inference.
76+
77+
### Your paper mentions a bug in one of the feature generation programs; is this release affected?
78+
The version of `alnstats` in this repository is not affected by the bug in question. Since we verified that the training of DMP did not suffer from the bug, we are releasing the bug-free version for inference. If for any reason you'd like the buggy version of `alnstats`, do get in touch.
79+
80+
### The version of CCMpred you use is ancient!
81+
We know. Keen users will also have spotted that we use a number of input features in common with MetaPSICOV. We wanted to assess whether we improve over MetaPSICOV, DeepCov etc. using exactly the same training data, where possible. We are currently working on the next version of DMP, which will use the latest version of CCMpred, among other changes.
82+
83+
### The version of PyTorch you use is ancient!
84+
Development of DMP v1 began when PyTorch 0.3.0 was current.
85+
By the end of CASP13, v1.0 was being prepared for release.
86+
The DMP models were trained on v0.3.0 and although it is possible to read in and run the trained models using PyTorch v0.4-1.0+, we find that there are occasionally significant differences in the contact scores output.
87+
To keep things as close as possible to the version we ran in CASP13, we recommend that users use PyTorch 0.3.0 or 0.3.1 for DMP v1.0. We intend to upgrade the version of PyTorch used in the next release.
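
One way to guard against running with an untested PyTorch version is a small check before inference. This is a sketch only, not part of DMP; the function name is hypothetical, and it assumes version strings of the form `0.3.0`, possibly followed by a build suffix such as `.post4` or `+cu90`.

```python
def is_recommended_torch(version: str) -> bool:
    """Return True if the version string denotes PyTorch 0.3.0 or 0.3.1."""
    # Drop any local build tag (e.g. "+cu90"), then keep major.minor.patch only,
    # which also discards suffixes such as ".post4".
    core = version.split("+")[0]
    parts = core.split(".")[:3]
    return parts in (["0", "3", "0"], ["0", "3", "1"])
```

With PyTorch installed, the check would be called as `is_recommended_torch(torch.__version__)` after `import torch`.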
