-
Notifications
You must be signed in to change notification settings - Fork 7
Description
I have recently tested drprg out after upgrading make_prg from https://github.com/leoisl/make_prg/releases/tag/v0.3.0 to the latest v0.4.0
The results were almost the same across ~8500 samples. However, there were 3 that have gone from being TPs to FNs.
It has to do with a region in the gene gid where the positions have weird overlaps that differ between make_prg versions. This is all a bit easier if I put in some examples
Here is the region from https://github.com/leoisl/make_prg/releases/tag/v0.3.0 (this leads to a correct ALT call in POS 513)
gid 505 f3d1ef78 G T . PASS VC=SNP;GRAPHTYPE=NESTED;PDP=1,0 GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF 0:12,0:7,0:12,0:7,0:25,0:15,0:0.5,1:-12.421,-127.498:115.077
gid 505 2de45418 GTCACGGGC TTGGGCGGCAGCGACGCTGC . PASS VC=COMPLEX;GRAPHTYPE=NESTED;PDP=0.722222,0.277778;VARID=gid_A138V,gid_R137W,gid_R137P,gid_A138T;PREDICT=F,F,F,F GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF .:8,3:5,2:0,0:0,0:25,23:15,13:0.666667,0.833333:-39.9668,-86.3427:46.3759
gid 513 aac746d5 C T . PASS VC=SNP;GRAPHTYPE=NESTED;PDP=0,1;VARID=gid_A138T,gid_A138V;PREDICT=.,R GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF 1:0,23:0,13:0,23:0,13:0,23:0,13:1,0:-205.786,-7.87333:197.913
However, this is the same region with the latest v0.4.0 make_prg
gid 505 8b2d50ba G T . PASS VC=SNP;GRAPHTYPE=NESTED;PDP=1,0 GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF 0:12,0:7,0:12,0:7,0:25,0:15,0:0.5,1:-12.421,-127.498:115.077
gid 505 ec5fd1a2 GTCACGGGC TTGGGCGGCAGCGACGCTGC . PASS VC=COMPLEX;GRAPHTYPE=NESTED;PDP=0.722222,0.277778;VARID=gid_A138V,gid_A138T,gid_R137W,gid_R137P;PREDICT=.,.,.,. GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF 0:8,3:5,2:0,0:0,0:25,23:15,13:0.666667,0.833333:-39.9668,-86.3427:46.3759
gid 511 bad07f5f GGC GGT . frs VC=PH_SNPs;GRAPHTYPE=NESTED;PDP=0.342105,0.657895;VARID=gid_R137W,gid_A138V,gid_R137P,gid_A138T;PREDICT=.,.,.,. GT:MEAN_FWD_COVG:MEAN_REV_COVG:MED_FWD_COVG:MED_REV_COVG:SUM_FWD_COVG:SUM_REV_COVG:GAPS:LIKELIHOOD:GT_CONF 0:8,16:5,9:0,23:0,13:25,48:15,28:0.666667,0.333333:-132.07,-69.6442:62.4261
you can see the POS 511 variant is filtered due to FRS - there is coverage on the ref and alt, where the same variant (POS 513) in the previous version has no coverage on the ref.
The second variant in those examples overlaps both the first and third variants.
This 513C>T variant is discovered by denovo in both cases - i.e. the reference PRG is effectively just the first two variants
In both cases I am using pandora v0.10.0-alpha.0
The make_prg command is:
For the v0.3.0 version (you can find all the drprg/pandora/make_prg files at /hps/nobackup/iqbal/mbhall/drprg/tmp/predict-ERR2510499
)
make_prg from_msa -t 4 -L 5 -N 5 -v -o updated -i /hps/nobackup/iqbal/mbhall/drprg/tmp/predict-ERR2510499/update_msas
and for the v0.4.0 version (all files are at /hps/nobackup/iqbal/mbhall/drprg/paper/results/amr_predictions/drprg/illumina/PRJEB25972/SAMEA1101807/ERR2510499/
)
make_prg from_msa -t 2 -L 5 -N 5 -v --force -o updated -i /hps/nobackup/iqbal/mbhall/drprg/paper/results/amr_predictions/drprg/illumina/PRJEB25972/SAMEA1101807/ERR2510499/update_msas
Let me know if you need any other info @leoisl