Pluto: Ultra-Fast Haplotype Phasing and Genotype Refinement Tool for Ultra-Low Coverage DNA Sequence Reads
We propose Pluto, an efficient statistical method that enables haplotype phasing and genotype refinement for individual genomes sequenced at ultra-low coverage, at a computational cost orders of magnitude smaller than that of alternative methods.
Our method integrates the Positional Burrows-Wheeler Transform (PBWT) and Variable Length Markov Chains (VLMC) to build a haplotype graph from reference haplotypes that accounts for genotype uncertainty. We use statistical tests, such as the Kolmogorov-Smirnov (KS) test, to build haplotype graphs for the VLMC accurately and efficiently from conditional distributions obtained rapidly from the PBWT.
Our experiments show that Pluto achieves genotype accuracy comparable to or better than that of existing tools at sequencing depths between 0.1x and 4x when using 1000 Genomes as the reference haplotypes. At the same time, Pluto is more than 10x faster than Beagle when evaluated on 4x whole-genome sequence reads.
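As a rough illustration of the PBWT ordering that the description above relies on, the following Python sketch builds the positional prefix arrays for a small biallelic reference panel. It is not taken from the Pluto source code: the function name `pbwt_prefix_arrays` and the toy panel are assumptions made for this example, and it follows the standard textbook construction of the PBWT rather than whatever Pluto does internally.

```python
from typing import List, Sequence


def pbwt_prefix_arrays(haplotypes: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return the positional prefix arrays a_0 .. a_N for biallelic haplotypes.

    a_k lists haplotype indices sorted by their reversed prefixes
    (alleles at sites k-1, k-2, ..., 0), so haplotypes that share a long
    match ending just before site k end up next to each other.
    """
    n_haps = len(haplotypes)
    n_sites = len(haplotypes[0]) if n_haps else 0

    order = list(range(n_haps))  # a_0: original input order
    arrays = [order]
    for site in range(n_sites):
        zeros, ones = [], []
        # Stable partition of the current order by the allele at this site.
        for hap in order:
            (ones if haplotypes[hap][site] else zeros).append(hap)
        order = zeros + ones
        arrays.append(order)
    return arrays


if __name__ == "__main__":
    # Hypothetical toy panel: 4 haplotypes (rows) x 4 sites (columns).
    panel = [
        [0, 1, 0, 1],
        [1, 1, 0, 0],
        [0, 1, 0, 1],
        [1, 0, 1, 1],
    ]
    for k, order in enumerate(pbwt_prefix_arrays(panel)):
        print(k, order)
    # The final line printed is: 4 [1, 0, 2, 3]
```

Because each step is a stable partition, the whole construction takes time linear in the panel size (haplotypes times sites). Neighbours in each array share the longest matches ending at that site, which is presumably what makes it cheap to read off conditional allele distributions for a given haplotype context.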
- mkdir build
- cd build
- cmake ..
- make
- make test
Pluto index
Available Options
   Shotgun Sequences : --refVCF [Empty]
      Optional Files : --includeUnphasedIDs [], --includePhasedIDs [],
                       --excludeUnphasedIDs [], --excludePhasedIDs []
       Graph Builder : --graphComplexity [1400], --PvalueMatrix [],
                       --calPvalueMatrix [], --geneticDistance [],
                       --seed [123456], --onlyHeterSite
        Output Files : --outPrefix [mach1.out]
Pluto phase --refVcf reference.panel.vcf.gz --unphasedVcf target.vcf.gz --outPrefix target.phased
Pluto index --refVcf reference.panel.vcf.gz --PvalueMatrix /Users/fanzhang/Downloads/PlutoTest/PvalueMatrix
Pluto phase --refVcf reference.panel.vcf.gz --unphasedVcf target.vcf.gz --outPrefix target.phased
- Fork it!
- Create your feature branch: git checkout -b my-new-feature
- Commit your changes: git commit -am 'Add some feature'
- Push to the branch: git push origin my-new-feature
- Submit a pull request :D
This project is licensed under the terms of the MIT license.