
BioInfoTools/biscuit

 
 


BISCUIT

BISulfite-seq CUI Toolkit (BISCUIT) is a utility suite for analyzing sodium bisulfite conversion-based DNA methylation/modification data. It performs read alignment, DNA methylation and mutation calling, and allele-specific methylation analysis on bisulfite sequencing data.

Download and Install

The latest release is here. To install BISCUIT:

$ unzip release.zip
$ cd biscuit-release
$ make

The created biscuit binary is the main entry point.

All releases are available here. Note: for versions after v0.2.0, clone the repository with git clone --recursive so that the submodules are fetched as well.

Usage

Index reference for alignment

$ biscuit index GRCh38.fa

The BISCUIT index consists of the 2-bit packed reference (.bis.pac, .bis.amb, .bis.ann) together with the FM-index and suffix array of the parent strand (.par.bwt and .par.sa) and of the daughter strand (.dau.bwt and .dau.sa).
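Before aligning, it can be useful to confirm that all seven index files are present. The following is a minimal sketch, not part of BISCUIT: check_biscuit_index is a hypothetical helper, and it assumes the index files are named by appending each suffix to the reference path (for example GRCh38.fa.bis.pac), as BWA does.

```shell
# Hypothetical helper: report BISCUIT index files missing for a reference.
# Assumption: each index file is named <reference path><suffix>.
check_biscuit_index() {
    ref=$1
    missing=0
    for suffix in .bis.pac .bis.amb .bis.ann .par.bwt .par.sa .dau.bwt .dau.sa; do
        if [ ! -e "${ref}${suffix}" ]; then
            echo "missing: ${ref}${suffix}"
            missing=1
        fi
    done
    # Exit status 0 when every file exists, 1 otherwise.
    return "$missing"
}
```

A typical use would be to rebuild the index only when something is missing: check_biscuit_index GRCh38.fa || biscuit index GRCh38.fa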

Read alignment

The following snippet shows how BISCUIT can be used together with samtools to produce a sorted, indexed BAM file of alignments.

$ biscuit align -t 10 GRCh38.fa fastq1.fq.gz fastq2.fq.gz | samtools sort -T . -O bam -o output.bam
$ samtools index output.bam
$ samtools flagstat output.bam >output.bam.flagstat

Mark duplicate reads

This step is optional. BISCUIT's duplicate marking is bisulfite-strand aware.

$ biscuit markdup input.bam output.bam

Pileup

Like samtools, BISCUIT provides a pileup step; it extracts DNA methylation as well as genetic information. The following shows how to produce a tabix-indexed VCF file.

$ biscuit pileup -r GRCh38.fa -i input.bam -o output.vcf -q 20
$ bgzip output.vcf
$ tabix -p vcf output.vcf.gz

Make bed files

The following extracts CpG beta values from the VCF file.

$ biscuit vcf2bed -k 10 -t cg input.vcf.gz

-t can also take the following values:

  • snp - SNP information
  • c - all cytosines
  • hcg - HCG for NOMe-seq
  • gch - GCH for NOMe-seq
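To extract several of the -t types listed above from the same VCF, the command can be wrapped in a loop. This is a rough sketch under stated assumptions: extract_types is a hypothetical helper, vcf2bed is assumed to write its result to standard output, and the output file names are illustrative only.

```shell
# Hypothetical helper: write one BED file per requested vcf2bed type.
# Assumptions: vcf2bed prints to standard output; the output naming
# scheme (<input stem>.<type>.bed) is illustrative, not a BISCUIT convention.
# Set BISCUIT="echo biscuit" to record the commands instead of running them.
BISCUIT=${BISCUIT:-biscuit}

extract_types() {
    vcf=$1
    shift
    for type in "$@"; do
        $BISCUIT vcf2bed -k 10 -t "$type" "$vcf" > "${vcf%.vcf.gz}.${type}.bed"
    done
}
```

For example, extract_types input.vcf.gz cg snp would write input.cg.bed and input.snp.bed, while extract_types input.vcf.gz hcg gch would cover the NOMe-seq contexts.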

EPI-reads and allele-specific methylation

The following illustrates how to produce epireads, which carry epi-haplotype information.

$ biscuit epiread -r GRCh38.fa -i input.bam -B snp.bed

To test all SNP-CpG pairs:

$ biscuit epiread -r GRCh38.fa -P -i input.bam -B snp.bed

Details can be found here.

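To call allele-specific methylation, sort the epireads by position and pass the sorted file to biscuit asm:

$ sort -k1,1 -k2,2n -k3,3n in.epiread >out.epiread
$ biscuit asm out.epiread >out.asm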
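Since biscuit asm is run on the sorted file, it may help to confirm the sort order before calling it. The sketch below uses sort -c with the same keys as the sort command above; is_epiread_sorted is a hypothetical helper name, and the check assumes it runs under the same locale as the original sort.

```shell
# Hypothetical helper: exit status 0 if the file is already ordered by
# column 1 (text), then columns 2 and 3 (numeric), the keys used for sorting.
is_epiread_sorted() {
    sort -c -k1,1 -k2,2n -k3,3n "$1" 2>/dev/null
}
```

It could be used as a guard, for example: is_epiread_sorted out.epiread && biscuit asm out.epiread >out.asm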

Acknowledgements

  • lib/aln was adapted from Heng Li's BWA-mem code.
  • lib/htslib was subtree-ed from the htslib library.
  • lib/klib was subtree-ed from Heng Li's klib.
