Data Format
GenomeFlow generates a text file in the medium file format.
This is a whitespace-separated file in which each line contains:
<readname> <str1> <chr1> <pos1> <frag1> <str2> <chr2> <pos2> <frag2> <mapq1> <mapq2>
- str = strand (0 for forward, anything else for reverse)
- chr = chromosome (must be a chromosome in the genome)
- pos = position
- frag = restriction site fragment
- mapq = mapping quality score
If you are not using the restriction site file option, frag is ignored, but please see the above note on dummy values. If you are not using the mapping quality filter, mapq is ignored. readname and strand are also not currently stored within .hic files.
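The line layout described above can be read with a few lines of Python. This is a minimal sketch, not part of GenomeFlow: the names `MediumRecord`, `parse_medium_line`, and `is_reverse` are hypothetical, and it assumes that pos, frag, and mapq are integers. Strand is kept as raw text, since the format only says that 0 means forward and anything else means reverse.

```python
from typing import NamedTuple, Optional


class MediumRecord(NamedTuple):
    """One line of a medium-format file (hypothetical helper type)."""
    readname: str
    str1: str   # strand of read end 1, kept as raw text
    chr1: str
    pos1: int
    frag1: int
    str2: str   # strand of read end 2, kept as raw text
    chr2: str
    pos2: int
    frag2: int
    mapq1: int
    mapq2: int


def parse_medium_line(line: str) -> Optional[MediumRecord]:
    """Parse one whitespace-separated line; return None for blank lines."""
    fields = line.split()
    if not fields:
        return None
    if len(fields) != 11:
        raise ValueError(f"expected 11 fields, got {len(fields)}: {line!r}")
    (readname, str1, chr1, pos1, frag1,
     str2, chr2, pos2, frag2, mapq1, mapq2) = fields
    # Assumption: pos, frag, and mapq are integer-valued.
    return MediumRecord(readname, str1, chr1, int(pos1), int(frag1),
                        str2, chr2, int(pos2), int(frag2),
                        int(mapq1), int(mapq2))


def is_reverse(strand: str) -> bool:
    """Strand is forward when it is 0; anything else is reverse."""
    return strand != "0"
```

For example, `parse_medium_line("read1 0 chr1 1000 0 16 chr2 2000 1 30 42")` yields a record whose first end is on the forward strand of chr1 and whose second end is on the reverse strand of chr2.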
- Bowtie2: http://sysbio.rnet.missouri.edu/bdm_download/GenomeFlow/GM06990/GenomeFlow_formatted.bowtie2.input
- Bwa: http://sysbio.rnet.missouri.edu/bdm_download/GenomeFlow/GM06990/GenomeFlow_formatted.bwa.input
More details about other file formats can be found here
The .hic file is a binary file containing compressed contact matrices at many resolutions, facilitating visualization and analysis at multiple scales. The .hic file format is described extensively in Durand and Shamim et al., 2016.
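To make "contact matrices at many resolutions" concrete, the sketch below bins a handful of position pairs at two bin sizes. It only illustrates the idea and does not reproduce the binary .hic layout or its compression; the function name `bin_contacts` and the sample positions are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def bin_contacts(pairs: Iterable[Tuple[int, int]],
                 resolution: int) -> Counter:
    """Count contacts per (bin1, bin2) cell at the given bin size in bp."""
    counts: Counter = Counter()
    for pos1, pos2 in pairs:
        b1, b2 = pos1 // resolution, pos2 // resolution
        # Store only the upper triangle, since the matrix is symmetric.
        if b1 > b2:
            b1, b2 = b2, b1
        counts[(b1, b2)] += 1
    return counts


# Hypothetical intra-chromosomal contacts (pos1, pos2).
pairs = [(1000, 2000), (1500, 260000), (270000, 1200)]

# The same contacts viewed at a fine and a coarse resolution.
fine = bin_contacts(pairs, 1_000)
coarse = bin_contacts(pairs, 250_000)
```

At the coarse resolution, contacts that fall into distinct cells of the fine matrix collapse into the same cell, which is why keeping several resolutions supports analysis at multiple scales.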
To create a .hic file, use the GenomeFlow function "Convert mapped Hi-C reads to hic format file".
- Create reference genome index
- Mapping raw FASTQ files
- Filter a BAM alignment file
- Convert a BAM file to Medium file format
- HiC-Express