Commit b6f3e09

Merge pull request #78 from bnovak32/issue-76-metagenomics-zipfiles
Metagenomics pipeline update: zip sample bins and MAGs
2 parents 769ad5f + ad46e11 commit b6f3e09

1 file changed, +15 -3 lines changed

Metagenomics/Illumina/Pipeline_GL-DPPD-7107_Versions/GL-DPPD-7107.md

Lines changed: 15 additions & 3 deletions
@@ -772,6 +772,10 @@ bit-GL-combine-KO-and-tax-tables *-gene-coverage-annotation-and-tax.tsv -o Combi
 jgi_summarize_bam_contig_depths --outputDepth sample-1-metabat-assembly-depth.tsv --percentIdentity 97 --minContigLength 1000 --minContigDepth 1.0 --referenceFasta sample-1-assembly.fasta sample-1.bam
 
 metabat2 --inFile sample-1-assembly.fasta --outFile sample-1 --abdFile sample-1-metabat-assembly-depth.tsv -t 4
+
+mkdir sample-1-bins
+mv sample-1*bin*.fasta sample-1-bins
+zip -r sample-1-bins.zip sample-1-bins
 ```
 
 **Parameter Definitions:**
@@ -796,7 +800,8 @@ metabat2 --inFile sample-1-assembly.fasta --outFile sample-1 --abdFile sample-1
 **Output data:**
 
 * **sample-1-metabat-assembly-depth.tsv** (tab-delimited summary of coverages)
-* **sample-1-bin\*.fasta** (fasta files of recovered bins)
+* sample-1-bins/sample-1-bin\*.fasta (fasta files of recovered bins)
+* **sample-1-bins.zip** (zip file containing fasta files of recovered bins)
 
 #### 14b. Bin quality assessment
 Utilizes the default `checkm` database available [here](https://data.ace.uq.edu.au/public/CheckM_databases/checkm_data_2015_01_16.tar.gz), `checkm_data_2015_01_16.tar.gz`.
@@ -839,6 +844,13 @@ do
 MAG_ID=$(echo $ID | sed 's/bin./MAG-/')
 cp ${ID}.fasta MAGs/${MAG_ID}.fasta
 done
+
+for SAMPLE in $(cat MAG-bin-IDs.tmp | sed 's/-bin.*//' | sort -u);
+do
+mkdir ${SAMPLE}-MAGs
+mv ${SAMPLE}-*MAG*.fasta ${SAMPLE}-MAGs
+zip -r ${SAMPLE}-MAGs.zip ${SAMPLE}-MAGs
+done
 ```
 
 **Input data:**
@@ -848,8 +860,8 @@ done
 **Output data:**
 
 * checkm-MAGs-overview.tsv (tab-delimited file with quality estimates per MAG)
-* **MAGs/\*.fasta** (directory holding high-quality MAGs)
-
+* MAGs/\*.fasta (directory holding high-quality MAGs)
+* **\*-MAGs.zip** (zip files containing directories of high-quality MAGs)
 
 
 #### 14d. MAG taxonomic classification
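
The added loop takes each sample name from the bin IDs listed in `MAG-bin-IDs.tmp`, while the existing loop renames bins to MAGs with a second `sed` expression. The following is a minimal sketch of just those two substitutions, not part of the commit. The three bin IDs are hypothetical and stand in for the contents of `MAG-bin-IDs.tmp`, and the `IDS`, `SAMPLES`, and `MAG_IDS` variables are introduced only for illustration.

```shell
# Hypothetical bin IDs (assumption: MAG-bin-IDs.tmp holds one ID per line in this form)
IDS=$(printf '%s\n' sample-1-bin.1 sample-1-bin.3 sample-2-bin.2)

# Sample names, using the same sed expression as the new loop:
# everything from "-bin" onward is dropped, then duplicates are collapsed
SAMPLES=$(printf '%s\n' "$IDS" | sed 's/-bin.*//' | sort -u)

# MAG names, using the same sed expression as the existing loop:
# "bin" plus the following character is replaced with "MAG-"
MAG_IDS=$(printf '%s\n' "$IDS" | sed 's/bin./MAG-/')

echo "$SAMPLES"
# sample-1
# sample-2

echo "$MAG_IDS"
# sample-1-MAG-1
# sample-1-MAG-3
# sample-2-MAG-2
```

With names of this shape, the glob `${SAMPLE}-*MAG*.fasta` in the new loop would match `sample-1-MAG-1.fasta` and `sample-1-MAG-3.fasta` for `SAMPLE=sample-1`, so each sample's MAGs end up in their own directory and zip file.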
