2 files changed, +30 -0 lines changed
@@ -19,6 +19,7 @@ Reconstruct bins with single or co-assembly binning using one command.
* `-i/--input-fasta`: Path to the input contig fasta file (`gzip` and `bzip2` compression are accepted).
* `-b/--input-bam`: Path to the input BAM (`.bam` extension) or CRAM (`.cram`) files. You can pass multiple BAM files, one per sample.
* `-o/--output`: Output directory (will be created if non-existent).
+ * `-a/--abundance`: Path to the abundance file from strobealign-aemb. This can only be used when the number of samples used in binning is at least 5.

#### Recommended arguments

@@ -126,6 +127,7 @@ These are the same as for `single_easy_bin`.
* `--ml-threshold`
* `--taxonomy-annotation-table`
* `--tmpdir`
+ * `-a/--abundance`

These are the same as for `single_easy_bin`.

@@ -138,6 +140,7 @@ The subcommand `generate_sequence_features_single` requires the contig file and
* `-i/--input-fasta`
* `-b/--input-bam`
* `-o/--output`
+ * `-a/--abundance`

These are the same as for `single_easy_bin`.

@@ -161,6 +164,7 @@ The subcommand `generate_sequence_features_multi` requires the combined contig f
* `-i/--input-fasta`
* `-o/--output`
* `-b/--input-bam`
+ * `-a/--abundance`

These are the same as for `multi_easy_bin`.

@@ -419,3 +419,29 @@ SemiBin2 generate_cannot_links -i S5.fa -o S5_output
See the comment above about how you can bypass most of the computation if you have run `mmseqs2` to annotate your contigs against GTDB already.

+
+ ## Running SemiBin with strobealign-aemb
+
+ Strobealign-aemb is a fast abundance estimation method for metagenomic binning.
+ Because strobealign-aemb cannot provide mapping information for every position of a contig, SemiBin2 cannot be run with strobealign-aemb in binning modes that use fewer than 5 samples and need to split the contigs to generate the must-link constraints.
+
+
+ 1. Split the fasta files
+ ```bash
+ python script/generate_split.py -c contig.fa -o output
+ ```
+ 2. Map reads using [strobealign-aemb](https://github.com/ksahlin/strobealign) to generate the abundance information
+ ```bash
+ strobealign --aemb output/split.fa read1_1.fq read1_2.fq -R 6 > sample1.txt
+ strobealign --aemb output/split.fa read2_1.fq read2_2.fq -R 6 > sample2.txt
+ strobealign --aemb output/split.fa read3_1.fq read3_2.fq -R 6 > sample3.txt
+ strobealign --aemb output/split.fa read4_1.fq read4_2.fq -R 6 > sample4.txt
+ strobealign --aemb output/split.fa read5_1.fq read5_2.fq -R 6 > sample5.txt
+ ```
+ 3. Run SemiBin2 (like running SemiBin with BAM files)
+ ```bash
+ SemiBin2 generate_sequence_features_single -i contig.fa -a *.txt -o output
+ SemiBin2 generate_sequence_features_multi -i contig.fa -a *.txt -s : -o output
+ SemiBin2 single_easy_bin -i contig.fa -a *.txt -o output
+ SemiBin2 multi_easy_bin -i contig.fa -a *.txt -s : -o output
+ ```
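
Reviewer note: the per-sample mapping commands in step 2 and the "at least 5 samples" rule for `-a/--abundance` could be sketched as two small shell helpers. This is only an illustration and not part of SemiBin2 or strobealign: the function names `print_aemb_commands` and `check_min_samples` are hypothetical, and the sketch assumes reads follow the `read<N>_1.fq` / `read<N>_2.fq` naming used in step 2. The first helper only prints the commands (a dry run), so nothing is executed until you choose to run them.

```bash
# Hypothetical helper: print one `strobealign --aemb` command per sample.
# Arguments: <split fasta> <number of samples>
# Assumes reads are named read<N>_1.fq and read<N>_2.fq, as in step 2.
print_aemb_commands() {
    aemb_fasta=$1
    aemb_total=$2
    aemb_i=1
    while [ "$aemb_i" -le "$aemb_total" ]; do
        echo "strobealign --aemb $aemb_fasta read${aemb_i}_1.fq read${aemb_i}_2.fq -R 6 > sample${aemb_i}.txt"
        aemb_i=$((aemb_i + 1))
    done
}

# Hypothetical helper: succeed only when at least 5 abundance files are
# passed, mirroring the documented restriction on -a/--abundance.
check_min_samples() {
    if [ "$#" -lt 5 ]; then
        echo "need at least 5 abundance files, got $#" >&2
        return 1
    fi
    return 0
}
```

For example, `print_aemb_commands output/split.fa 5` prints the five commands from step 2 for review, and `check_min_samples sample*.txt` can guard a script before it calls SemiBin2 with `-a`.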