Commit 360dad1

Only keep GFA file and rename it contigs.gfa
1 parent 1f863e3 commit 360dad1

2 files changed: 17 additions and 11 deletions

README.md

Lines changed: 11 additions & 10 deletions
@@ -71,12 +71,11 @@ You will need to install all the dependencies manually:
 * FLASH
 * SAMtools >= 1.3
 * BWA MEM
-* PILON
-* KMC
+* KMC >= 2
 * seqtk
 * pigz
-* Java
-* Trimmomatic
+* Pilon (Java)
+* Trimmomatic (Java)
 
 ## Output files
 
@@ -89,12 +88,17 @@ The FASTA description of each sequence in `contigs.fa` have space-separated
 `name=value` pairs with the length in bases (`len`), the average coverage
 (`cov`), the number of post-assembly SNP/indel corrections made (`corr`),
 and the original contig name from Spades (`spades`). Two examples are:
-
 ```
 >contig00001 len=263154 cov=8.9 corr=1 spades=NODE_1_length_263154_cov_8.86703_pilon
 >contig00041 len=339 cov=8.8 corr=0 spades=NODE_41_length_339_cov_8.77027_pilon
 ```
 
+The (uncorrected) assembly graph file for viewing in
+[Bandage](https://rrwick.github.io/Bandage/) is available too:
+```
+contigs.gfa
+```
+
 There is a log file for each of the tools used to generate the assembly:
 ```
 00-shovill.log
@@ -112,8 +116,6 @@ As Spades is the most important tool used, some useful output files are
 kept.<br>
 &#9888; Do not confuse the final `contigs.fa` with these files!
 ```
-assembly_graph.fastg
-assembly_graph.gfa
 before_rr.fasta
 contigs.fasta
 scaffolds.fasta
@@ -136,7 +138,7 @@ pilon.changes
 --force Force overwite of existing output folder (default: OFF)
 --R1 XXX Read 1 FASTQ (default: '')
 --R2 XXX Read 2 FASTQ (default: '')
---depth N Sub-sample --R1/--R2 to this depth. Disable with --depth 0 (default: 50)
+--depth N Sub-sample --R1/--R2 to this depth. Disable with --depth 0 (default: 100)
 --gsize XXX Estimated genome size <blank=AUTODETECT> (default: '')
 --kmers XXX K-mers to use <blank=AUTO> (default: '')
 --opts XXX Extra SPAdes options eg. --plasmid --sc ... (default: '')
@@ -166,5 +168,4 @@ Not published yet.
 ## Authors
 
 * **Torsten Seemann**
-* Jason Kwong
-* Anders Goncalves da Silva
+* Jason Kwong, Simon Gladman, Anders Goncalves da Silva
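
Reviewer note: the README hunk above documents the `contigs.fa` description line as space-separated `name=value` pairs. A minimal Perl sketch of reading those tags back is shown below. The helper name `parse_contig_header` is hypothetical and not part of shovill; it assumes that neither the contig ID nor the values contain whitespace, as in the two README examples.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of shovill): split a FASTA description line
#   >ID name=value name=value ...
# into the contig ID and a hash reference of its tags.
sub parse_contig_header {
    my ($line) = @_;
    chomp $line;
    $line =~ s/^>//;                      # drop the FASTA record marker
    my ($id, @pairs) = split ' ', $line;  # whitespace-separated fields
    my %tag;
    for my $pair (@pairs) {
        # Split on the first '=' only, so a value may itself contain '='.
        my ($key, $value) = split /=/, $pair, 2;
        $tag{$key} = $value if defined $value;
    }
    return ($id, \%tag);
}

# Usage with the first example line from the README:
my ($id, $tag) = parse_contig_header(
    '>contig00001 len=263154 cov=8.9 corr=1 spades=NODE_1_length_263154_cov_8.86703_pilon'
);
printf "%s: %d bp at %.1fx, %d correction(s)\n",
    $id, $tag->{len}, $tag->{cov}, $tag->{corr};
```

Fields without an `=` are skipped rather than treated as errors, which keeps the sketch tolerant of extra words in the description.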

bin/shovill

Lines changed: 6 additions & 1 deletion
@@ -229,13 +229,17 @@ for my $id (sort { $len{$b} <=> $len{$a} } keys %{$seq}) {
 }
 write_fasta("contigs.fa", $seq);
 
+move("assembly_graph.gfa", "contigs.gfa");
+
 msg("Writing $EXE log file.");
 move("spades.log", "60-spades.log");
 
 # Cleanup time!
 unless ($keepfiles) {
   # corrected, overlapped and symlinked original reads
   unlink glob("*q.gz");
+  # keep the .gfa and delete the .fastg
+  unlink "assembly_graph.fastg";
   # BWA indices
   unlink glob("$asm.fasta.*");
   unlink $BAM, "$BAM.bai";
@@ -259,7 +263,8 @@ msg("Walltime used: $pretty");
 #msg("Type '$EXE --citation' for more details.");
 
 msg("Results in: $outdir");
-msg("Final assembly in: $outdir/contigs.fa");
+msg("Final assembly graph: $outdir/contigs.gfa");
+msg("Final assembly contigs: $outdir/contigs.fa");
 msg("It contains $ncontigs (min=$minlen) contigs totalling $nbases bp.");
 
 # Inspiration
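
Reviewer note: the new `move("assembly_graph.gfa", "contigs.gfa")` call ignores its return value, so a missing graph file would go unnoticed while the final message still points at `contigs.gfa`. A minimal sketch of a checked variant follows. The helper name `keep_gfa_only` and its `$dir`/`$keepfiles` parameters are hypothetical; only `File::Copy::move`, `unlink`, and the `-e` file test are real Perl. Whether a missing GFA should be a warning or a fatal error is a design choice this sketch merely assumes.

```perl
use strict;
use warnings;
use File::Copy qw(move);

# Hypothetical helper (not in bin/shovill): rename the SPAdes graph to
# contigs.gfa and, unless $keepfiles is set, delete the FASTG copy.
# Returns 1 if contigs.gfa was written, 0 if no GFA was present.
sub keep_gfa_only {
    my ($dir, $keepfiles) = @_;
    my $gfa   = "$dir/assembly_graph.gfa";
    my $fastg = "$dir/assembly_graph.fastg";
    my $kept  = 0;

    if (-e $gfa) {
        # move() returns false on failure and sets $!
        move($gfa, "$dir/contigs.gfa")
            or die "Could not rename $gfa: $!\n";
        $kept = 1;
    }
    else {
        warn "No $gfa found, so contigs.gfa was not written\n";
    }

    # Mirror the commit: the FASTG is only removed during cleanup.
    unlink $fastg if !$keepfiles && -e $fastg;
    return $kept;
}
```

With a return value available, the caller could print the "Final assembly graph" message only when the rename actually succeeded.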
