 [![Build Status](https://travis-ci.org/biod/sambamba.svg?branch=master)](https://travis-ci.org/biod/sambamba) [![AnacondaBadge](https://anaconda.org/bioconda/sambamba/badges/installer/conda.svg)](https://anaconda.org/bioconda/sambamba) [![DL](https://anaconda.org/bioconda/sambamba/badges/downloads.svg)](https://anaconda.org/bioconda/sambamba) [![BrewBadge](https://img.shields.io/badge/%F0%9F%8D%BAbrew-sambamba-brightgreen.svg)](https://github.com/brewsci/homebrew-bio)
 [![GuixBadge](https://img.shields.io/badge/gnuguix-sambamba-brightgreen.svg)](https://www.gnu.org/software/guix/packages/S/)
 [![DebianBadge](https://badges.debian.net/badges/debian/testing/sambamba/version.svg)](https://packages.debian.org/testing/sambamba)

-# sambamba
+# SAMBAMBA

 Table of Contents
 =================

-* [sambamba](#sambamba)
-* [Table of Contents](#table-of-contents)
 * [Introduction](#introduction)
 * [Binary installation](#binary-installation)
 * [Install stable release](#install-stable-release)
@@ -40,41 +38,45 @@ Table of Contents

 Sambamba is a high performance highly parallel robust and fast tool
 (and library), written in the D programming language, for working with
-SAM and BAM files. Because of its efficiency is an important work
-horse running in many sequencing centres around the world
-today.
+SAM and BAM files. Because of its efficiency, Sambamba is an important
+workhorse running in many sequencing centres around the world today.
+As of November 2019, Sambamba has been cited over
+[277 times](http://scholar.google.nl/citations?hl=en&user=5ijHQRIAAAAJ)
+and has been installed from Conda over
+[120K times](https://anaconda.org/bioconda/sambamba).

 Current functionality is an important subset of samtools
 functionality, including view, index, sort, markdup, and depth. Most
 tools support piping: just specify `/dev/stdin` or `/dev/stdout` as
 filenames. When we started writing sambamba (in 2012) the main
 advantage over `samtools` was parallelized BAM reading and writing.
-In March 2017 `samtools` 1.4 was released, reaching parity on this. A
+In March 2017 `samtools` 1.4 was released, reaching parity at least on
+architecture. A
 [recent performance comparison](https://github.com/guigolab/sambamBench-nf)
-shows that sambamba holds its ground and can do better in different
-configurations. Here are some comparison
+shows that sambamba still holds its ground and can even do better.
+Here are some comparison
 [metrics](https://public-docs.crg.es/rguigo/Data/epalumbo/sambamba_ws_report.html). For
-example for flagstat sambamba is 1.4x faster than samtools. For index
+example, for flagstat sambamba is 1.4x faster than samtools. For index
 they are similar. For Markdup almost 6x faster and for view 4x
-faster. For sort sambamba has been beaten, though sambamba is up to 2x
-faster than samtools on large RAM machines (120GB+).
+faster. For sort sambamba has been beaten, though sambamba is notably
+up to 2x faster than samtools on large RAM machines (120GB+).

 In addition sambamba has a few interesting features to offer, in particular

-- faster large machine `sort`, see [performance](./test/benchmark/stats.org)
+- fast large machine `sort`, see [performance](./test/benchmark/stats.org)
 - automatic index creation when writing any coordinate-sorted file
 - `view -L <bed file>` utilizes BAM index to skip unrelated chunks
 - `depth` allows to measure base, sliding window, or region coverages
 - [Chanjo](https://www.chanjo.co/) builds upon this and gets you to exon/gene levels of abstraction
 - `markdup`, a fast implementation of Picard algorithm
 - `slice` quickly extracts a region into a new file, tweaking only first/last chunks
-- and more
+- and more (you'll have to try)

 Even though Sambamba started out as a samtools clone we are now in the
 process of adding new functionality - also in the
 [BioD project](https://github.com/biod/BioD). The D language is
-extremely suitable for high performance computing. At this point we
-think that the BAM format is here to stay for processing sequencing
+extremely suitable for high performance computing (HPC). At this point
+we think that the BAM format is here to stay for processing sequencing
 data and we aim to make it easy to parse and process BAM files.

 Sambamba is free and open source software, licensed under GPLv2+.
@@ -289,7 +291,8 @@ command). A full stacktrace for all threads:
 thread apply all backtrace full
 ```

-Note that GDB should be made aware of D garbage collector:
+Note that GDB should be made aware of the D garbage collector, which emits
+SIGUSR1 and SIGUSR2 signals; GDB needs to ignore them with

 ```
 handle SIGUSR1 SIGUSR2 nostop noprint
@@ -313,13 +316,14 @@ gdb -ex 'handle SIGUSR1 SIGUSR2 nostop noprint' \
 <a name="license"></a>
 # License

-Sambamba is distributed under GNU Public License v2+.
+Sambamba is generously distributed under the GNU General Public License v2+.

 <a name="credits"></a>
 # Credit

-If you are using Sambamba in your research and want to support future
-work on Sambamba, please cite the following publication:
+Citations are the bread and butter of science. If you are using
+Sambamba in your research and want to support our future work on
+Sambamba, please cite the following publication:

 A. Tarasov, A. J. Vilella, E. Cuppen, I. J. Nijman, and P. Prins. [Sambamba: fast processing of NGS alignment formats](https://doi.org/10.1093/bioinformatics/btv098). Bioinformatics, 2015.