Commit 97c5b3d

README
1 parent 6485680 commit 97c5b3d

1 file changed

+25
-21
lines changed

README.md

Lines changed: 25 additions & 21 deletions
@@ -1,14 +1,12 @@
-[![Build Status](https://travis-ci.org/biod/sambamba.svg?branch=master)](https://travis-ci.org/biod/sambamba) [![AnacondaBadge](https://anaconda.org/bioconda/sambamba/badges/installer/conda.svg)](https://anaconda.org/bioconda/sambamba) [![DL](https://anaconda.org/bioconda/sambamba/badges/downloads.svg)](https://anaconda.org/bioconda/sambamba) [![BrewBadge](https://img.shields.io/badge/%F0%9F%8D%BAbrew-sambamba-brightgreen.svg)](https://github.com/brewsci/homebrew-bio)
+g[![Build Status](https://travis-ci.org/biod/sambamba.svg?branch=master)](https://travis-ci.org/biod/sambamba) [![AnacondaBadge](https://anaconda.org/bioconda/sambamba/badges/installer/conda.svg)](https://anaconda.org/bioconda/sambamba) [![DL](https://anaconda.org/bioconda/sambamba/badges/downloads.svg)](https://anaconda.org/bioconda/sambamba) [![BrewBadge](https://img.shields.io/badge/%F0%9F%8D%BAbrew-sambamba-brightgreen.svg)](https://github.com/brewsci/homebrew-bio)
 [![GuixBadge](https://img.shields.io/badge/gnuguix-sambamba-brightgreen.svg)](https://www.gnu.org/software/guix/packages/S/)
 [![DebianBadge](https://badges.debian.net/badges/debian/testing/sambamba/version.svg)](https://packages.debian.org/testing/sambamba)
 
-# sambamba
+# SAMBAMBA
 
 Table of Contents
 =================
 
-* [sambamba](#sambamba)
-* [Table of Contents](#table-of-contents)
 * [Introduction](#introduction)
 * [Binary installation](#binary-installation)
 * [Install stable release](#install-stable-release)
@@ -40,41 +38,45 @@ Table of Contents
 
 Sambamba is a high performance highly parallel robust and fast tool
 (and library), written in the D programming language, for working with
-SAM and BAM files. Because of its efficiency is an important work
-horse running in many sequencing centres around the world
-today.
+SAM and BAM files. Because of its efficiency Sambamba is an important
+work horse running in many sequencing centres around the world today.
+As of November 2019, Sambamba has been cited over
+[277 times](http://scholar.google.nl/citations?hl=en&user=5ijHQRIAAAAJ)
+and has been installed from Conda over
+[120K times](https://anaconda.org/bioconda/sambamba).
 
 Current functionality is an important subset of samtools
 functionality, including view, index, sort, markdup, and depth. Most
 tools support piping: just specify `/dev/stdin` or `/dev/stdout` as
 filenames. When we started writing sambamba (in 2012) the main
 advantage over `samtools` was parallelized BAM reading and writing.
-In March 2017 `samtools` 1.4 was released, reaching parity on this. A
+In March 2017 `samtools` 1.4 was released, reaching parity at least on
+architecture. A
 [recent performance comparison](https://github.com/guigolab/sambamBench-nf)
-shows that sambamba holds its ground and can do better in different
-configurations. Here are some comparison
+shows that sambamba still holds its ground and can even do better.
+Here are some comparison
 [metrics](https://public-docs.crg.es/rguigo/Data/epalumbo/sambamba_ws_report.html). For
-example for flagstat sambamba is 1.4x faster than samtools. For index
+example, for flagstat sambamba is 1.4x faster than samtools. For index
 they are similar. For Markdup almost 6x faster and for view 4x
-faster. For sort sambamba has been beaten, though sambamba is up to 2x
-faster than samtools on large RAM machines (120GB+).
+faster. For sort sambamba has been beaten, though sambamba is notably
+up to 2x faster than samtools on large RAM machines (120GB+).
 
 In addition sambamba has a few interesting features to offer, in particular
 
-- faster large machine `sort`, see [performance](./test/benchmark/stats.org)
+- fast large machine `sort`, see [performance](./test/benchmark/stats.org)
 - automatic index creation when writing any coordinate-sorted file
 - `view -L <bed file>` utilizes BAM index to skip unrelated chunks
 - `depth` allows to measure base, sliding window, or region coverages
 - [Chanjo](https://www.chanjo.co/) builds upon this and gets you to exon/gene levels of abstraction
 - `markdup`, a fast implementation of Picard algorithm
 - `slice` quickly extracts a region into a new file, tweaking only first/last chunks
-- and more
+- and more (you'll have to try)
 
 Even though Sambamba started out as a samtools clone we are now in the
 process of adding new functionality - also in the
 [BioD project](https://github.com/biod/BioD). The D language is
-extremely suitable for high performance computing. At this point we
-think that the BAM format is here to stay for processing sequencing
+extremely suitable for high performance computing (HPC). At this point
+we think that the BAM format is here to stay for processing sequencing
 data and we aim to make it easy to parse and process BAM files.
 
 Sambamba is free and open source software, licensed under GPLv2+.
@@ -289,7 +291,8 @@ command). A full stacktrace for all threads:
 thread apply all backtrace full
 ```
 
-Note that GDB should be made aware of D garbage collector:
+Note that GDB should be made aware of D garbage collector which emits
+SIGUSR signals and gdb needs to ignore them with
 
 ```
 handle SIGUSR1 SIGUSR2 nostop noprint
@@ -313,13 +316,14 @@ gdb -ex 'handle SIGUSR1 SIGUSR2 nostop noprint' \
 <a name="license"></a>
 # License
 
-Sambamba is distributed under GNU Public License v2+.
+Sambamba is generously distributed under GNU Public License v2+.
 
 <a name="credits"></a>
 # Credit
 
-If you are using Sambamba in your research and want to support future
-work on Sambamba, please cite the following publication:
+Citations are the bread and butter of Science. If you are using
+Sambamba in your research and want to support our future work on
+Sambamba, please cite the following publication:
 
 A. Tarasov, A. J. Vilella, E. Cuppen, I. J. Nijman, and P. Prins. [Sambamba: fast processing of NGS alignment formats](https://doi.org/10.1093/bioinformatics/btv098). Bioinformatics, 2015.
 