Name: sambamba
Owner: BioD
Description: Tools for working with SAM/BAM/CRAM data
Created: 2012-04-28 13:46:53.0
Updated: 2017-12-19 09:32:07.0
Pushed: 2017-11-23 13:12:09.0
Homepage: http://thebird.nl/blog/D_Dragon.html
Size: 3135
Language: D
Sambamba is a high-performance, robust tool (and library), written in the D programming language, for working with SAM and BAM files. Current functionality is an important subset of samtools functionality, including view, index, sort, markdup, and depth. Most tools support piping: just specify /dev/stdin or /dev/stdout as filenames.
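As a minimal sketch of the piping convention, the function below converts SAM text arriving on standard input into BAM on standard output. The helper name `sam_stdin_to_bam_stdout` is hypothetical, and the sketch assumes that `view` is among the tools that accept the device files; `-S` (SAM input), `-f bam` (output format) and `-o` (output file) are options of `sambamba view`.

```sh
# Convert SAM on stdin to BAM on stdout by passing the special device
# files as filenames (hypothetical helper name).
sam_stdin_to_bam_stdout() {
    sambamba view -S -f bam -o /dev/stdout /dev/stdin
}

# Usage (file names are placeholders):
#   sam_stdin_to_bam_stdout < reads.sam > reads.bam
```

Wrapping the call in a function keeps the device-file arguments in one place, so the helper can sit in the middle of a longer pipeline.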
For almost 5 years the main advantage over samtools was parallelized BAM reading. Finally, in March 2017, samtools 1.4 was released, reaching parity on this.
That said, we still have quite a few interesting features to offer:
- sort (no benchmarks yet, sorry)
- view -L <bed file> utilizes the BAM index to skip unrelated chunks
- depth measures base, sliding window, or region coverage
- markdup, a fast implementation of the Picard algorithm
- slice quickly extracts a region into a new file, tweaking only the first/last chunks
Sambamba is free and open source software, licensed under GPLv2+. See the manual pages online to learn more about what is available and how to use it.
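The index-based features listed above could be wrapped roughly as follows. This is a sketch, not the project's recommended workflow: the helper names, file names and the region string are hypothetical placeholders, and it assumes the index lives next to the BAM file as `<file>.bai`.

```sh
# Hypothetical helpers; file names and regions are placeholders.

# Build the BAM index only when it is not there yet.
ensure_index() {       # usage: ensure_index in.bam
    [ -f "$1.bai" ] || sambamba index "$1"
}

# Keep only reads overlapping the intervals in a BED file.
view_bed_regions() {   # usage: view_bed_regions in.bam regions.bed out.bam
    ensure_index "$1" &&
    sambamba view -f bam -L "$2" -o "$3" "$1"
}

# Copy one region (e.g. chr1:10000-20000) into a new BAM file.
slice_region() {       # usage: slice_region in.bam chr1:10000-20000 out.bam
    ensure_index "$1" &&
    sambamba slice -o "$3" "$1" "$2"
}
```

Both helpers index first because `view -L` and `slice` rely on the BAM index to jump directly to the relevant chunks.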
For more information on Sambamba you can contact Artem Tarasov and Pjotr Prins.
For those not in the mood to learn/install new package managers, there are GitHub source and binary releases. Simply download the tarball, unpack it and run it. For example:
    wget https://github.com/biod/sambamba/releases/download/v0.6.6/sambamba_v0.6.6_linux.tar.bz2
    tar xvjf sambamba_v0.6.6_linux.tar.bz2
    ./sambamba_v0.6.6
    sambamba 0.6.6
    Usage: sambamba [command] [args...]
    Available commands: 'view', 'index', 'merge', 'sort',
    'flagstat', 'slice', 'markdup', 'depth', 'mpileup'
    To get help on a particular command, just call it without args.
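The download steps can be parameterized by version, as in the sketch below. The helper names are hypothetical, and the URL and file-name pattern is simply copied from the v0.6.6 example; whether other releases follow the same naming is an assumption.

```sh
# Build the download URL for a Linux release tarball from a version
# number, following the v0.6.6 example (assumed pattern for other versions).
sambamba_release_url() {
    ver=$1
    printf 'https://github.com/biod/sambamba/releases/download/v%s/sambamba_v%s_linux.tar.bz2\n' "$ver" "$ver"
}

# Download, unpack and run a release in the current directory.
fetch_sambamba_release() {
    ver=$1
    url=$(sambamba_release_url "$ver")
    wget "$url" &&
    tar xvjf "sambamba_v${ver}_linux.tar.bz2" &&
    "./sambamba_v${ver}"
}

# Usage:
#   fetch_sambamba_release 0.6.6
```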
The latest pre-release of sambamba 0.6.7 for Linux, which includes debug information and all dependencies, is available as a tarball; see the URL in the example below. This 24 MB download reflects the development edition and includes recent versions of libraries, samtools and bcftools. It should install on any Linux distribution, including old ones on HPC clusters.
Install the tarball by unpacking it and running the contained install script with a target directory, e.g.
    wget http://test-gn2.genenetwork.org/ipfs/QmakasNfZhdbPA3xJYNxNX7at5FtYnS4hUNnvDbzxhZf2J/hb13hjys1064jmb6z17yc1f822hv9zsz-sambamba-0.6.7-pre1-7cff065-x86_64.tar.bz2
    tar xvjf hb13hjys1064jmb6z17yc1f822hv9zsz-sambamba-0.6.7-pre1-7cff065-x86_64.tar.bz2
    ./install.sh ~/sambamba-test
    ~/sambamba-test/bin/sambamba
    sambamba 0.6.7-pre1
    Usage: sambamba [command] [args...]
    Available commands: 'view', 'index', 'merge', 'sort',
    'flagstat', 'slice', 'markdup', 'depth', 'mpileup'
Binaries are also available through the following packaging tools (note the version numbers):
With Conda, use the bioconda channel.
A GNU Guix package for sambamba is available. The development version is packaged here.
Debian: see Debian packages.
Users of Homebrew can also use the formula from homebrew-science.
Sambamba has a mailing list for installation help and general discussion.
Before posting an issue, search the issue tracker and mailing list first. It is likely that someone has encountered something similar. Also try running the latest version of sambamba to make sure the problem has not been fixed already. Support and installation questions should go to the mailing list. The issue tracker is for development issues around the software itself. When reporting an issue, include the output of the program and the contents of the output directory.
To find bugs, the sambamba developers may ask you to install a development version of the software. They may also ask you for your data and will treat it confidentially. Please always remember that sambamba is written and maintained by volunteers with good intentions. Our time is valuable too. By helping us as much as possible, you make it possible for us to provide this tool for everyone to use.
By using sambamba and communicating with its community you implicitly agree to abide by the code of conduct as published by the Software Carpentry initiative.
Note: in general there is no need to compile sambamba. You can use a recent binary install as listed above.
The preferred method for compiling Sambamba is with the LDC compiler which targets LLVM.
The LDC compiler's GitHub repository provides binary images. The current preferred release for sambamba is LDC, the LLVM D compiler (>= 1.6.1). Install LDC from https://github.com/ldc-developers/ldc/releases/ with, for example (run from your home directory so that the paths below match):
    wget https://github.com/ldc-developers/ldc/releases/download/v1.7.0/ldc2-1.7.0-linux-x86_64.tar.xz
    tar xvJf ldc2-1.7.0-linux-x86_64.tar.xz
    export PATH=$HOME/ldc2-1.7.0-linux-x86_64/bin:$PATH
    export LIBRARY_PATH=$HOME/ldc2-1.7.0-linux-x86_64/lib
Then fetch the sambamba source and build it:
    git clone --recursive https://github.com/biod/sambamba.git
    cd sambamba
    make
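Before starting a build it can help to confirm that the required tools are reachable through PATH. The sketch below uses a hypothetical helper, `missing_tools`; the tool names in the usage line (`ldc2`, `git`, `make`) are an assumption about what the build needs, based on the steps described here.

```sh
# Print the name of every listed command that is not on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage: stop early when something is missing.
#   missing=$(missing_tools ldc2 git make)
#   [ -z "$missing" ] || { echo "missing: $missing" >&2; exit 1; }
```

An empty result means the exported PATH already points at the unpacked LDC bin directory.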
To build a debug release, run
    make clean && make debug
To run the tests, fetch shunit2 from https://github.com/kward/shunit2 and put it in the path so you can run
    make check
The LDC compiler needed to build sambamba is also available in GNU Guix:
    guix package -i ldc
Note: when building with Homebrew, the Makefile does not work. Would someone like to fix that using the Makefile.old version? See also https://github.com/biod/sambamba/issues/338.
    brew install ldc
    git clone --recursive https://github.com/biod/sambamba.git
    cd sambamba
    git clone https://github.com/dlang/undeaD
    make sambamba-ldmd2-64
Sambamba development and the issue tracker are on GitHub. Developer documentation can be found in the source code and the development documentation.
Important note: some popular Xeon processors segfault under heavy hyper-threading, which Sambamba utilizes. Please read this when encountering seemingly random crashes.
On a crash, sambamba can dump a core file. To make this happen, set
    ulimit -c unlimited
and run your command. Send us the core file so we can reproduce the state at the time of the segfault.
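To avoid changing the limit for the whole session, the setting can be confined to a subshell. The following is a minimal sketch with a hypothetical helper name, `run_with_core`; where the core file ends up depends on the system's core pattern configuration.

```sh
# Run a command with core dumps enabled, without changing the limit
# of the calling shell (the subshell keeps the change local).
run_with_core() {
    (
        ulimit -c unlimited 2>/dev/null ||
            echo "warning: could not raise the core file size limit" >&2
        "$@"
    )
}

# Usage (arguments are placeholders):
#   run_with_core ./build/sambamba view input.bam
```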
Another option is to use catchsegv:
    catchsegv ./build/sambamba command
This will show the state on stdout, which can be sent to us.
In case of crashes it is helpful to have GDB stack traces (bt command). A full stack trace for all threads:
    thread apply all backtrace full
Note that GDB should be made aware of the D garbage collector:
    handle SIGUSR1 SIGUSR2 nostop noprint
A binary relocatable install of sambamba with debug information and all dependencies can be fetched from the binary link above. Unpack the tarball and run the contained install.sh script with a TARGET directory:
    ./install.sh ~/sambamba-test
Run sambamba in gdb with
    gdb -ex 'handle SIGUSR1 SIGUSR2 nostop noprint' \
        --args ~/sambamba-test/sambamba-*/bin/sambamba view --throw-error
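For bug reports it may be convenient to collect the stack traces non-interactively. The sketch below combines the GDB commands mentioned in this section with gdb's `-batch` mode; the helper name `gdb_backtrace` is hypothetical, and the approach is a suggestion rather than the maintainers' documented procedure.

```sh
# Run a command under gdb in batch mode and print full backtraces of
# all threads once it stops (hypothetical helper name).
gdb_backtrace() {
    gdb -batch \
        -ex 'handle SIGUSR1 SIGUSR2 nostop noprint' \
        -ex 'run' \
        -ex 'thread apply all backtrace full' \
        --args "$@"
}

# Usage (arguments are placeholders):
#   gdb_backtrace ~/sambamba-test/bin/sambamba view input.bam > trace.txt 2>&1
```

If the program exits normally there are no threads left to inspect, so the backtrace step only produces useful output after a crash.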
Sambamba is distributed under the GNU General Public License v2+.
If you are using Sambamba in your research and want to support future work on Sambamba, please cite the following publication:
A. Tarasov, A. J. Vilella, E. Cuppen, I. J. Nijman, and P. Prins. Sambamba: fast processing of NGS alignment formats. Bioinformatics, 2015.
@article{doi:10.1093/bioinformatics/btv098,
  author = {Tarasov, Artem and Vilella, Albert J. and Cuppen, Edwin and Nijman, Isaac J. and Prins, Pjotr},
  title = {Sambamba: fast processing of NGS alignment formats},
  journal = {Bioinformatics},
  volume = {31},
  number = {12},
  pages = {2032-2034},
  year = {2015},
  doi = {10.1093/bioinformatics/btv098},
  URL = {http://dx.doi.org/10.1093/bioinformatics/btv098}
}