Bioinformaticsnl/Jellyfish

Name: Jellyfish

Owner: Bioinformatics Netherlands

Description: A fast multi-threaded k-mer counter

Created: 2014-04-14 22:44:39.0

Updated: 2014-04-14 22:44:40.0

Pushed: 2014-04-11 11:50:49.0

Homepage: null

Size: 3302

Language: C++

README

Jellyfish

Overview

Jellyfish is a tool for fast, memory-efficient counting of k-mers in DNA. A k-mer is a substring of length k, and counting the occurrences of all such substrings is a central step in many analyses of DNA sequence. Jellyfish counts k-mers an order of magnitude faster, and with an order of magnitude less memory, than other k-mer counting packages. It does so by using an efficient encoding of a hash table and by exploiting the “compare-and-swap” CPU instruction to increase parallelism.
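To make the two ideas above concrete, here is a minimal C++ sketch. It is not Jellyfish's implementation: count_kmers is a naive single-threaded counter built on std::unordered_map, and cas_increment only illustrates how a counter can be updated without a lock using compare-and-swap. Both function names are hypothetical and chosen for this example.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Illustrative only: count every substring of length k in seq.
// Windows containing any character other than uppercase A, C, G, T are skipped.
std::unordered_map<std::string, std::uint64_t>
count_kmers(const std::string& seq, std::size_t k) {
    std::unordered_map<std::string, std::uint64_t> counts;
    if (k == 0 || seq.size() < k) return counts;
    for (std::size_t i = 0; i + k <= seq.size(); ++i) {
        const std::string kmer = seq.substr(i, k);
        if (kmer.find_first_not_of("ACGT") != std::string::npos) continue;
        ++counts[kmer];
    }
    return counts;
}

// Illustrative only: lock-free increment of a shared counter.
// compare_exchange_weak stores seen + 1 only if the slot still holds `seen`;
// on failure it reloads `seen` with the current value and the loop retries.
std::uint64_t cas_increment(std::atomic<std::uint64_t>& slot) {
    std::uint64_t seen = slot.load(std::memory_order_relaxed);
    while (!slot.compare_exchange_weak(seen, seen + 1,
                                       std::memory_order_relaxed)) {
        // retry with the refreshed value in `seen`
    }
    return seen + 1;
}
```

For example, count_kmers("ACGTACGT", 3) finds four distinct 3-mers, with ACG and CGT each seen twice. A real counter avoids storing each k-mer as a string and lets many threads update a shared table, which is where a compare-and-swap loop like the one above becomes useful.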

Jellyfish is a command-line program that reads FASTA and multi-FASTA files containing DNA sequences. It outputs its k-mer counts in a binary format, which can be translated into a human-readable text format using the “jellyfish dump” command. See the documentation below for more details.

If you use Jellyfish in your research, please cite:

Guillaume Marcais and Carl Kingsford, A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics (2011) 27(6): 764-770 (first published online January 7, 2011) doi:10.1093/bioinformatics/btr011

Installation

To get a packaged tarball of the source code, see the Jellyfish home page at the University of Maryland.

To compile from the git tree, you will need autoconf/automake, make, g++ 4.4 or newer, and yaggo. Then compile with:

autoreconf -i
./configure
make
sudo make install

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.