cbcrg/kallisto-nf-reproduce

Name: kallisto-nf-reproduce

Owner: Notredame Lab

Description: Experiment illustrating how Nextflow can make biological pipelines reproducible

Forked from: skptic/kallisto-nf-reproduce

Created: 2016-05-22 21:51:58.0

Updated: 2016-05-22 21:52:02.0

Pushed: 2017-03-20 16:50:12.0

Homepage: null

Size: 704037

Language: R

README

kallisto-nf-reproduce

This repository contains the software, scripts and data needed to reproduce the RNA-Seq results described in the Nextflow publication.

The repository contains two versions of a traditional Bash-style pipeline, one for Mac and one for Linux (kallisto-mac and kallisto-linux), as well as the Nextflow version of the pipeline, which is compatible across platforms (kallisto-nf).

Folder structure
How to replicate the results
Clone the repository

kallisto-nf exists as a Git submodule within this repository. To clone the repository together with the submodule, pass the --recursive flag:

git clone --recursive https://github.com/cbcrg/kallisto-nf-reproduce.git
cd kallisto-nf-reproduce
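
If the repository was cloned without --recursive, the kallisto-nf directory will be empty. The following is a minimal sketch for checking this, assuming the submodule provides kallisto-nf/kallisto.nf (the path used by the Nextflow command later in this README). The helper name submodule_ready is hypothetical; git submodule update --init --recursive is the standard Git command for fetching missing submodules.

```shell
# Hypothetical helper: succeeds if the kallisto-nf submodule has been checked out.
submodule_ready() {
    # $1: repository root (defaults to the current directory)
    [ -f "${1:-.}/kallisto-nf/kallisto.nf" ]
}

if submodule_ready; then
    echo "kallisto-nf submodule is present"
else
    echo "kallisto-nf submodule is missing; run: git submodule update --init --recursive"
fi
```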
Data

All data is available from the original sources and also as a single compressed tarball (~22GB).

To download and uncompress the data, use the following commands:

mkdir data
wget -O- https://zenodo.org/record/159158/files/kallisto_data.tar.gz | tar xz -C data
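
After the download, it can be worth confirming that the inputs used by the pipeline commands in this README are in place. The sketch below assumes the tarball unpacks directly into the layout those commands expect (data/raw_reads, data/transcriptome, data/exp_info); that layout is inferred from the commands, not confirmed by the archive itself. The helper name missing_inputs is hypothetical.

```shell
# Hypothetical helper: print each expected input path that does not exist.
missing_inputs() {
    data_dir="${1:-data}"
    for f in \
        "$data_dir/raw_reads" \
        "$data_dir/transcriptome/Homo_sapiens.GRCh38.rel79.cdna.all.fa" \
        "$data_dir/exp_info/hiseq_info.txt"
    do
        # Report the path only when it is absent
        [ -e "$f" ] || echo "$f"
    done
}

# Prints nothing when all three inputs are present
missing_inputs data
```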
Original Sources

If you wish to retrieve the data from the original sources, you can find it here:

Native Linux

Install Kallisto version 0.42.4.

Install Sleuth

Launch the kallisto Bash pipeline script by running the following command:

./kallisto-linux/kallisto-std.sh \
    data/raw_reads \
    data/transcriptome/Homo_sapiens.GRCh38.rel79.cdna.all.fa  \
    data/exp_info/hiseq_info.txt \
    results-linux
Native Mac

Install Kallisto version 0.42.4.

Install Sleuth

Launch the kallisto Bash pipeline script by running the following command:

./kallisto-mac/kallisto-std.sh \
    data/raw_reads \
    data/transcriptome/Homo_sapiens.GRCh38.rel79.cdna.all.fa  \
    data/exp_info/hiseq_info.txt \
    results-mac
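
Both native variants pin Kallisto to version 0.42.4, so a quick version check before launching can save a failed or non-comparable run. This is a sketch only: it assumes that kallisto version prints a line ending in the version number (for example "kallisto, version 0.42.4"), and is_expected_kallisto is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds only if the last word of the given string is 0.42.4.
is_expected_kallisto() {
    # $1: output of `kallisto version` (assumed format: "kallisto, version X.Y.Z")
    [ "${1##* }" = "0.42.4" ]
}

if command -v kallisto >/dev/null 2>&1; then
    if is_expected_kallisto "$(kallisto version)"; then
        echo "kallisto 0.42.4 found"
    else
        echo "unexpected kallisto version: $(kallisto version)"
    fi
else
    echo "kallisto not found on PATH"
fi
```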
Nextflow (Mac & Linux)

Install Nextflow with the following command:

curl -fsSL get.nextflow.io | bash

Install Docker by following the instructions on this page.

Pull the Docker image used for this experiment (optional):

docker pull cbcrg/kallisto-nf@sha256:9f840127392d04c9f8e39cb72bcd62ff53cfe0492dde02dc3749bf15f1c547f1 

Once the read data has been downloaded from SRA, you can run the Nextflow version of the pipeline, which lives in the kallisto-nf directory, from the repository root using the following command:

nextflow run kallisto-nf/kallisto.nf \
    --reads 'data/raw_reads/SRR4933*_{1,2}.fastq' \
    --transcriptome data/transcriptome/Homo_sapiens.GRCh38.rel79.cdna.all.fa \
    --experiment data/exp_info/hiseq_info.txt \
    --output kallisto-nf-results \
    -with-docker
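
The --reads pattern is quoted so that the shell passes it to Nextflow unexpanded. The pattern suggests that each sample is expected to have a _1 and a _2 FASTQ file; how kallisto.nf actually groups them is an assumption here. As a rough pre-flight check, the sketch below lists the sample IDs for which both mates exist. The helper name paired_samples is hypothetical.

```shell
# Hypothetical helper: print sample IDs that have both _1 and _2 FASTQ files.
paired_samples() {
    reads_dir="${1:-data/raw_reads}"
    for r1 in "$reads_dir"/SRR4933*_1.fastq; do
        [ -e "$r1" ] || continue            # the glob matched nothing
        r2="${r1%_1.fastq}_2.fastq"         # path of the matching second mate
        if [ -e "$r2" ]; then
            basename "${r1%_1.fastq}"       # print the sample ID only
        fi
    done
    return 0
}

paired_samples data/raw_reads
```

A sample that appears in data/raw_reads but not in this list is missing one of its two read files.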

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.