Bioinformaticsnl/allbiotc2

Name: allbiotc2

Owner: Bioinformatics Netherlands

Description: Benchmark pipeline for Structural Variation analyses, funded by the ALLBio

Created: 2014-05-16 10:30:13.0

Updated: 2014-05-16 10:30:14.0

Pushed: 2014-04-22 18:07:44.0

Homepage: http://www.allbioinformatics.eu/doku.php?id=start

Size: 10782

Language: Python

README

SV-Autopilot

Structural Variation AUTOmated PIpeLine Optimization Tool

by: Wai Yi Leung, Tobias Marschall, Laurent Falquet, Yogesh Paudel, Hailiang Mei, Alex Schoenhuth and Tiffanie Yael Moss

This repository stores the scripts written during the ALLBio Testcase 2 hackathon.

We aim at providing :

More information about the project can be found at the following websites:

ALLBio Bioinformatics, Testcase #2, Google site (members only)

How to install

Clone this repository from GitHub into your home folder; the commands below place it in ~/allbiotc2:

cd ~
git clone https://github.com/ALLBio/allbiotc2.git
cd allbiotc2/
make install

The make install command performs a system-wide installation, so this step requires sudo rights.

Installation instructions for sysadmins (advanced)

The installation scripts are located in the following repository. They were used to set up the workshop-ready and production-ready virtual machines.

https://github.com/ALLBio/allbiovm

Comments are welcome via the GitHub issue tracker.

Preprocessing reference VCF (optional)

If the reference calls are provided in SDI format, they can be converted to VCF as follows:

make -f ../scripts/Makefile \
    REFERENCE_VCF=~/myworkdir/ref_all.complete.vcf \
    SDI_FILE=~/myworkdir/ler_0.v7c.sdi \
    preprocess

Installing the software

The software for the pipeline is placed into one central location in the following setup:

allbio@workbench:/virdir/Scratch/software$ tree -L 1
.
├── bowtie2-2.1.0
├── breakdancer
├── bwa-0.7.4
├── circos-0.63-4
├── clever-sv
├── delly_v0.0.9
├── dwac-seq0.7
├── FastQC
├── gasv
├── picard-tools-1.86
├── pindel
├── PRISM_1_1_6
├── samtools-0.1.19
├── sickle-master
└── SVDetect_r0.8b

Running the pipeline

The pipeline is configured in conf.mk; individual variables can also be set at invocation time by passing them on the command line.

The most important and required variables are:

Example invocation of the pipeline:

THREADS=8

make -f ../scripts/Makefile \
    PROGRAMS=/virdir/Scratch/software \
    REFERENCE_DIR=../input/reference_tair9 \
    FASTQC_THREADS=$THREADS \
    BWA_OPTION_THREADS=$THREADS \
    PEA_MARK=.1 \
    PEB_MARK=.2 \
    FASTQ_EXTENSION=fastq \
    REFERENCE_VCF=/virdir/Backup/reads_and_reference/vcf_reference/ref_all.complete.vcf 
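
When the pipeline is launched from a script rather than typed by hand, the same invocation can be assembled programmatically. The Python sketch below is only an illustration: the helper build_make_command is hypothetical and not part of this repository, and the paths and variable names are copied from the example above. In GNU make, variables given on the command line take precedence over ordinary assignments inside a makefile, which is presumably how these values override the defaults in conf.mk.

```python
import shlex


def build_make_command(makefile, variables, target=None):
    """Return the argument list for: make -f <makefile> VAR=value ... [target].

    Hypothetical helper for illustration; not shipped with the pipeline.
    """
    argv = ["make", "-f", makefile]
    for name, value in variables.items():
        # Each setting becomes one VAR=value argument on the make command line.
        argv.append(f"{name}={value}")
    if target is not None:
        argv.append(target)
    return argv


threads = 8
settings = {
    "PROGRAMS": "/virdir/Scratch/software",
    "REFERENCE_DIR": "../input/reference_tair9",
    "FASTQC_THREADS": threads,
    "BWA_OPTION_THREADS": threads,
    "PEA_MARK": ".1",
    "PEB_MARK": ".2",
    "FASTQ_EXTENSION": "fastq",
    "REFERENCE_VCF": "/virdir/Backup/reads_and_reference/vcf_reference/ref_all.complete.vcf",
}

command = build_make_command("../scripts/Makefile", settings)

# Print the command as a single shell-quoted line for inspection.
print(shlex.join(command))
```

To actually start the run, the list can be handed to subprocess.run(command, check=True) from inside the run directory. Keeping the settings in a dictionary makes it easy to reuse one configuration for several runs and change only the thread count or the reference.
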

Example setup of pipeline directories

allbio@workbench:/opt/allbio/runs/synthetic_run$ tree -L 1
.
├── input
│   ├── reference_tair10
│   │   ├── bowtie2
│   │   ├── bwa
│   │   ├── reference.fa
│   │   └── reference.fa.fai
│   ├── sim-reads_1.fastq
│   ├── sim-reads_2.fastq
│   ├── sim-reads.409_10.1.fastq
│   ├── sim-reads.409_10.2.fastq
│   ├── sim-reads.511_10.1.fastq
│   └── sim-reads.511_10.2.fastq
├── log
├── run_integrationtest
│   ├── bd.cfg
│   ├── comparison.tex
│   ├── run.sh
│   ├── sim-read-511_10.1.fastq -> ../input/sim-reads.511_10.1.fastq
│   ├── sim-read-511_10.1.filtersync.stats
│   ├── sim-read-511_10.1.singles.fastq
│   ├── sim-read-511_10.1.trimmed.fastq
│   ├── sim-read-511_10.2.fastq -> ../input/sim-reads.511_10.2.fastq
│   ├── sim-read-511_10.2.trimmed.fastq
│   ├── sim-read-511_10.bam
│   ├── sim-read-511_10.bam.bai
│   ├── sim-read-511_10.bd.vcf
│   ├── sim-read-511_10.breakdancer
│   ├── sim-read-511_10.delly
│   ├── sim-read-511_10.delly.vcf
│   ├── sim-read-511_10.flagstat
│   ├── sim-read-511_10.gasv
│   ├── sim-read-511_10.gasv.vcf
│   ├── sim-read-511_10.pindel
│   ├── sim-read-511_10.pindel.vcf
│   ├── sim-read-511_10.prism
│   ├── sim-read-511_10.prism.vcf
│   ├── sim-read-511_10.raw_fastqc
│   ├── sim-read-511_10.sam
│   ├── sim-read-511_10.trimmed_fastqc
│   └── sim-read-511_10.unsort.bam
└── scripts
    └── Makefile -> ~/allbiotc2/Makefile
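
The input file names in this layout line up with the PEA_MARK, PEB_MARK and FASTQ_EXTENSION variables from the example invocation: sim-reads.511_10.1.fastq and sim-reads.511_10.2.fastq share the sample prefix sim-reads.511_10. The Python sketch below checks which samples in a listing have both mates. It assumes that read files are named <sample><mark>.<extension>; this is inferred from the listing above, not taken from the Makefile, and the function find_read_pairs is hypothetical.

```python
def find_read_pairs(filenames, pea_mark=".1", peb_mark=".2", extension="fastq"):
    """Group file names into (sample, first_mate, second_mate) tuples.

    Assumes the naming scheme <sample><mark>.<extension>, which is an
    inference from the example directory listing.
    """
    suffix_a = f"{pea_mark}.{extension}"
    suffix_b = f"{peb_mark}.{extension}"
    names = set(filenames)
    pairs = []
    for name in sorted(names):
        if not name.endswith(suffix_a):
            continue
        # Strip the first-mate suffix to recover the sample prefix.
        sample = name[: -len(suffix_a)]
        mate = sample + suffix_b
        # Only report samples for which the second mate is present too.
        if mate in names:
            pairs.append((sample, name, mate))
    return pairs


# File names copied from the input directory shown above.
listing = [
    "sim-reads_1.fastq",
    "sim-reads_2.fastq",
    "sim-reads.409_10.1.fastq",
    "sim-reads.409_10.2.fastq",
    "sim-reads.511_10.1.fastq",
    "sim-reads.511_10.2.fastq",
]

for sample, first, second in find_read_pairs(listing):
    print(sample, first, second)
```

With the marks .1 and .2, the files sim-reads_1.fastq and sim-reads_2.fastq are not matched; under the same assumption they would pair up with pea_mark="_1" and peb_mark="_2". For a real run directory, the listing can be replaced by os.listdir("input").
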
