Name: lncRNA-Annotation-nf
Owner: Notredame Lab
Description: A Nextflow lncRNA Annotation Pipeline based on STAR, Cufflinks and FEELnc
Forked from: skptic/lncRNA-Annotation-nf
Created: 2016-05-30 11:45:53.0
Updated: 2018-01-12 03:31:49.0
Pushed: 2017-10-08 16:33:41.0
Homepage: null
Size: 227386
Language: Awk
A Nextflow implementation of a lncRNA Annotation Pipeline
Make sure you have all the required dependencies listed in the last section or run with Docker.
Install the Nextflow runtime by running the following command:
$ curl -fsSL get.nextflow.io | bash
When done, you can launch the pipeline with Docker by entering the commands shown below:
$ docker pull cbcrg/lncrna_annotation
$ nextflow run cbcrg/lncRNA-Annotation-nf -profile test
By default the pipeline is executed against the provided example dataset.
See the Pipeline parameters section below for how to supply your own data on the
command line.
--reads
Expected file extension: .fastq.gz
Default: ../tutorial/reads/*.fastq.gz
Example:
$ nextflow run cbcrg/lncRNA-Annotation-nf --reads '/home/dataset/*.fastq.gz'
This will handle each fastq file as a separate sample.
Read pairs can be specified using a glob file pattern. Consider a more complex situation with three samples (A, B and C) of paired-end reads. The read files could be:
sample_A_1.fastq.gz
sample_A_2.fastq.gz
sample_B_1.fastq.gz
sample_B_2.fastq.gz
sample_C_1.fastq.gz
sample_C_2.fastq.gz
The reads may be specified as below:
$ nextflow run cbcrg/lncRNA-Annotation-nf --reads '/home/dataset/sample_*_{1,2}.fastq.gz'
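To make the effect of the `{1,2}` pattern concrete, the following minimal Python sketch groups the six files listed above into one pair per sample. It is only an illustration of the grouping idea: `group_read_pairs` and `PAIR_RE` are hypothetical names that do not exist in the pipeline, and how the pipeline itself pairs the files internally is not shown in this README.

```python
import re
from collections import defaultdict

# Hypothetical pattern mirroring 'sample_*_{1,2}.fastq.gz':
# everything before the final _1 / _2 is treated as the sample name.
PAIR_RE = re.compile(r"^(?P<sample>.+)_(?P<mate>[12])\.fastq\.gz$")


def group_read_pairs(filenames):
    """Group file names into (mate 1, mate 2) tuples keyed by sample name."""
    mates_by_sample = defaultdict(dict)
    for name in filenames:
        match = PAIR_RE.match(name)
        if match is None:
            continue  # file does not follow the <sample>_<1|2>.fastq.gz layout
        mates_by_sample[match.group("sample")][match.group("mate")] = name
    # Keep only samples for which both mates were found.
    return {
        sample: (mates["1"], mates["2"])
        for sample, mates in sorted(mates_by_sample.items())
        if len(mates) == 2
    }


files = [
    "sample_A_1.fastq.gz", "sample_A_2.fastq.gz",
    "sample_B_1.fastq.gz", "sample_B_2.fastq.gz",
    "sample_C_1.fastq.gz", "sample_C_2.fastq.gz",
]
pairs = group_read_pairs(files)
for sample, (mate1, mate2) in pairs.items():
    print(sample, mate1, mate2)
```

Each of the three samples ends up with exactly two files, which is the situation the glob pattern above is meant to describe.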
--genome
Expected file extension: .fa
Default: ./tutorial/data/genome/genome.fa
Example:
$ nextflow run cbcrg/lncRNA-Annotation-nf --genome /home/genomes/Sscrofa_102.fa
--annotation
Default: ./tutorial/annotation/annotation.gtf
Example:
$ nextflow run cbcrg/lncRNA-Annotation-nf --annotation '/home/annotation/Sscrofa_102.gtf'
--overlap
Example:
$ nextflow run cbcrg/lncRNA-Annotation-nf --overlap 100
--output
Default: ./results
Example:
$ nextflow run cbcrg/lncRNA-Annotation-nf --output /home/user/my_results
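The parameters above are shown one at a time, but they can presumably be combined in a single invocation, as is usual for Nextflow pipelines. The Python sketch below assembles such a command line from the example values used in this section and quotes each argument for the shell. `build_command` is a hypothetical helper, not part of the pipeline; the point it illustrates is that the read pattern must stay quoted so that the shell passes the glob to Nextflow unexpanded.

```python
import shlex


def build_command(reads, genome, annotation, overlap, output):
    """Return a shell-safe 'nextflow run' command line (illustrative helper)."""
    args = [
        "nextflow", "run", "cbcrg/lncRNA-Annotation-nf",
        "--reads", reads,
        "--genome", genome,
        "--annotation", annotation,
        "--overlap", str(overlap),
        "--output", output,
    ]
    # shlex.join quotes any argument containing shell metacharacters,
    # such as the * and {1,2} in the read pattern.
    return shlex.join(args)


cmd = build_command(
    reads="/home/dataset/sample_*_{1,2}.fastq.gz",
    genome="/home/genomes/Sscrofa_102.fa",
    annotation="/home/annotation/Sscrofa_102.gtf",
    overlap=100,
    output="/home/user/my_results",
)
print(cmd)
```

Only the read pattern is wrapped in single quotes in the printed command, because it is the only argument containing characters the shell would otherwise interpret.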
lncRNA-Annotation-NF execution relies on the Nextflow framework which provides an abstraction between the pipeline functional logic and the underlying processing system.
Thus it is possible to execute it on your computer or any cluster resource manager without modifying it.
Currently the following platforms are supported:
By default the pipeline is parallelized by spawning multiple threads on the machine where the script is launched.
To submit the execution to an SGE cluster, create a file named nextflow.config in the directory
where the pipeline is going to be launched, with the following content:
process {
executor='sge'
queue='<your queue name>'
}
With this configuration, tasks are executed through the SGE qsub command, so the pipeline behaves like any
other SGE job script, with the benefit that Nextflow automatically and transparently manages task
synchronisation, file staging and un-staging, etc.
Alternatively the same declaration can be defined in the file $HOME/.nextflow/config.
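If the configuration file has to be created for several launch directories or queues, it can be generated rather than written by hand. The Python sketch below writes the same process block as shown above into a nextflow.config file in a given directory. `write_sge_config` is a hypothetical helper, and the queue name `long.q` is made up for the example; replace it with a queue that exists on your cluster.

```python
import tempfile
from pathlib import Path


def write_sge_config(launch_dir, queue):
    """Write a nextflow.config selecting the SGE executor (illustrative helper)."""
    content = (
        "process {\n"
        "    executor='sge'\n"
        f"    queue='{queue}'\n"
        "}\n"
    )
    path = Path(launch_dir) / "nextflow.config"
    path.write_text(content)
    return path


# A temporary directory stands in for the real launch directory here.
with tempfile.TemporaryDirectory() as launch_dir:
    config_path = write_sge_config(launch_dir, "long.q")  # made-up queue name
    config_text = config_path.read_text()
    print(config_text)
```

Writing the same content to $HOME/.nextflow/config instead would apply it to every launch directory, as described above.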
To learn more about the available settings and the configuration file, read the Nextflow documentation.