Name: PvKey
Owner: LPM and Collaborators
Description: Somatic variants calling
Created: 2014-03-03 14:13:18.0
Updated: 2016-03-30 20:55:59.0
Pushed: 2014-09-21 11:32:06.0
Size: 9740
Language: Python
PvKey is a pipeline that works with matched tumor-normal samples. It calls somatic variants using MuTect and structural variants using SVDetect. It can handle genome, exome and targeted (TruSeq Custom Amplicon) samples. It is built on the Cosmos workflow management system. Components include:
PvKey is configured in wga_settings.py, which points to the paths of the GATK bundle, the reference genome, and the binaries.
Note: on Orchestra the files are placed in the right order, and the WGA directory is currently available under /groups/cbi/02.Public.data/WGA/. It will be moved to /groups/lpm/WGA.
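
A misconfigured path is easiest to catch before the pipeline starts. The following is a minimal sketch of such a check, not code from PvKey: the setting names (``GATK_BUNDLE``, ``REFERENCE_GENOME``, ``BINARIES``) and the sub-directory names are hypothetical and should be replaced by whatever wga_settings.py actually defines.

.. code-block:: python

    import os

    # Hypothetical names and layout; the real wga_settings.py may differ.
    WGA_ROOT = "/groups/cbi/02.Public.data/WGA/"
    SETTINGS = {
        "GATK_BUNDLE": os.path.join(WGA_ROOT, "bundle"),
        "REFERENCE_GENOME": os.path.join(WGA_ROOT, "reference"),
        "BINARIES": os.path.join(WGA_ROOT, "tools"),
    }


    def missing_paths(settings):
        """Return the names of settings whose path does not exist on disk."""
        return sorted(name for name, path in settings.items()
                      if not os.path.exists(path))


    if __name__ == "__main__":
        for name in missing_paths(SETTINGS):
            print("missing: %s -> %s" % (name, SETTINGS[name]))

An empty result means every configured location was found.
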
Inside the PvKey directory, execute::

    cli -h

.. code-block:: json

    [
        {
            "chunk": 1,
            "library": "LIB-1216301779A",
            "platform": "ILLUMINA",
            "platform_unit": "C0MR3ACXX.001",
            "sample_name": "BC18-06-2013_LyT_S5_L001",
            "rgid": "BC18-06-2013",
            "pair": 0,
            "path": "/path/to/fastq",
            "sample_type": "tumor"
        }
    ]

Here ``pair`` is 0 or 1 and ``sample_type`` is either ``tumor`` or ``normal``. The list continues with further objects of the same form.

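
An input list of the form shown above can be checked before it is handed to the pipeline. The sketch below is illustrative only and is not part of PvKey. It assumes that the keys listed in ``REQUIRED_KEYS`` are mandatory and that the two mates of a paired-end run share the same ``rgid`` and ``platform_unit``; neither assumption is stated in this README.

.. code-block:: python

    import json
    from collections import defaultdict

    # Keys taken from the listing above; treating them as mandatory is an
    # assumption of this sketch.
    REQUIRED_KEYS = {"chunk", "library", "platform", "platform_unit",
                     "sample_name", "rgid", "pair", "path"}


    def check_entries(entries):
        """Return a list of human-readable problems found in the entries."""
        problems = []
        mates = defaultdict(set)
        for index, entry in enumerate(entries):
            missing = REQUIRED_KEYS - set(entry)
            if missing:
                problems.append("entry %d lacks %s" % (index, sorted(missing)))
                continue
            if entry["pair"] not in (0, 1):
                problems.append("entry %d has pair=%r, expected 0 or 1"
                                % (index, entry["pair"]))
                continue
            # Assumption: both mates carry the same rgid and platform_unit.
            mates[(entry["rgid"], entry["platform_unit"])].add(entry["pair"])
        for key, seen in sorted(mates.items()):
            if seen != {0, 1}:
                problems.append("read group %s/%s has only pair(s) %s"
                                % (key[0], key[1], sorted(seen)))
        return problems


    def check_file(path):
        """Load a JSON input file and run check_entries on its contents."""
        with open(path) as handle:
            return check_entries(json.load(handle))

``check_file("input.json")`` returns an empty list when no problem is found. The file name is a placeholder.
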
Note: If you are working with targeted resequencing data generated with the TruSeq Custom Amplicon assay, add -target True. Duplicate marking will then be skipped, because all the reads are duplicates.
Note: It requires the boto plugin.
This Python script interacts with BaseSpace, Illumina's repository of NGS data, to download all the sequenced samples within a project. To make it work you have to import BaseSpacePy: https://github.com/basespace/basespace-python-sdk.git
BaseSpacePy is a Python-based SDK to be used in the development of apps and scripts for working with Illumina's BaseSpace cloud-computing solution for next-gen sequencing data analysis. The primary purpose of the SDK is to provide an easy-to-use Python environment enabling developers to authenticate a user, retrieve data, and upload data/results from their own analysis to BaseSpace.
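
The download step described above can be sketched roughly as follows. This is not the script shipped with PvKey. The function name ``download_project_files`` is hypothetical, and the BaseSpacePy calls it relies on (``getProjectById``, ``getSamples``, ``getFiles``, ``downloadFile`` and the ``Name`` attribute of a file) are written as they appear in the SDK's sample scripts to the best of my knowledge, so they should be checked against the installed SDK version. Paging through long result lists is not handled.

.. code-block:: python

    import os


    def download_project_files(api, project_id, out_dir):
        """Download every file of every sample in one BaseSpace project.

        ``api`` is expected to behave like a BaseSpacePy BaseSpaceAPI object.
        Returns the names of the files that were requested.
        """
        if not os.path.isdir(out_dir):
            os.makedirs(out_dir)
        project = api.getProjectById(project_id)
        downloaded = []
        for sample in project.getSamples(api):
            for remote_file in sample.getFiles(api):
                # Assumed signature: downloadFile(api, localDir)
                remote_file.downloadFile(api, out_dir)
                downloaded.append(remote_file.Name)
        return downloaded


    # Possible usage, assuming credentials are available to the SDK:
    #
    #   from BaseSpacePy.api.BaseSpaceAPI import BaseSpaceAPI
    #   api = BaseSpaceAPI()
    #   download_project_files(api, "<project id>", "fastq")

Passing the API object in as an argument keeps the function free of credential handling and lets it be exercised with a stand-in object.
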