Name: CouGaR
Owner: Computational Biology Lab at the University of Toronto
Description: Mini Chromosome
Forked from: misko/minichr
Created: 2015-05-15 18:38:35.0
Updated: 2017-01-12 22:16:48.0
Pushed: 2016-05-13 23:34:10.0
Homepage: null
Size: 211039
Language: C++
CouGaR is a tool that searches for complex genomic rearrangements in matched normal/tumour next-generation sequencing data. The final output files created by CouGaR can be visualized using CouGaR-viz.
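The CouGaR pipeline consists of 5 main stages: BAM pre-processing, HMM pruning, flow problem formulation and solving, IP problem formulation and solving, and visualization.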
A quick example is as follows:

```
bash 02_cluster_mapability.sh `pwd`/TEST_TCGA_5055
```
BAM files can be extremely large, and they are no longer needed once discordant clusters and coverage have been computed, so they are pre-processed in this first stage. This is how we were able to run CouGaR on so many TCGA samples with very limited storage space (~6 TB). You can pre-process the BAM files once and then re-run the downstream analysis without needing them again, unless you change the way clusters or coverage are computed.
Two scripts have been provided to pre-process BAM files.
The first of these grabs the tumor and normal BAM files for a specified TCGA sample (this requires a valid TCGA access key) and pre-processes them. The second script performs the same pre-processing on local BAM files. In the local case you will need to specify which reference genome (hg18 or hg19) is used and assign a group label to the sample.
HMM pruning
-------
In this stage CouGaR makes a first pass over the genome with an HMM to estimate regions of normal copy count, which are then removed from further analysis. The HMM's transition probabilities are informed by discordant clusters found in the tumor sample only (clusters from the normal sample have been removed).
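
The idea of breakpoint-informed transitions can be illustrated with a minimal sketch. It is not CouGaR's implementation: it assumes a two-state HMM (normal vs. altered copy count), Gaussian emissions over binned tumor/normal coverage ratios, and a switch probability that is raised in bins containing a tumor-only discordant cluster. The function name, parameters, and all numeric values are hypothetical.

```python
import math

def viterbi_copy_states(ratios, breakpoint_bins, means=(1.0, 1.5), sd=0.2,
                        p_switch=1e-3, p_switch_at_breakpoint=0.5):
    """Label each bin 0 (normal copy count) or 1 (altered) with Viterbi.

    ratios          -- non-empty list of per-bin coverage ratios (assumed input)
    breakpoint_bins -- set of bin indices holding a tumor-only discordant cluster
    """
    def log_emit(x, state):
        # Gaussian log-density up to a constant shared by both states.
        z = (x - means[state]) / sd
        return -0.5 * z * z

    score = [math.log(0.5) + log_emit(ratios[0], s) for s in (0, 1)]
    back = []  # back[k][s] = best previous state for state s at bin k + 1
    for i in range(1, len(ratios)):
        # Switching states is cheap only where a discordant cluster supports it.
        p = p_switch_at_breakpoint if i in breakpoint_bins else p_switch
        log_stay, log_move = math.log(1.0 - p), math.log(p)
        new_score, pointers = [], []
        for s in (0, 1):
            stay = score[s] + log_stay
            move = score[1 - s] + log_move
            pointers.append(s if stay >= move else 1 - s)
            new_score.append(max(stay, move) + log_emit(ratios[i], s))
        score = new_score
        back.append(pointers)

    # Trace the best path back from the last bin.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

In this toy model, a run of elevated bins is labelled as altered only when discordant clusters flank it; bins labelled 0 would be the ones dropped from further analysis.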
Flow problem formulation and solving
-------
Once the HMM pass is complete, a flow network is created and solved by cs2. This solution provides the base contigs for the final IP pass.
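
As a rough illustration of what a solver such as cs2 consumes, the sketch below serializes a small min-cost flow problem in the DIMACS text format (`p min`, `n`, and `a` lines). The helper name and the toy network are hypothetical; how CouGaR actually maps genomic segments and discordant clusters to nodes, arcs, capacities, and costs is not described here.

```python
def to_dimacs_min_cost_flow(num_nodes, supplies, arcs):
    """Return a DIMACS min-cost flow problem as text.

    supplies -- dict node_id -> supply (positive) or demand (negative)
    arcs     -- list of (src, dst, lower_bound, capacity, cost) tuples
    Node ids are 1-based, as the DIMACS format expects.
    """
    lines = [
        "c toy network for illustration only",
        f"p min {num_nodes} {len(arcs)}",
    ]
    # Node descriptor lines are only needed for nodes with non-zero supply.
    for node, supply in sorted(supplies.items()):
        if supply != 0:
            lines.append(f"n {node} {supply}")
    # One arc descriptor line per arc.
    for src, dst, low, cap, cost in arcs:
        lines.append(f"a {src} {dst} {low} {cap} {cost}")
    return "\n".join(lines) + "\n"


problem = to_dimacs_min_cost_flow(
    3,
    {1: 2, 3: -2},
    [(1, 2, 0, 2, 1), (2, 3, 0, 2, 1), (1, 3, 0, 1, 5)],
)
```

The resulting text could be written to a file and passed to the solver; the flow assigned to each arc would then be interpreted downstream as contig support.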
IP problem formulation and solving
-------
Using the contigs identified in the previous step, CouGaR computes a somewhat minimal subset needed to adequately explain the observed coverage data.
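
The selection of a small explaining subset can be sketched as follows. This is an exhaustive search standing in for an integer program, not CouGaR's formulation: it assumes each candidate contig contributes a fixed number of copies to each segment, that each contig is used at most once, and that "adequately explain" means the total absolute error stays within a tolerance. All names are hypothetical.

```python
from itertools import combinations

def smallest_explaining_subset(contigs, observed, tolerance=0):
    """Return the smallest set of contig names whose summed per-segment
    copy contributions match `observed` within `tolerance`, or None.

    contigs  -- dict name -> list of per-segment copy contributions
    observed -- list of observed per-segment copy counts
    """
    names = sorted(contigs)
    # Try subsets in order of increasing size so the first hit is minimal.
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            predicted = [
                sum(contigs[name][j] for name in subset)
                for j in range(len(observed))
            ]
            error = sum(abs(p - o) for p, o in zip(predicted, observed))
            if error <= tolerance:
                return list(subset)
    return None
```

Exhaustive search is exponential in the number of contigs, which is why a real pipeline would hand this kind of problem to an IP solver instead.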
Visualization
-------
Open one of the graph output files with [CouGaR-viz](https://github.com/compbio-UofT/CouGaR-viz).