compbio-UofT/CouGaR-viz

Name: CouGaR-viz

Owner: Computational Biology Lab at the University of Toronto

Description: null

Forked from: misko/CouGaR-viz

Created: 2015-05-15 19:38:25.0

Updated: 2018-03-22 23:19:17.0

Pushed: 2016-05-18 07:06:54.0

Homepage: null

Size: 2541

Language: Racket

README

CouGaR-viz (0.1)

This is a visualization package for laying out complex genomic rearrangements, focusing specifically on those occurring in amplified regions. Special thanks to Gary Baumgartner (@gfb) for helping get it started in Racket.

Running

./gracket-text cougar-viz.rkt

Input
Gene annotations file

Please see sample files in “gene_annotations”

Genomic regions file

The genomic regions file consists of two lines: the first describes germ-line regions (edges) and the second describes somatic linkings found in the tumour. Each edge is described by two coordinates (from, to) and four additional values (edge type [0-4], copy count, germ-line coverage, somatic coverage).
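To make the six per-edge fields concrete, here is a minimal Python sketch of a record type that holds them. It is not part of CouGaR-viz (which is written in Racket), and the README does not specify how fields are delimited on each line, so the sketch starts from values that have already been split. The names `Edge` and `edge_from_fields` are hypothetical, coordinates are kept as opaque strings, and treating the copy count and coverages as floats is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """One edge, with the six fields in the order the README lists them."""
    start: str                # "from" coordinate (encoding not specified; kept as text)
    end: str                  # "to" coordinate
    edge_type: int            # documented range is 0-4
    copy_count: float         # assumption: numeric, possibly fractional
    germline_coverage: float
    somatic_coverage: float

    def __post_init__(self):
        # The README gives the edge type range as [0-4]; reject anything else.
        if not 0 <= self.edge_type <= 4:
            raise ValueError(f"edge type must be in 0..4, got {self.edge_type}")


def edge_from_fields(fields):
    """Build an Edge from six already-split values (hypothetical helper)."""
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields per edge, got {len(fields)}")
    start, end, edge_type, copy_count, germline, somatic = fields
    return Edge(str(start), str(end), int(edge_type),
                float(copy_count), float(germline), float(somatic))
```

Validating the edge type at construction time means a malformed record fails early rather than producing a misleading drawing.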

As an example input the following can be used,

Output image

The output is an SVG image depicting the genomic regions and somatic linkings as a graph. The width of each edge is proportional to the log of its copy count in the input file. Genes are annotated on genomic intervals and coloured by strand: positive-strand genes are greenish and negative-strand genes are purple.

Red lines represent parts of the reference genome, and the thickness of a red line is proportional to the log of its predicted copy count. Blue lines represent tumor adjacencies, and the thickness of a blue line is proportional to the log of the predicted copy count of that adjacency in the tumor genome. The copy count of a tumor adjacency is the number of times the two regions appear adjacent in the tumor genome.
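The relationship between copy count and line thickness can be sketched as follows. This is an illustration only, not the code CouGaR-viz uses: the logarithm base, the `scale` factor, and the `min_width` floor (so that a copy count of 1, whose log is 0, still draws a visible line) are all assumptions, and `edge_width` is a hypothetical name.

```python
import math


def edge_width(copy_count, scale=2.0, min_width=1.0):
    """Return a stroke width that grows with the log of the copy count.

    Assumptions (not from the README): base-2 logarithm, a linear scale
    factor, and a minimum width so low copy counts remain visible.
    """
    if copy_count <= 0:
        # log is undefined here; fall back to the minimum width.
        return min_width
    return max(min_width, scale * math.log2(copy_count))
```

With these defaults, doubling the copy count adds a fixed amount to the width, which keeps highly amplified regions from dominating the picture.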

The direction of the arrows on an adjacency represents its strand connectivity:
>--> there is no change of strand through the adjacency
>--< the adjacency can only be traversed from the left breakpoint on the positive strand or the right breakpoint on the negative strand
<--> the adjacency can only be traversed from the left breakpoint on the negative strand or the right breakpoint on the positive strand
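The three arrow patterns above can be read as a small lookup table. The Python sketch below encodes only what the legend states; `ARROWS` and `can_enter` are hypothetical names. Marking the second and third patterns as strand-changing is an inference from the contrast with the first pattern, and the legend gives no entry strands for `>-->`, so none are encoded for it.

```python
# Legend from the README, keyed by arrow pattern.
# "entry" maps a breakpoint side to the strand from which the adjacency
# can be traversed; None means the README does not list one.
ARROWS = {
    ">-->": {"strand_change": False, "entry": None},
    # strand_change=True below is inferred by contrast with ">-->".
    ">--<": {"strand_change": True, "entry": {"left": "+", "right": "-"}},
    "<-->": {"strand_change": True, "entry": {"left": "-", "right": "+"}},
}


def can_enter(pattern, side, strand):
    """Check whether an adjacency can be traversed from `side` on `strand`.

    `side` is "left" or "right"; `strand` is "+" or "-".
    """
    entry = ARROWS[pattern]["entry"]
    if entry is None:
        raise ValueError(f"no entry strands are documented for {pattern!r}")
    return entry[side] == strand
```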

The numbers on the red line give the length of the genomic interval. The number above the red line is the predicted copy count of the region in the tumor genome. The images are not drawn to genomic scale, because it was impossible to keep genomic scale and still represent these graphs in a reasonable format.

Genomic intervals are laid out in order of appearance in the genome (i.e., chr5:5,000 before chr5:7,000, and chr2 before chr3).
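One way to obtain this ordering is to sort by chromosome and then by position. The Python sketch below is an assumption about how that could be done, not the package's own routine: intervals are modelled as hypothetical `(chromosome, start)` tuples, numbered chromosomes are compared numerically (so chr2 precedes chr10), and named chromosomes such as chrX are placed after the numbered ones.

```python
def chrom_key(chrom):
    """Sort key for a chromosome name such as "chr5" or "chrX"."""
    name = chrom[3:] if chrom.startswith("chr") else chrom
    if name.isdigit():
        # Numbered chromosomes first, in numeric order.
        return (0, int(name), "")
    # Assumption: named chromosomes (X, Y, ...) follow, alphabetically.
    return (1, 0, name)


def layout_order(intervals):
    """Order (chromosome, start) tuples by chromosome, then by position."""
    return sorted(intervals, key=lambda iv: (chrom_key(iv[0]), iv[1]))
```

For example, `layout_order([("chr5", 7000), ("chr3", 100), ("chr5", 5000)])` places the chr3 interval first, followed by the two chr5 intervals in ascending position.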


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.