hackseq/2016_project_8

Name: 2016_project_8

Owner: hackseq

Description: Explore the use of 10x Genomics' Linked-Reads to unlock currently inaccessible parts of the genome

Created: 2016-08-31 22:57:36.0

Updated: 2017-08-04 12:17:42.0

Pushed: 2016-10-18 00:20:37.0

Homepage: null

Size: 251

Language: Python

README

HackSeq 2016 - Project Team 8

Project Team 8 for HackSeq 2016 in Vancouver BC, Canada. The repository contains two projects.

Project Y: Somatic Mutation from Separated Haplotypes (SMUSH) (/somatic/)

Calling somatic mutations from tumor tissue alone is challenging, not only because there is no matched control to help filter out germline variants, but also because it is difficult to distinguish low-frequency somatic mutations from sequencing noise and errors. In this study, we investigate whether phasing information from reads can help differentiate somatic variants from germline alterations and sequencing errors.
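To make the intuition concrete, here is a minimal sketch of how phased allele counts might separate the three cases. It is not the method implemented in this repository. It assumes that a germline heterozygous variant appears on nearly all reads of one haplotype, that a somatic variant appears on only a fraction of the reads of one haplotype, and that sequencing errors are weakly supported or scattered across both haplotypes. The function name classify_site and the thresholds min_alt and germline_frac are hypothetical.

```python
def classify_site(h1_alt, h1_ref, h2_alt, h2_ref, min_alt=3, germline_frac=0.9):
    """Toy classifier for one variant site from phased read counts.

    The thresholds are illustrative assumptions, not values from the project.
    """
    n1 = h1_alt + h1_ref
    n2 = h2_alt + h2_ref
    if n1 == 0 or n2 == 0:
        # Without reads on both haplotypes the comparison is not possible.
        return "uninformative"

    f1 = h1_alt / n1
    f2 = h2_alt / n2
    # Orient the site so that "hi" is the haplotype with the larger alt fraction.
    if f1 >= f2:
        hi_alt, hi_frac, lo_alt, lo_frac = h1_alt, f1, h2_alt, f2
    else:
        hi_alt, hi_frac, lo_alt, lo_frac = h2_alt, f2, h1_alt, f1

    if hi_alt < min_alt:
        # Too few supporting reads to trust the alternate allele.
        return "likely_error"
    if lo_alt > 0:
        # Alt reads on both haplotypes: homozygous germline if both are
        # nearly pure, otherwise treated here as noise.
        if lo_frac >= germline_frac and hi_frac >= germline_frac:
            return "germline_hom"
        return "likely_error"
    # Alt reads confined to a single haplotype.
    if hi_frac >= germline_frac:
        return "germline_het"
    return "candidate_somatic"


print(classify_site(20, 0, 0, 20))  # alt on every read of haplotype 1
print(classify_site(5, 15, 0, 20))  # alt on a subset of haplotype 1 reads
print(classify_site(2, 18, 2, 18))  # weak alt support on both haplotypes
```

A real caller would replace the hard thresholds with a statistical model of read depth and error rate; the sketch only shows why haplotype-resolved counts carry more information than pooled counts.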

Code

The code in this repository depends on 10x Genomics' longranger toolset, so download and install longranger first. It also depends on Linked-Reads data from 10x Genomics.

count/count.py

Get the alt/ref read counts (for haplotype 1, haplotype 2, and unphased reads) at each variant position (chrom, pos) in a VCF file.

Usage: count.py [--bed=<bed>] <ref_path> <vcf_path> <bam_path> <output_csv_path>

Output: writes a CSV file to disk (at output_csv_path) with the columns alt, chrom, filter, h1_alt, h1_ref, h2_alt, h2_ref, in_bed, pos, ref, un_alt, un_ref
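As an illustration of how the output might be consumed, the sketch below reads a CSV with the columns listed above and computes the alternate-allele fraction per haplotype. The helper names read_counts and alt_fraction are hypothetical, and the sample row is made up purely for demonstration; the values the real script writes for filter and in_bed are assumptions here.

```python
import csv
import io

# Count columns from the count.py output description.
COUNT_FIELDS = ["h1_alt", "h1_ref", "h2_alt", "h2_ref", "un_alt", "un_ref"]


def read_counts(handle):
    """Parse count.py-style CSV rows, converting count columns to int."""
    rows = []
    for record in csv.DictReader(handle):
        for field in COUNT_FIELDS:
            # Go through float so that values such as "5.0" also parse.
            record[field] = int(float(record[field]))
        record["pos"] = int(record["pos"])
        rows.append(record)
    return rows


def alt_fraction(alt, ref):
    """Fraction of reads carrying the alt allele, or None if no reads."""
    total = alt + ref
    return alt / total if total else None


# Made-up example data; in practice, open the file at output_csv_path instead.
SAMPLE = (
    "alt,chrom,filter,h1_alt,h1_ref,h2_alt,h2_ref,in_bed,pos,ref,un_alt,un_ref\n"
    "T,chr1,PASS,5,15,0,20,True,12345,C,1,3\n"
)

rows = read_counts(io.StringIO(SAMPLE))
for row in rows:
    print(
        row["chrom"],
        row["pos"],
        alt_fraction(row["h1_alt"], row["h1_ref"]),
        alt_fraction(row["h2_alt"], row["h2_ref"]),
        alt_fraction(row["un_alt"], row["un_ref"]),
    )
```

Only the standard library is used here, so the example runs without longranger or any Linked-Reads data.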

somatic_probability.py

Run the somatic test on phased allele count data.

Usage: somatic_test <count_file> <result_file>

Project X: Metagenome

Code
