hammerlab/cohorts

Name: cohorts

Owner: Hammer Lab

Description: Utilities for analyzing mutations and neoepitopes in patient cohorts

Created: 2016-03-14 15:52:59.0

Updated: 2017-11-02 02:56:44.0

Pushed: 2017-12-30 21:03:49.0

Homepage: null

Size: 560

Language: Python

README

Cohorts

Cohorts is a library for analyzing and plotting clinical data, mutations and neoepitopes in patient cohorts.

It calls out to external libraries like topiary and caches the results for easy manipulation.
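The caching idea can be illustrated with a small stand-alone sketch. This is not how Cohorts implements its cache: the helper `cached_call`, the stand-in `slow_prediction`, and the one-JSON-file-per-patient layout are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def cached_call(cache_dir, patient_id, name, compute):
    """Return a cached result if present; otherwise compute and store it.

    Hypothetical helper: results live at <cache_dir>/<patient_id>/<name>.json.
    """
    path = Path(cache_dir) / patient_id / f"{name}.json"
    if path.exists():
        return json.loads(path.read_text())
    result = compute()
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(result))
    return result


calls = []  # records how often the expensive function actually runs


def slow_prediction():
    """Stand-in for an expensive external call (made-up result)."""
    calls.append(1)
    return {"neoantigen_count": 3}


with tempfile.TemporaryDirectory() as cache_dir:
    first = cached_call(cache_dir, "patient_1", "neoantigens", slow_prediction)
    second = cached_call(cache_dir, "patient_1", "neoantigens", slow_prediction)
```

The second call reads the stored file instead of recomputing, so `slow_prediction` runs only once.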

Cohorts requires Python 3 (3.3+). We are no longer maintaining compatibility with Python 2. For context, see this Python 3 statement.

Installation

You can install Cohorts using pip:

    pip install cohorts

Features

In addition, several other libraries make use of cohorts:

Quick Start

One way to get started using Cohorts is to use it to analyze TCGA data.

As an example, we can create a cohort using query_tcga:

    from query_tcga import cohort, config

    # provide authentication token
    config.load_config('config.ini')

    # load patient data
    blca_patients = cohort.prep_patients(project_name='TCGA-BLCA',
                                         project_data_dir='data')

    # create cohort
    blca_cohort = cohort.prep_cohort(patients=blca_patients,
                                     cache_dir='data-cache')

Then, use plot_survival() to summarize a potential biomarker (e.g. snv_count) by survival:

    from cohorts.functions import snv_count
    blca_cohort.plot_survival(snv_count, how='os', threshold='median')

This should produce a summary of results, including this plot:

Survival plot example
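To make the `threshold='median'` idea concrete, here is a minimal sketch of splitting patients into two groups around the median of a biomarker. The patient IDs and counts are made up, and the choice to treat values equal to the median as "low" is an assumption; Cohorts may handle ties differently.

```python
from statistics import median

# Hypothetical per-patient SNV counts (made-up values for illustration).
snv_counts = {"p1": 120, "p2": 45, "p3": 300, "p4": 80, "p5": 210}

cutoff = median(snv_counts.values())

# Assumption: only values strictly above the cutoff count as "high".
high = sorted(p for p, n in snv_counts.items() if n > cutoff)
low = sorted(p for p, n in snv_counts.items() if n <= cutoff)
```

Each group would then get its own survival curve in a plot like the one above.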

We could alternatively use plot_benefit() to summarize the biomarker by clinical benefit (here, OS > 12 months) instead of by survival:

    blca_cohort.plot_benefit(snv_count)

Benefit plot example

See the full example in the quick-start notebook.
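As a rough illustration of comparing a biomarker between patients with and without benefit, the sketch below groups made-up values by a boolean benefit flag and reports each group's median. The helper `median_by_benefit` and the data are hypothetical, and this is not a description of what plot_benefit() computes internally.

```python
from statistics import median

# Hypothetical (snv_count, benefit) pairs; all values are made up.
patients = [
    (120, True),
    (45, False),
    (300, True),
    (80, False),
    (210, True),
    (60, False),
]


def median_by_benefit(rows):
    """Group biomarker values by benefit flag and return each group's median."""
    groups = {True: [], False: []}
    for value, benefit in rows:
        groups[benefit].append(value)
    # Skip empty groups, since median() raises on an empty sequence.
    return {flag: median(values) for flag, values in groups.items() if values}


summary = median_by_benefit(patients)
```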

Building from Scratch

Alternatively, build a cohort directly from Patient objects:

    from cohorts import Cohort, Patient

    patient_1 = Patient(
        id="patient_1",
        os=70,
        pfs=24,
        deceased=True,
        progressed=True,
        benefit=False
    )

    patient_2 = Patient(
        id="patient_2",
        os=100,
        pfs=50,
        deceased=False,
        progressed=True,
        benefit=False
    )

    cohort = Cohort(
        patients=[patient_1, patient_2],
        cache_dir="/where/cohorts/results/get/saved"
    )

    cohort.plot_survival(on="os")
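
Survival plots of this kind are typically Kaplan-Meier curves. Assuming that is what is drawn here, the following stand-alone sketch shows how the `os` and `deceased` fields feed such an estimate. The function `kaplan_meier` is written for illustration and is not part of Cohorts.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time with an observed event.

    times:  follow-up time per patient (same unit for all patients)
    events: True if death was observed, False if the patient was censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve


# os and deceased for the two patients in the example above.
curve = kaplan_meier(times=[70, 100], events=[True, False])
```

With one death among two patients at risk at time 70, the estimate drops to 0.5. The second patient is censored at time 100, so the curve does not drop again.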

A Patient can also reference its tumor sample and variant call (VCF) files:

    from cohorts import Cohort, Patient, Sample

    sample_1_tumor = Sample(
        is_tumor=True,
        bam_path_dna="/path/to/dna/bam",
        bam_path_rna="/path/to/rna/bam"
    )

    patient_1 = Patient(
        id="patient_1",
        ...
        snv_vcf_paths=["/where/my/mutect/vcfs/live",
                       "/where/my/strelka/vcfs/live"],
        indel_vcfs_paths=[...],
        tumor_sample=sample_1_tumor,
        ...
    )

    cohort = Cohort(
        ...
        patients=[patient_1]
    )

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.