Name: cohorts
Owner: Hammer Lab
Description: Utilities for analyzing mutations and neoepitopes in patient cohorts
Created: 2016-03-14 15:52:59.0
Updated: 2017-11-02 02:56:44.0
Pushed: 2017-12-30 21:03:49.0
Homepage: null
Size: 560
Language: Python
Cohorts is a library for analyzing and plotting clinical data, mutations and neoepitopes in patient cohorts.
It calls out to external libraries like topiary and caches the results for easy manipulation.
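As a rough illustration of what caching results on disk can look like, here is a minimal sketch. It is not the Cohorts implementation: `cached_call`, `key`, and `compute` are hypothetical names, and the pickle-per-key layout is an assumption made for the example.

```python
import os
import pickle


def cached_call(cache_dir, key, compute):
    """Return the cached result for `key`, computing and saving it on a miss.

    Hypothetical helper for illustration; not part of the Cohorts API.
    """
    cache_path = os.path.join(cache_dir, "{}.pkl".format(key))
    if os.path.exists(cache_path):
        # Cache hit: load the stored result instead of recomputing it.
        with open(cache_path, "rb") as handle:
            return pickle.load(handle)
    # Cache miss: run the (possibly slow) computation and store its result.
    result = compute()
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_path, "wb") as handle:
        pickle.dump(result, handle)
    return result
```

A second call with the same `cache_dir` and `key` reads the pickle back and skips `compute` entirely, which is the property that makes repeated analysis cheap.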
Cohorts requires Python 3 (3.3+). We are no longer maintaining compatibility with Python 2. For context, see the Python 3 Statement.
You can install Cohorts using pip:
    pip install cohorts
Features:
A Cohort consisting of Patients with Samples.
Use of varcode and topiary to generate and cache variant effects and predicted neoantigens.
Summary functions such as missense_snv_count, neoantigen_count, and expressed_neoantigen_count; or create your own functions.
Plots: survival curves (lifelines), response/no response plots (with Mann-Whitney and Fisher's Exact results), and ROC curves. Example: cohort.plot_survival(on=missense_snv_count, how="pfs").
Joined dataframes. Example: cohort.as_dataframe(join_with=["tcr", "pdl1"]).
In addition, several other libraries make use of cohorts.
One way to get started using Cohorts is to use it to analyze TCGA data.
As an example, we can create a cohort using query_tcga:
    from query_tcga import cohort, config

    # Provide authentication token
    config.load_config('config.ini')

    # Load patient data
    blca_patients = cohort.prep_patients(project_name='TCGA-BLCA',
                                         project_data_dir='data')

    # Create cohort
    blca_cohort = cohort.prep_cohort(patients=blca_patients,
                                     cache_dir='data-cache')
Then, use plot_survival() to summarize a potential biomarker (e.g. snv_count) by survival:

    from cohorts.functions import snv_count
    blca_cohort.plot_survival(snv_count, how='os', threshold='median')
This should produce a summary of results, including a survival plot.
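To make the threshold='median' idea concrete, the sketch below splits patients into two groups around the median of a biomarker. It is an illustration only: the SNV counts are made up, `split_by_median` is a hypothetical helper, and placing ties in the lower group is an assumption rather than documented Cohorts behavior.

```python
from statistics import median


def split_by_median(values_by_patient):
    """Split patient ids into those above and those at or below the median.

    Hypothetical helper for illustration; not part of the Cohorts API.
    """
    cutoff = median(values_by_patient.values())
    above = sorted(pid for pid, v in values_by_patient.items() if v > cutoff)
    at_or_below = sorted(pid for pid, v in values_by_patient.items() if v <= cutoff)
    return cutoff, above, at_or_below


# Made-up SNV counts, for illustration only.
counts = {"p1": 120, "p2": 45, "p3": 300, "p4": 80}
cutoff, high, low = split_by_median(counts)
print(cutoff, high, low)  # 100.0 ['p1', 'p3'] ['p2', 'p4']
```

The two groups are what a survival comparison would then be drawn for, one curve per group.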
We could alternatively use plot_benefit() to summarize OS>12mo instead of survival:

    blca_cohort.plot_benefit(snv_count)

See the full example in the quick-start notebook.
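The benefit comparison can be pictured as follows: label each patient by whether OS exceeds 12 months, then compare the biomarker between the two groups. This is a sketch under stated assumptions, not how plot_benefit() is implemented: the patient records are made up, `os_months` is a hypothetical field name, and `mann_whitney_u` computes only the U statistic in plain Python, without a p-value.

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs (a, b) with a > b count 1, ties count 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Made-up records, for illustration only.
patients = [
    {"id": "p1", "os_months": 18.0, "snv_count": 300},
    {"id": "p2", "os_months": 6.5, "snv_count": 45},
    {"id": "p3", "os_months": 14.0, "snv_count": 120},
    {"id": "p4", "os_months": 9.0, "snv_count": 80},
]

# Benefit is defined here as OS > 12 months, matching the text above.
benefit = [p["snv_count"] for p in patients if p["os_months"] > 12]
no_benefit = [p["snv_count"] for p in patients if p["os_months"] <= 12]

u = mann_whitney_u(benefit, no_benefit)
print(u)  # 4.0, the maximum for two groups of two: every benefit value is larger
```

In practice a library routine that also returns a p-value would be used; the loop above only shows what the statistic counts.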
A cohort can also be built from scratch out of Patient objects:

    from cohorts import Cohort, Patient

    patient_1 = Patient(
        id="patient_1",
        os=70,
        pfs=24,
        deceased=True,
        progressed=True,
        benefit=False
    )

    patient_2 = Patient(
        id="patient_2",
        os=100,
        pfs=50,
        deceased=False,
        progressed=True,
        benefit=False
    )

    cohort = Cohort(
        patients=[patient_1, patient_2],
        cache_dir="/where/cohorts/results/get/saved"
    )

    cohort.plot_survival(on="os")
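For intuition about what a survival curve over these two patients contains, here is a plain-Python Kaplan-Meier estimate. It is a sketch, not the lifelines or Cohorts code path: `kaplan_meier` is a hypothetical helper, and it reuses the os and deceased values from the example above as durations and event flags.

```python
def kaplan_meier(durations, event_observed):
    """Return [(time, survival_probability)] at each time with an observed event."""
    at_risk = len(durations)
    survival = 1.0
    curve = []
    # Walk through the distinct follow-up times in increasing order.
    for t in sorted(set(durations)):
        deaths = sum(1 for d, e in zip(durations, event_observed) if d == t and e)
        leaving = sum(1 for d in durations if d == t)
        if deaths:
            # Multiply by the fraction of at-risk patients surviving past t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Patients with an event or censored at t leave the risk set.
        at_risk -= leaving
    return curve


# patient_1: os=70, deceased=True; patient_2: os=100, deceased=False (censored).
curve = kaplan_meier([70, 100], [True, False])
print(curve)  # [(70, 0.5)]
```

The censored patient contributes to the risk set at time 70 but never produces a drop in the curve, which is why only one step appears.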
Samples and variant call files can be attached to a Patient in the same way:

    from cohorts import Cohort, Patient, Sample

    sample_1_tumor = Sample(
        is_tumor=True,
        bam_path_dna="/path/to/dna/bam",
        bam_path_rna="/path/to/rna/bam"
    )

    patient_1 = Patient(
        id="patient_1",
        ...
        snv_vcf_paths=["/where/my/mutect/vcfs/live",
                       "/where/my/strelka/vcfs/live"],
        indel_vcfs_paths=[...],
        tumor_sample=sample_1_tumor,
        ...
    )

    cohort = Cohort(
        ...
        patients=[patient_1]
    )