hammerlab/discohorts

Name: discohorts

Owner: Hammer Lab

Description: Generate Cohorts based on Epidisco and/or Biokepi results

Created: 2017-02-24 21:49:38.0

Updated: 2017-03-24 17:04:10.0

Pushed: 2017-09-25 18:27:17.0

Homepage: null

Size: 46

Language: Python

README

discohorts

Generate Cohorts based on Epidisco and/or Biokepi results

Example Usage
from utils.data import load_cohort
from discohorts import Discohort

cohort = load_cohort(min_tumor_mtc=0, min_normal_mtc=0)
# Biokepi work directories on /nfs-pool-2 through /nfs-pool-16.
work_dirs = ["/nfs-pool-{}/biokepi/".format(i) for i in range(2, 17)]
dest_results_dir = "/nfs-pool/biokepi/results"
cohort = Discohort(
    cohort,
    biokepi_work_dirs=work_dirs,
    dest_results_dir=dest_results_dir)

cohort.add_epidisco_pipeline(
    pipeline_name="epidisco_1")

# Customize the normal BAM input.
cohort.add_epidisco_pipeline(
    pipeline_name="epidisco_2",
    config=EpidiscoConfig(cohort,
                          arg_normal_input=lambda patient: patient.normal_sample.bam_path_dna))

# Customize the normal BAM input and the run name.
cohort.add_epidisco_pipeline(
    pipeline_name="epidisco_3",
    run_name=lambda patient: "epidisco_{}".format(patient.id),
    config=EpidiscoConfig(cohort,
                          arg_normal_input=lambda patient: patient.normal_sample.bam_path_dna))

# Customize the normal BAM input, the run name, and which patients to run on.
cohort.add_epidisco_pipeline(
    pipeline_name="epidisco_4",
    run_name=lambda patient: "epidisco_{}".format(patient.id),
    config=EpidiscoConfig(cohort,
                          keep=lambda patient: patient.id == "468",
                          arg_normal_input=lambda patient: patient.normal_sample.bam_path_dna))

# Same as #4, but written differently.
class EpidiscoConfigModified(EpidiscoConfig):
    def keep(self, patient):
        return patient.id == "468"

    def arg_normal_input(self, patient):
        # Use the normal sample's DNA BAM, matching #4.
        return patient.normal_sample.bam_path_dna

modified_config = EpidiscoConfigModified(cohort)
cohort.add_epidisco_pipeline(
    pipeline_name="epidisco_5",
    config=modified_config)

# Customize any CLI argument; for example, picard_java_max_heap_size.
cohort.add_epidisco_pipeline(
    pipeline_name="epidisco_6",
    config=EpidiscoConfig(cohort,
                          arg_picard_java_max_heap_size="20g"))

# This is Discohort's own dry run functionality, FYI. Should have a better name.
cohort.run_pipeline("epidisco_1", dry_run=True)
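
The "epidisco_4" and "epidisco_5" pipelines above are configured in two ways: by passing callables as keyword arguments, and by overriding methods in a subclass. The sketch below shows one way an interface like that can be built. `SketchConfig`, `OnlyPatient468`, `FakePatient`, `selected_inputs`, and the file paths are hypothetical names made up for this illustration. They are not part of discohorts, and the real `EpidiscoConfig` may work differently.

```python
from collections import namedtuple

# Hypothetical stand-in for a cohort patient; not a discohorts type.
FakePatient = namedtuple("FakePatient", ["id", "normal_bam"])


class SketchConfig(object):
    """Hypothetical config with per-patient hooks that callers can override."""

    def __init__(self, **overrides):
        # A keyword override is stored on the instance, where it shadows
        # the default method of the same name defined on the class.
        for name, value in overrides.items():
            if not hasattr(self, name):
                raise TypeError("unknown option: {}".format(name))
            setattr(self, name, value)

    def keep(self, patient):
        # Default: run on every patient.
        return True

    def arg_normal_input(self, patient):
        # Default: no custom normal input.
        return None


class OnlyPatient468(SketchConfig):
    """Subclass style: override the hooks as ordinary methods."""

    def keep(self, patient):
        return patient.id == "468"

    def arg_normal_input(self, patient):
        return patient.normal_bam


def selected_inputs(config, patients):
    # Collect the normal input for each patient the config keeps.
    return [config.arg_normal_input(p) for p in patients if config.keep(p)]


# Keyword style: pass the hooks as callables.
by_keyword = SketchConfig(
    keep=lambda patient: patient.id == "468",
    arg_normal_input=lambda patient: patient.normal_bam)
by_subclass = OnlyPatient468()

patients = [
    FakePatient("468", "/data/468_normal.bam"),
    FakePatient("469", "/data/469_normal.bam"),
]
print(selected_inputs(by_keyword, patients))
print(selected_inputs(by_subclass, patients))
```

In this sketch both styles select the same patient and input. The keyword form is convenient for one-off tweaks, while a subclass is easier to reuse across several pipelines.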

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.