Name: comet
Owner: raphael-group
Description: CoMEt: A Statistical Approach to Identify Combinations of Mutually Exclusive Alterations in Cancer
Created: 2015-02-20 19:43:48.0
Updated: 2017-07-31 05:56:00.0
Pushed: 2018-01-04 21:13:18.0
Homepage: http://compbio.cs.brown.edu/projects/comet/
Size: 454
Language: C
CoMEt is a stochastic algorithm for identifying collections of mutually exclusive alterations in cohorts of sequenced tumor samples. CoMEt is written in Python 2.7.x, with required extensions written in C and Fortran. It was developed by the Raphael research group in the Department of Computer Science and Center for Computational Molecular Biology at Brown University.
CoMEt identifies a collection M of t alteration sets, each of size k, from a binary alteration matrix. CoMEt uses a Markov chain Monte Carlo (MCMC) algorithm to sample collections in proportion to their weight φ(M). The output of CoMEt is a list of collections, each reported with its sampling frequency, its weight, and the weight of each individual alteration set in the collection.
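To make the sampling idea concrete, the following Python 3 sketch runs a Metropolis-Hastings chain over collections of t disjoint sets of size k. It is an illustration only, not CoMEt's implementation: `toy_weight` is a placeholder score (one plus the number of samples with exactly one altered gene in the set) and is not CoMEt's φ(M), and the names `sample_collections`, `collection_weight`, and the dictionary-based matrix are hypothetical. The proposal swaps one gene in the collection for a gene outside it, which is symmetric, so a move is accepted with probability min(1, new weight / current weight).

```python
import random
from collections import Counter


def toy_weight(gene_set, matrix):
    """Placeholder weight (NOT CoMEt's phi): 1 + number of samples
    in which exactly one gene of the set is altered."""
    exclusive = sum(1 for altered in matrix.values()
                    if len(gene_set & altered) == 1)
    return 1.0 + exclusive


def collection_weight(collection, matrix):
    """Weight of a collection, taken here as the product of its set weights."""
    weight = 1.0
    for gene_set in collection:
        weight *= toy_weight(gene_set, matrix)
    return weight


def sample_collections(matrix, t, k, iterations, seed=0):
    """Metropolis-Hastings over collections of t disjoint sets of size k.

    matrix maps each sample ID to the set of genes altered in that sample.
    Returns a Counter of how often each collection was visited.
    """
    rng = random.Random(seed)
    genes = sorted(set().union(*matrix.values()))
    if len(genes) <= t * k:
        raise ValueError("need more than t * k genes to propose swaps")

    # Start from a random collection of disjoint sets.
    chosen = rng.sample(genes, t * k)
    state = [set(chosen[i * k:(i + 1) * k]) for i in range(t)]
    weight = collection_weight(state, matrix)

    counts = Counter()
    for _ in range(iterations):
        used = set().union(*state)
        # Symmetric proposal: replace one gene inside the collection
        # with one gene currently outside it.
        i = rng.randrange(t)
        old = rng.choice(sorted(state[i]))
        new = rng.choice([g for g in genes if g not in used])
        proposal = [set(s) for s in state]
        proposal[i].remove(old)
        proposal[i].add(new)

        new_weight = collection_weight(proposal, matrix)
        if rng.random() < min(1.0, new_weight / weight):
            state, weight = proposal, new_weight

        # Record the current state whether or not the move was accepted.
        counts[frozenset(frozenset(s) for s in state)] += 1
    return counts
```

Dividing each count by the number of iterations gives a sampling frequency per collection, which is the kind of quantity the output described above reports.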
We also refer you to the cometExactTest R package hosted on CRAN.
CoMEt requires the following Python modules. For each module, the latest version tested with CoMEt is given in parentheses:
CoMEt requires Bower to create web output.
The C and Fortran extensions must be compiled before running CoMEt. To compile the extensions, run the following commands in your terminal:
cd comet/
python setup.py build
This will generate two compiled Python modules, comet/cComet.so and comet/permute_matrix.so, which can be imported directly into Python.
The input data for CoMEt consists of a:
In all files, lines starting with '#' are ignored.
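As an illustration of the comment rule, here is a minimal Python 3 sketch of a reader that skips lines starting with '#'. The file layout is an assumption, since the format description is not reproduced above: each remaining line is taken to be tab-separated, with a sample ID followed by the alterations observed in that sample. The function name `load_mutation_matrix` is hypothetical and is not part of CoMEt.

```python
def load_mutation_matrix(lines):
    """Parse an iterable of text lines into {sample ID: set of alterations}.

    Assumed layout (not confirmed by the README text above):
    sample ID, then tab-separated alteration names.
    """
    sample_to_alterations = {}
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines starting with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        sample_to_alterations[fields[0]] = set(f for f in fields[1:] if f)
    return sample_to_alterations
```

The function accepts any iterable of lines, so an open file handle can be passed directly, for example `load_mutation_matrix(open(path))` for a path of your choosing.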
We provide example data in example_datasets/.
We provide two pipelines for running CoMEt:
1. The run_comet_simple.py script runs the Markov chain Monte Carlo (MCMC) algorithm on the given mutation matrix. It outputs a JSON file that stores the parameters of the run, a tab-separated file that lists the collections identified by CoMEt (sorted descending by sampling frequency), and a website that can be used to visualize the results.
2. The run_comet_full.py script runs CoMEt with the same output as run_comet_simple.py, but adds a significance test. This pipeline computes collections together with their statistical significance and identifies the consensus modules. Its output consists of a JSON file that stores the parameters of the run, a tab-separated file that lists the collections identified by CoMEt (sorted descending by sampling frequency), and a website that can be used to visualize the results.
To view the results website, download the required JavaScript files (see Requirements above) and start a Python web server:
cd OUTPUT_DIRECTORY # the output directory you provided to run_comet_simple.py or run_comet_full.py
bower install
python -m SimpleHTTPServer 8000
Then direct your browser to http://localhost:8000.
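The SimpleHTTPServer module used above exists only in Python 2. In Python 3 its functionality lives in http.server, so `python3 -m http.server 8000` serves the same purpose. For cases where the server is started from a script, the following is a minimal Python 3.7+ sketch; the function name `make_results_server` is hypothetical and is not part of CoMEt.

```python
import functools
import http.server


def make_results_server(directory, port=8000):
    """Return an HTTP server that serves static files from `directory`.

    The caller is responsible for calling serve_forever() and,
    when done, server_close().
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.HTTPServer(("localhost", port), handler)


# Example use (blocks until interrupted with Ctrl-C):
#   server = make_results_server("OUTPUT_DIRECTORY")
#   server.serve_forever()
```

Here OUTPUT_DIRECTORY stands for the output directory passed to run_comet_simple.py or run_comet_full.py, as in the shell commands above.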
We also provide the script run_exhaustive.py as a simple way to compute the weight φ(M) for all gene sets M in a given dataset (using the same input format as above). The output of run_exhaustive.py is a tab-separated file that lists the weight φ(M) for all gene sets in the dataset (sorted ascending by φ(M)).
Please visit our Google Group to post questions and view discussions from other users, or contact us through our research group's website.
To test CoMEt, run the following commands:
cd test
python test.py
The tests are successful if the last line of the text printed to the terminal is "PASS".
Mark D.M. Leiserson*, Hsin-Ta Wu*, Fabio Vandin, Benjamin J. Raphael. CoMEt: A Statistical Approach to Identify Combinations of Mutually Exclusive Alterations in Cancer. In Proceedings of the 19th Annual Conference on Research in Computational Molecular Biology (RECOMB) 2015. Extended abstract and preprint.
* equal contribution