cBioPortal/CPTAC-proteomics-pipeline

Name: CPTAC-proteomics-pipeline

Owner: cBioPortal

Description: The research and processing repo for import of CPTAC data into cBioPortal

Created: 2016-06-07 13:21:29.0

Updated: 2017-12-10 21:23:08.0

Pushed: 2017-12-10 23:59:31.0

Homepage: null

Size: 45809

Language: Jupyter Notebook

GitHub Committers

User | Most Recent Commit | # Commits
Pamela Wu | 2018-01-10 22:58:34.0 | 18

Other Committers

User | Email | Most Recent Commit | # Commits

README

CPTAC-proteomics-pipeline

This repository holds the data processing scripts for transferring CPTAC data into cBioPortal, written as part of Google Summer of Code 2016. The goal is not only to produce flat text files for import into the cBioPortal database, but also to support data exploration and cross-dataset normalization. This work has been incorporated into the cBioPortal visualization interface.

Usage

This package serves a fairly specific purpose, so it is designed to be easy to use on the fly. First, clone the repo and cd into it:

git clone https://github.com/cBioPortal/CPTAC-proteomics-pipeline.git
cd CPTAC-proteomics-pipeline

If you would like to have all the CPTAC files we used, please run the wget script:

./wget.sh

Please visit the tutorial, which goes through all the elements of the API.

NOTE: As shown in the tutorial, to import the classes, add the directory that contains the ms2cbioportal.py script to Python's module search path (sys.path), giving its location relative to your current working directory. For example, since the tutorial is nested inside the repo:

import sys
# Make the parent directory (where ms2cbioportal.py lives) importable
sys.path.append('../')
import ms2cbioportal
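
A relative entry such as '../' is resolved against whatever the current working directory happens to be, so it can stop working if the notebook or script is launched from elsewhere. The following is a minimal sketch of one way to make the same step more robust. It is not part of the repository: the helper name add_to_module_path and its parameters are hypothetical and used only for illustration.

```python
import sys
from pathlib import Path


def add_to_module_path(relative_dir="..", base=None):
    """Resolve relative_dir against base (default: the current working
    directory) and append the absolute result to sys.path if it is not
    already there. Returns the absolute path as a string.

    Hypothetical helper for illustration; not provided by the repo.
    """
    base_dir = Path(base) if base is not None else Path.cwd()
    target = str((base_dir / relative_dir).resolve())
    if target not in sys.path:
        sys.path.append(target)
    return target


# Assumption: the current working directory is one level below the repo
# root, as in the tutorial, so ".." points at the folder that holds
# ms2cbioportal.py.
repo_root = add_to_module_path("..")
```

After this call, import ms2cbioportal should succeed as long as ms2cbioportal.py sits in repo_root. Storing an absolute path means the import keeps working even if the working directory changes later in the session.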

Acknowledgements

Thanks to my PI David Fenyo and the GSoC mentors at MSKCC, JJ Gao and Zack Heins, for guidance.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.