soedinglab/gxpred

Name: gxpred

Owner: Söding Lab

Description: Development branch of GxPRED

Created: 2017-10-11 11:42:32.0

Updated: 2017-11-22 11:53:25.0

Pushed: 2017-10-30 13:40:36.0

Homepage:

Size: 1720

Language: Python

README

GxPRED

This is the development branch of the Gene Expression Prediction Algorithm, under the production name GxPRED.

What is this repository for?
How to run?

GxPRED provides core utilities and an API for learning and predicting gene expression levels from genotype. The base directory contains two example scripts: learn_from_geuvadis.py, which learns from gEUVADIS data, and predict_on_gtex.py, which predicts on GTEx data. You can adapt these files for any dataset.

CODEBASE="/path/to/this/directory"
TRAINVCF="/path/to/gz/vcf/file/for/training"
TRAINRPKM="/path/to/normalized/gene/expression/file/for/training"
TRAINGTF="/path/to/gtf/file"
CHROM="21" # change this to whichever chromosome you are interested in
PREDVCF="/path/to/gz/vcf/for/prediction/samples"
MODELDIR="/path/to/directory/where/model/will/be/saved"

python ${CODEBASE}/learn_from_geuvadis.py --vcf ${TRAINVCF} --rpkm ${TRAINRPKM} --gtf ${TRAINGTF} --chrom ${CHROM} --params 0.01 0.0 0.01 0.001 0.001 --outdir ${MODELDIR}
python ${CODEBASE}/predict_on_gtex.py --vcf ${PREDVCF} --model ${MODELDIR} --chrom ${CHROM} --outprefix predicted_gene_expression
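
If you prefer to drive the two example scripts from Python instead of the shell, the commands above can be assembled as argument lists. The following is a minimal sketch, not part of GxPRED: the helper names build_learn_cmd, build_predict_cmd and run_pipeline are hypothetical, and only the script names, flags and default values shown above are used. It assumes the scripts should run under the same interpreter that runs the wrapper.

```python
import shlex
import subprocess
import sys
from pathlib import Path

# Default --params values copied verbatim from the README example.
README_PARAMS = ("0.01", "0.0", "0.01", "0.001", "0.001")


def build_learn_cmd(codebase, vcf, rpkm, gtf, chrom, outdir, params=README_PARAMS):
    """Hypothetical helper: argument list for learn_from_geuvadis.py."""
    return [
        sys.executable,  # assumption: reuse the current Python interpreter
        str(Path(codebase) / "learn_from_geuvadis.py"),
        "--vcf", str(vcf),
        "--rpkm", str(rpkm),
        "--gtf", str(gtf),
        "--chrom", str(chrom),
        "--params", *[str(p) for p in params],
        "--outdir", str(outdir),
    ]


def build_predict_cmd(codebase, vcf, model, chrom,
                      outprefix="predicted_gene_expression"):
    """Hypothetical helper: argument list for predict_on_gtex.py."""
    return [
        sys.executable,
        str(Path(codebase) / "predict_on_gtex.py"),
        "--vcf", str(vcf),
        "--model", str(model),
        "--chrom", str(chrom),
        "--outprefix", str(outprefix),
    ]


def run_pipeline(codebase, train_vcf, train_rpkm, train_gtf,
                 pred_vcf, model_dir, chrom, dry_run=True):
    """Build the learn and predict commands; execute them unless dry_run is set."""
    commands = [
        build_learn_cmd(codebase, train_vcf, train_rpkm, train_gtf, chrom, model_dir),
        build_predict_cmd(codebase, pred_vcf, model_dir, chrom),
    ]
    for cmd in commands:
        # Print a shell-quoted version of each command for inspection.
        print(" ".join(shlex.quote(arg) for arg in cmd))
        if not dry_run:
            # check=True stops the pipeline if learning fails before prediction.
            subprocess.run(cmd, check=True)
    return commands
```

Calling run_pipeline with dry_run=True only prints the two commands, which is a convenient way to check the paths before starting a long training run; pass dry_run=False to execute them in order.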

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.