MicrosoftGenomics/GWAS_benchmark

Name: GWAS_benchmark

Owner: Microsoft Genomics

Description: A set of tools for benchmarking or evaluating GWAS algorithms. A detailed description can be found in C. Widmer et al., Scientific Reports 2014.

Created: 2014-11-11 21:06:47.0

Updated: 2016-08-08 16:25:48.0

Pushed: 2015-09-04 18:25:07.0

Homepage: http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/Fastlmm/

Size: 32115

Language: Python

README

GWAS_benchmark

This python code can be used to benchmark or evaluate GWAS algorithms.

If you use this code, please cite C. Widmer et al., Scientific Reports 2014.

See this website for related software:
http://research.microsoft.com/en-us/um/redmond/projects/MSCompBio/

Our documentation (including live examples) is available as an IPython notebook: https://github.com/MicrosoftGenomics/GWAS_benchmark/blob/master/GWAS_benchmark/simulation.ipynb

(To start ipython notebook locally, type ipython notebook at the command line.)

This code contains the following modules:

For testing purposes a small data set is provided at data/mouse (see the README file within that directory for the data license).

An example run to compute type I error rate on the mouse data using 10 causal SNPs can be executed by running python run_simulation.py.

We recommend running this example on a cluster computer as this simulation is computationally demanding. An example result plot (of type I error) is provided in the results directory.
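For readers unfamiliar with the metric: the empirical type I error rate at a significance level alpha is the fraction of tests on non-causal (null) SNPs whose p-value falls below alpha. The snippet below is a minimal sketch of that calculation only. The function name `type_one_error_rate` and the toy p-values are illustrative assumptions and are not taken from GWAS_benchmark; run_simulation.py performs the full simulation.

```python
def type_one_error_rate(null_p_values, alpha=0.05):
    """Fraction of null-hypothesis tests rejected at level alpha.

    Hypothetical helper for illustration; not part of GWAS_benchmark.
    """
    if not null_p_values:
        raise ValueError("need at least one p-value")
    # A test counts as a false positive when its p-value is below alpha.
    rejected = sum(1 for p in null_p_values if p < alpha)
    # float() keeps the division exact under Python 2.7 as well as Python 3.
    return float(rejected) / len(null_p_values)


# Made-up p-values for SNPs known to be non-causal.
toy_null_p_values = [0.01, 0.20, 0.04, 0.50]
rate = type_one_error_rate(toy_null_p_values, alpha=0.05)  # 2 of 4 below 0.05
```

For a well-calibrated method, this rate should be close to alpha when computed over many null SNPs.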

Further, we use the ipython-notebook to demonstrate some of the functionality of the hierarchical clustering module: http://nbviewer.ipython.org/github/MicrosoftGenomics/GWAS_benchmark/blob/master/GWAS_benchmark/simulation.ipynb

Quick install:

If you have pip installed, installation is as easy as:

pip install GWAS_benchmark

Detailed Package Install Instructions:

fastlmm has the following dependencies:

python 2.7

Packages:

(1) Installation of dependent packages

We highly recommend using a Python distribution such as Anaconda (https://store.continuum.io/cshop/anaconda/) or Enthought (https://www.enthought.com/products/epd/free/). Both distributions run on Linux and Windows, are free for non-commercial use, and optionally include an MKL-compiled distribution for optimal speed. This is the easiest way to get all the required package dependencies.

(2) Installing from source

Go to the directory where you copied the source code for GWAS_benchmark.

On linux:

At the shell, type:

 python setup.py install

On Windows:

At the OS command prompt, type

 python setup.py install

For developers (and also to run regression tests)

When working on the developer version, just set your PYTHONPATH to point to the directory above the one named GWAS_benchmark in the source code. For example, if GWAS_benchmark is in the [somedir] directory, then in the unix shell use:

export PYTHONPATH=$PYTHONPATH:[somedir]

Or in the Windows DOS terminal, one can use:

set PYTHONPATH=%PYTHONPATH%;[somedir]

(or use the Windows GUI for env variables).
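To confirm that the PYTHONPATH setting took effect, one option is to check from Python whether the package can be imported. This is a minimal sketch; the helper name `is_importable` is a hypothetical name chosen for illustration and is not part of GWAS_benchmark.

```python
import importlib


def is_importable(module_name):
    """Return True if module_name can be imported in this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Raised when the module (or one of its own imports) cannot be found.
        return False
    return True


# Should be True once PYTHONPATH points at the directory above GWAS_benchmark
# and the package's dependencies are installed.
package_found = is_importable("GWAS_benchmark")
```

If `package_found` is False, start a new shell, set PYTHONPATH again, and check that [somedir] is the directory that contains the GWAS_benchmark folder.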

Running regression tests

From the directory tests at the top level, run:

python test.py

This will run a series of regression tests, reporting “.” for each one that passes, “F” for each one whose output does not match the expected result, and “E” for any that produce a run-time error. After they have all run, you should see the string “…………” indicating that they all passed, or if they did not, something such as “….F…E……”, after which you can see the specific errors.
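The “.”, “F” and “E” notation matches the progress line printed by Python's standard unittest runner. Assuming test.py is built on unittest (the README does not say so explicitly), the following self-contained sketch shows how each symbol arises. The toy test case and the helper names are invented for illustration, and the snippet is written for Python 3 rather than the Python 2.7 this package targets.

```python
import io
import unittest


def make_toy_test_case():
    """Build a toy TestCase with one pass, one failure and one error."""

    class ToyRegressionTests(unittest.TestCase):
        def test_a_matches_reference(self):
            self.assertEqual(1 + 1, 2)  # passes -> "."

        def test_b_differs_from_reference(self):
            self.assertEqual(1 + 1, 3)  # assertion fails -> "F"

        def test_c_raises_at_runtime(self):
            raise RuntimeError("simulated crash")  # exception -> "E"

    return ToyRegressionTests


def run_and_capture_progress(test_case_class):
    """Run the tests and return the first output line (the progress marks)."""
    stream = io.StringIO()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case_class)
    unittest.TextTestRunner(stream=stream, verbosity=1).run(suite)
    return stream.getvalue().splitlines()[0]


# Test methods run in alphabetical order, so the progress line is ".FE".
progress = run_and_capture_progress(make_toy_test_case())
```

The remaining lines of the captured output contain the tracebacks for the failing and erroring tests, which correspond to the "specific errors" mentioned above.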

Note that to run the regression tests you must set your PYTHONPATH as described above, rather than relying on “python setup.py install”.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.