biocore/American-Gut

Name: American-Gut

Owner: biocore

Description: American Gut open-access data and IPython notebooks

Created: 2013-10-01 23:39:04.0

Updated: 2017-12-27 17:59:14.0

Pushed: 2017-03-17 17:29:28.0

Homepage: null

Size: 426646

Language: Jupyter Notebook

README

American-Gut

American Gut open-access code and IPython notebooks

A note about data

American Gut sequences and metadata are deposited in The European Bioinformatics Institute under the accession ERP012803.

Bloom sequences found in the data repository are correct and up to date.

OTU tables and mapping files hosted in this repository reflect the state of the project in May 2015 and before. This includes an earlier version of the American Gut survey and dietary questionnaire. Data on GitHub has been scrubbed for PHI. A listing of processed data with the new survey can be found at ftp://ftp.microbio.me/AmericanGut.

The latest OTU tables and precalculated diversity comparisons generated by the primary processing notebook set can be found at ftp://ftp.microbio.me/AmericanGut/latest.
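As an illustration, the directory above could be listed from Python with the standard library's ftplib. This is a minimal sketch written for Python 3, not part of the repository: the helper names split_ftp_url and list_remote are hypothetical, and it assumes the server accepts anonymous logins and is reachable.

```python
from ftplib import FTP
from urllib.parse import urlparse


def split_ftp_url(url):
    """Split an ftp:// URL into (host, path)."""
    parts = urlparse(url)
    if parts.scheme != "ftp":
        raise ValueError("expected an ftp:// URL, got %r" % url)
    return parts.hostname, parts.path or "/"


def list_remote(url, timeout=30):
    """Return the names the server reports for the directory at `url`.

    Assumption: the server allows an anonymous login.
    """
    host, path = split_ftp_url(url)
    with FTP(host, timeout=timeout) as ftp:
        ftp.login()  # anonymous login
        return ftp.nlst(path)
```

Calling list_remote("ftp://ftp.microbio.me/AmericanGut/latest") would then return whatever file names the server currently exposes there.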

INSTALL

Basics

The American-Gut repository is intended to be used directly as a project/repo, so there is no need to install it (ignore setup.py for the moment).

After cloning the repository and before using the scripts, install the necessary dependencies. Two approaches are currently supported.

Conda based

If your package manager of choice is conda, dependencies can be installed with

conda install --file ./conda_requirements.txt
pip install -r ./pip_requirements.txt

If you would like to install dependencies within a conda environment, be sure to switch to the appropriate environment before installing them.

Note: With pip, some libraries have to be compiled from source, so the appropriate system libraries should be installed before running the pip command. For more details, see the Supported Operating Systems / Distributions section.

PIP based

pip install numpy==1.9.2
pip install -r ./pip_requirements.txt

If you would like to install dependencies within a virtualenv environment, be sure to switch to the appropriate environment before installing them.

Note: With pip, some libraries have to be compiled from source, so the appropriate system libraries should be installed before running the pip command. For more details, see the Supported Operating Systems / Distributions section.

Supported Operating Systems / Distributions

Debian 8

Tested with Debian 8.3.0 (amd64).

To compile dependencies from source, the appropriate libraries can be installed (as root/sudo) with

aptitude install pkg-config libxslt1-dev libxml2 libfreetype6 \
build-essential python-pip python-dev liblapack-dev liblapack3 \
libfreetype6-dev libblas-dev libblas3 gfortran libhdf5-serial-dev libsm6

RUN

Basics

Although the American-Gut repo provides separate scripts (scripts folder) and a package (americangut folder), it is primarily intended to be used through notebooks (ipynb folder).

A few AG_* environment variables can be used to customize the run.

To generate reports (PDFs), a TeX distribution should be installed on the system.
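A quick way to check for a TeX distribution before starting a run is to look for a TeX engine on PATH. The sketch below is only an illustration: the helper find_tex is hypothetical, and the candidate engine names are an assumption, since this README does not say which TeX command the report step invokes.

```python
import shutil

# Assumption: common TeX engine names; the report step may call a different one.
TEX_CANDIDATES = ("pdflatex", "xelatex", "lualatex")


def find_tex(candidates=TEX_CANDIDATES):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None
```

If find_tex() returns None, none of the listed engines is on PATH and report generation is likely to fail.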

Adjusting environment on POSIX systems

Since the American-Gut repo contains scripts and packages, PYTHONPATH and PATH need to be adjusted to include them. Therefore, before working with notebooks, execute the following from within the American-Gut repo:

REPO=`pwd`
export PYTHONPATH=$REPO/:$PYTHONPATH
export PATH=$REPO/scripts:$PATH

If needed, adjust the AG_* environment variables from the Basics section.
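The same adjustment can be expressed in Python when a notebook server or script is launched from another program. This is a sketch under stated assumptions, not part of the repository: notebook_env is a hypothetical helper, and unlike the shell commands it skips empty entries instead of leaving a trailing separator.

```python
import os


def notebook_env(repo, base_env=None):
    """Return a copy of the environment with the repo on PYTHONPATH and PATH."""
    env = dict(os.environ if base_env is None else base_env)
    repo = os.path.abspath(repo)
    scripts = os.path.join(repo, "scripts")
    # Prepend the repo and its scripts folder, dropping empty existing values.
    env["PYTHONPATH"] = os.pathsep.join(
        p for p in (repo, env.get("PYTHONPATH", "")) if p
    )
    env["PATH"] = os.pathsep.join(p for p in (scripts, env.get("PATH", "")) if p)
    return env
```

The returned dictionary can be passed as the env argument of subprocess.call or subprocess.Popen, leaving the calling process's own environment untouched.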

Run notebooks

Notebooks are written in two formats and therefore require different profiles.

Markdown based notebooks

Markdown based notebooks can be found in the ./ipynb/primary-processing/ folder and have the extension md. To use these notebooks, first create a profile named ag_ipymd with

ipython profile create ag_ipymd

and adjust the newly created /path/to/.ipython/profile_ag_ipymd/ipython_notebook_config.py by adding

#---------------------
# ipymd
#---------------------
c.NotebookApp.contents_manager_class = 'ipymd.IPymdContentsManager'

to the end of the file.
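Editing the profile by hand is easy to repeat by accident, so the append step can be scripted to run safely more than once. The following is a minimal sketch, assuming the config file path is passed in by the caller; ensure_ipymd_config is a hypothetical helper and not something the repository ships.

```python
import os

# The setting quoted above, exactly as it should appear in the config file.
IPYMD_LINE = "c.NotebookApp.contents_manager_class = 'ipymd.IPymdContentsManager'"


def ensure_ipymd_config(config_path):
    """Append the ipymd setting to config_path unless it is already present.

    Returns True if the file was changed, False otherwise.
    """
    existing = ""
    if os.path.exists(config_path):
        with open(config_path) as handle:
            existing = handle.read()
    if IPYMD_LINE in existing:
        return False
    with open(config_path, "a") as handle:
        # Make sure the new block starts on its own line.
        if existing and not existing.endswith("\n"):
            handle.write("\n")
        handle.write("# ipymd\n" + IPYMD_LINE + "\n")
    return True
```

Running it a second time on the same file leaves the file unchanged and returns False.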

Now, we can start ipython with

ipython notebook --profile=ag_ipymd

and visit the newly started notebook server by going to http://localhost:8888

Jupyter/IPython based notebooks

Notebooks in the native notebook format can be found in the ./ipynb/ folder and have the extension ipynb. To use these notebooks, first create a profile named ag_default with

ipython profile create ag_default

Now, we can start ipython with

ipython --profile=ag_default notebook

and visit the newly started notebook server by going to http://localhost:8888


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.