BD2KGenomics/bioapi-examples

Name: bioapi-examples

Owner: BD2K Center for Translational Genomics

Description: Code examples for working with web based bioinformatics APIs

Created: 2016-10-10 23:43:58.0

Updated: 2017-08-08 06:50:06.0

Pushed: 2017-02-17 18:51:24.0

Homepage:

Size: 2218

Language: Jupyter Notebook

GitHub Committers

User                        Most Recent Commit       # Commits
Brian                       2016-10-12 18:37:47.0    4
Check your git settings!    2016-07-04 19:39:03.0    2
Sean Upchurch               2017-01-31 15:53:16.0    5
David Steinberg             2017-02-01 17:28:24.0    68
Maciek Smuga-Otto           2016-02-26 19:29:26.0    2
Kunal Dhillon               2016-08-05 23:53:56.0    15
Alex Doria                  2016-07-05 17:53:32.0    1
Nicholas Hill               2016-08-18 23:57:59.0    5
Kevin Osborn                2016-10-12 21:56:26.0    4
achave11                    2016-12-06 05:53:03.0    31

Other Committers

User              Email                                         Most Recent Commit       # Commits
Abraham C         abrahamc@abrahams-mbp.attlocal.net            2016-07-04 19:21:31.0    4
Abraham C         abrahamc@eduroam-169-233-195-59.ucsc.edu      2016-05-21 21:27:26.0    1
Abraham C         abrahamc@eduroam-169-233-244-242.ucsc.edu     2016-07-01 16:55:09.0    1
Abraham Chavez    achave11@eduroam-169-233-218-31.ucsc.edu      2016-05-09 21:36:14.0    1

README

Biomedicine API Examples

Introduction

Data sharing efforts and readily available computing resources are making bioinformatics over the Web possible. In the past, siloed data stores and obscure file formats made it difficult to synthesize and reproduce results between institutions. The examples in this repository assume some familiarity with Python.

There are three sections of example code in this repository.

  1. Python Notebooks - A collection of Python notebooks that demonstrate GA4GH APIs
  2. Python Scripts - A set of scripts that demonstrate biomedicine APIs that use GA4GH APIs
  3. Variant Browser - An example visualization of variants that uses GA4GH APIs

Python Notebooks

Get Started!

First install the ga4gh client module. It is best to install it inside a virtual environment so that its dependencies stay separate from the system Python.

virtualenv ga4gh-client
source ga4gh-client/bin/activate
pip install --no-cache-dir --pre ga4gh_client

This environment is suitable for running any of the Python notebooks in the python_notebook directory. A directory of which GA4GH APIs each example demonstrates can be found on the wiki: https://github.com/BD2KGenomics/bioapi-examples/wiki
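Before launching a notebook, it can help to confirm that the interpreter really is the one from the virtual environment. The following is a minimal sketch, not part of the repository: the helper name in_virtual_environment is hypothetical, and the check relies only on attributes of the standard sys module.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Legacy virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.prefix differ from sys.base_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if in_virtual_environment():
    print("Virtual environment active:", sys.prefix)
else:
    print("No virtual environment detected; pip would install into the base Python.")
```

If the second message appears, activating the environment again and rerunning the check before installing avoids putting the client into the base Python.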

Python Scripts

Examples of scripts that work against existing services.

Get started!

pip install -r requirements.txt
python hello_ga4gh.py

GA4GH

GA4GH aims to standardize how bioinformatics data are shared over the web. A reference server with a subset of publicly available test data from 1000 Genomes has been made available for these examples.

The GA4GH reference server hosts bioinformatics data using an HTTP API. These data are backed by BAM and VCF files. These examples only read from an existing GA4GH server, but the server is open source, and eager individuals can create their own server instance using these instructions.

What is an HTTP API

HTTP APIs allow web browsers and command line clients to use the same communication layer to transmit data to a server. A client can GET a resource from a server, POST a resource to a server, or DELETE a resource, among other things.
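To make the verbs concrete, here is a minimal sketch that builds GET, POST, and DELETE requests with Python's standard urllib.request module and prints them without sending anything. The base URL, the paths, and the helper name build_request are assumptions chosen for illustration; they are not taken from this repository or from any particular server.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the address of a server you can reach.
BASE_URL = "http://localhost:8000"


def build_request(method, path, payload=None):
    """Build (but do not send) an HTTP request, encoding any payload as JSON."""
    body = None
    headers = {}
    if payload is not None:
        body = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=body, headers=headers, method=method
    )


# Illustrative paths and payload only; a real server defines its own.
get_request = build_request("GET", "/variants/example-id")
post_request = build_request(
    "POST",
    "/variants/search",
    {"referenceName": "ref_brca1", "start": 4529, "end": 4530},
)
delete_request = build_request("DELETE", "/variants/example-id")

for request in (get_request, post_request, delete_request):
    print(request.get_method(), request.full_url)

# Sending is deliberately left out: urllib.request.urlopen(request) would do it.
```

The same request objects could come from a browser or from a command line tool such as curl; the server sees only the method, the path, and the body.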

The documents that servers and clients pass back and forth are often in JavaScript Object Notation (JSON), which can flexibly describe complex data structures. For example, a variant in GA4GH is returned as a document with the form:

{
    "alternateBases": ["T"],
    "calls": [],
    "created": 1455236057000,
    "end": 4530,
    "id": "YnJjYTE6MWtnUGhhc2UzOnJlZl9icmNhMTo0NTI5OjllNjRkMDIzOTc5NzQ3M2MyNjk2NzFiNzczMjg1MWNj",
    "info": {},
    "referenceBases": "C",
    "referenceName": "ref_brca1",
    "start": 4529,
    "updated": 1455236057000,
    "variantSetId": "YnJjYTE6MWtnUGhhc2Uz"
}

JSON uses strings as keys. The values can be strings, numbers, arrays, or maps, and arrays and maps can nest more complex objects.
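As a rough illustration of how those JSON types arrive in Python, the sketch below parses a shortened copy of the variant document above with the standard json module. Only fields shown in the example are used; the variable names are illustrative.

```python
import json

# A shortened copy of the variant document shown above.
variant_json = """
{
    "alternateBases": ["T"],
    "calls": [],
    "end": 4530,
    "info": {},
    "referenceBases": "C",
    "referenceName": "ref_brca1",
    "start": 4529
}
"""

variant = json.loads(variant_json)

# JSON strings become str, numbers become int or float,
# arrays become list, and maps become dict.
for key, value in sorted(variant.items()):
    print(key, type(value).__name__, value)

# Numeric fields can be used directly in arithmetic.
span = variant["end"] - variant["start"]
print("bases covered:", span)

# Arrays are indexed like any Python list.
first_alternate = variant["alternateBases"][0]
print(variant["referenceBases"], "->", first_alternate)
```

In this document the span between start and end matches the length of referenceBases, which is a useful sanity check when reading variants from a server.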


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.