Name: bioapi-examples
Owner: BD2K Center for Translational Genomics
Description: Code examples for working with web based bioinformatics APIs
Created: 2016-10-10 23:43:58.0
Updated: 2017-08-08 06:50:06.0
Pushed: 2017-02-17 18:51:24.0
Size: 2218
Language: Jupyter Notebook
GitHub Committers
User | Most Recent Commit | # Commits |
---|---|---|
Brian | 2016-10-12 18:37:47.0 | 4 |
Check your git settings! | 2016-07-04 19:39:03.0 | 2 |
Sean Upchurch | 2017-01-31 15:53:16.0 | 5 |
David Steinberg | 2017-02-01 17:28:24.0 | 68 |
Maciek Smuga-Otto | 2016-02-26 19:29:26.0 | 2 |
Kunal Dhillon | 2016-08-05 23:53:56.0 | 15 |
Alex Doria | 2016-07-05 17:53:32.0 | 1 |
Nicholas Hill | 2016-08-18 23:57:59.0 | 5 |
Kevin Osborn | 2016-10-12 21:56:26.0 | 4 |
achave11 | 2016-12-06 05:53:03.0 | 31 |
Other Committers
User | Email | Most Recent Commit | # Commits |
---|---|---|---|
Abraham C | abrahamc@abrahams-mbp.attlocal.net | 2016-07-04 19:21:31.0 | 4 |
Abraham C | abrahamc@eduroam-169-233-195-59.ucsc.edu | 2016-05-21 21:27:26.0 | 1 |
Abraham C | abrahamc@eduroam-169-233-244-242.ucsc.edu | 2016-07-01 16:55:09.0 | 1 |
Abraham Chavez | achave11@eduroam-169-233-218-31.ucsc.edu | 2016-05-09 21:36:14.0 | 1 |
Data sharing efforts and readily available computing resources are making bioinformatics over the Web possible. In the past, siloed data stores and obscure file formats made it difficult to synthesize and reproduce results between institutions. These examples assume some familiarity with Python.
There are 3 sections of example code in this repository.
Get Started!
First install the ga4gh client module. It is best to do the install inside of a virtual environment.
virtualenv ga4gh-client
source ga4gh-client/bin/activate
pip install --no-cache-dir --pre ga4gh_client
This environment is suitable for running any of the Python notebooks in the python_notebook directory. The wiki lists which GA4GH APIs each example demonstrates: https://github.com/BD2KGenomics/bioapi-examples/wiki
Examples of scripts that work against existing services.
Get started!
pip install -r requirements.txt
python hello_ga4gh.py
GA4GH aims to standardize how bioinformatics data are shared over the web. A reference server with a subset of publicly available test data from the 1000 Genomes Project has been made available for these examples.
The GA4GH reference server hosts bioinformatics data using an HTTP API. These data are backed by BAM and VCF files. For these examples we will only be accessing a GA4GH server, but the server is open source, and anyone interested can run their own instance using these instructions.
HTTP APIs allow web browsers and command line clients to use the same communication layer to transmit data to a server. A client can GET a resource from a server, POST a resource to a server, or DELETE a resource, among other things.
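As a rough illustration of these verbs, the sketch below builds (but never sends) three requests using only the Python standard library. The base URL, the paths, and the `build_request` helper are hypothetical names chosen for this example; they are not taken from this repository or from the ga4gh client.

```python
import json
from urllib.request import Request

# Hypothetical base URL, used only for illustration; nothing is sent over the network.
BASE_URL = "http://example.org/ga4gh"


def build_request(method, path, payload=None):
    """Build (but do not send) an HTTP request with an optional JSON body."""
    body = None
    headers = {}
    if payload is not None:
        # JSON bodies are sent as UTF-8 encoded bytes.
        body = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(BASE_URL + path, data=body, headers=headers, method=method)


# The paths and the search payload below are assumptions made for this sketch.
get_req = build_request("GET", "/variants/some-id")
post_req = build_request(
    "POST",
    "/variants/search",
    {"referenceName": "ref_brca1", "start": 4529, "end": 4530},
)
delete_req = build_request("DELETE", "/variants/some-id")

for req in (get_req, post_req, delete_req):
    print(req.get_method(), req.full_url)
```

Passing one of these objects to `urllib.request.urlopen` would actually send it. The example stops short of that so it can run without a live server.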
The documents that servers and clients pass back and forth are often in JavaScript Object Notation (JSON), which can flexibly describe complex data structures. For example, a variant in GA4GH is returned as a document with the form:
{
"alternateBases": ["T"],
"calls": [],
"created": 1455236057000,
"end": 4530,
"id": "YnJjYTE6MWtnUGhhc2UzOnJlZl9icmNhMTo0NTI5OjllNjRkMDIzOTc5NzQ3M2MyNjk2NzFiNzczMjg1MWNj",
"info": {},
"referenceBases": "C",
"referenceName": "ref_brca1",
"start": 4529,
"updated": 1455236057000,
"variantSetId": "YnJjYTE6MWtnUGhhc2Uz"
}
JSON uses strings as keys. The values can be strings, numbers, arrays, or maps of more complex objects.
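To make the mapping from JSON to Python concrete, here is a minimal sketch that parses the variant document above with the standard `json` module. The `summary` string format and the variable names are choices made for this example only.

```python
import json

# The variant document shown above, copied verbatim as a JSON string.
variant_json = """
{
  "alternateBases": ["T"],
  "calls": [],
  "created": 1455236057000,
  "end": 4530,
  "id": "YnJjYTE6MWtnUGhhc2UzOnJlZl9icmNhMTo0NTI5OjllNjRkMDIzOTc5NzQ3M2MyNjk2NzFiNzczMjg1MWNj",
  "info": {},
  "referenceBases": "C",
  "referenceName": "ref_brca1",
  "start": 4529,
  "updated": 1455236057000,
  "variantSetId": "YnJjYTE6MWtnUGhhc2Uz"
}
"""

# json.loads turns JSON maps into dicts, arrays into lists,
# and numbers into ints or floats.
variant = json.loads(variant_json)

# Number of reference positions covered by this variant.
span = variant["end"] - variant["start"]

# A compact, human-readable summary (format chosen for this example).
summary = "{name}:{start}-{end} {ref}>{alt}".format(
    name=variant["referenceName"],
    start=variant["start"],
    end=variant["end"],
    ref=variant["referenceBases"],
    alt=",".join(variant["alternateBases"]),
)

print(summary)
print("span:", span)
```

Once parsed, the document behaves like any other Python dictionary, so fields such as `variant["calls"]` (a list) and `variant["info"]` (a dict) can be inspected or iterated over directly.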