biostream/bioschemas

Name: bioschemas

Owner: biostream

Description: ga4gh, gdc and bmeg in one place

Created: 2016-11-11 21:37:22.0

Updated: 2017-09-08 20:58:29.0

Pushed: 2017-07-14 22:03:15.0

Homepage:

Size: 129

Language: Protocol Buffer

GitHub Committers

User          Most Recent Commit     # Commits
Brian         2017-04-19 16:58:44.0  12
Kyle Ellrott  2017-03-28 19:02:07.0  2
Brian King    2017-01-06 21:53:31.0  12

Other Committers

User  Email  Most Recent Commit  # Commits

README

bioschemas

Common data structures and APIs.

This repo contains

packaging

The schemas are packaged into a Python module, bioschemas. The justification for the packaging is threefold:

pip install git+https://github.com/ohsu-computational-biology/bioschemas

package release
cd bin
./package-all.sh
 generates schema snapshot ...
 runs setup tests ...
----------------------------------------------------------------------
Ran 4 tests in 0.100s


usage

$ bioschemas-snapshot --help
usage: bioschemas-snapshot [-h] [-o OUTPUT] [-v]

Extract bioschemas schema directory [ga4gh,bmeg,gdc]

optional arguments:
  -h, --help            show this help message and exit
  -o OUTPUT, --output OUTPUT
                        Extract to this directory name. Must not already
                        exist; it will be created as well as missing parent
                        directories.
  -v, --version         Print git hashes
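
The help text above can be reproduced with a small argparse parser. This is only a sketch of the command-line interface as documented; the README does not show how the real bioschemas-snapshot script is implemented, and the function name build_parser is hypothetical.

```python
import argparse


def build_parser():
    """Build a parser whose options mirror the documented help text (a sketch)."""
    parser = argparse.ArgumentParser(
        prog="bioschemas-snapshot",
        description="Extract bioschemas schema directory [ga4gh,bmeg,gdc]",
    )
    parser.add_argument(
        "-o", "--output",
        help="Extract to this directory name. Must not already exist; "
             "it will be created as well as missing parent directories.",
    )
    # Assumption: --version is a simple on/off flag that prints git hashes.
    parser.add_argument(
        "-v", "--version", action="store_true", help="Print git hashes"
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the example is repeatable.
args = build_parser().parse_args(["--output", "/tmp/schemas"])
print(args.output, args.version)
```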

The snapshot can be used from any language and has the following structure:


cerberus
├── bmeg
├── ga4gh
│   ├── ga4gh
│   └── google
│       ├── api
│       └── protobuf
└── gdc
jsonschema
├── bmeg
├── ga4gh
│   ├── ga4gh
│   └── google
│       ├── api
│       └── protobuf
└── gdc
proto
├── bmeg
└── ga4gh
    ├── ga4gh
    └── google
        └── api
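
Because the snapshot is an ordinary directory tree, its layout can be inspected with the standard library alone. The sketch below lists each (format, source) pair, such as ("proto", "bmeg"); the helper name list_schema_dirs is hypothetical and not part of the bioschemas module.

```python
import os


def list_schema_dirs(snapshot_root):
    """Return sorted (format, source) pairs found under snapshot_root.

    format is a top-level directory such as cerberus, jsonschema, or proto;
    source is a directory directly beneath it such as bmeg, ga4gh, or gdc.
    Plain files at either level are ignored.
    """
    pairs = []
    for fmt in sorted(os.listdir(snapshot_root)):
        fmt_path = os.path.join(snapshot_root, fmt)
        if not os.path.isdir(fmt_path):
            continue
        for source in sorted(os.listdir(fmt_path)):
            if os.path.isdir(os.path.join(fmt_path, source)):
                pairs.append((fmt, source))
    return pairs
```

With the package installed, the root would presumably be the value returned by bioschemas.schema_path(), or the directory passed to bioschemas-snapshot --output.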

python usage

import bioschemas

bioschemas.schema_path()
'/home/someuser/bioschemas/bioschemas/snapshot'

bioschemas.json_schema('Resource')
{u'properties': {u'checksum': {u'type': u'string'}, u'class': {u'type': u'string'}, u'created': {u'type': u'string'}, u'datasetID': {u'type': u'string'}, u'description': {u'type': u'string'}, u'format': {u'type': u'string'}, u'gid': {u'type': u'string'}, u'id': {u'type': u'string'}, u'info': {u'type': u'object'}, u'location': {u'type': u'string'}, u'mimeType': {u'type': u'string'}, u'name': {u'type': u'string'}, u'size': {u'type': u'integer'}, u'type': {u'type': u'string'}}, u'type': u'object'}

bioschemas.cerberus_schema('Resource')
{u'checksum': {u'type': u'string'}, u'class': {u'type': u'string'}, u'created': {u'type': u'string'}, u'datasetID': {u'type': u'string'}, u'description': {u'type': u'string'}, u'format': {u'type': u'string'}, u'gid': {u'type': u'string'}, u'id': {u'type': u'string'}, u'info': {u'type': {u'type': u'dict'}}, u'location': {u'type': u'string'}, u'mimeType': {u'type': u'string'}, u'name': {u'type': u'string'}, u'size': {u'type': u'integer'}, u'type': {u'type': u'string'}}

bioschemas.git_hashes()
{u'bioschemas': u'f40f653', u'bmeg': u'537f94a', u'created_at': u'2016-11-18T17:47:56.858397Z', u'gdc': u'288f042'}

bioschemas.gdc_submission_template('file')
{u'aliquots': {u'submitter_id': None}, u'analytes': {u'submitter_id': None}, u'archives': {u'submitter_id': None}, u'cases': {u'submitter_id': None}, u'centers': {u'code': None}, u'data_formats': {u'name': None}, u'data_subtypes': {u'name': None}, u'derived_files': {u'submitter_id': None}, u'described_cases': {u'submitter_id': None}, u'experimental_strategies': {u'name': None}, u'file_name': None, u'file_size': None, u'md5sum': None, u'platforms': {u'name': None}, u'portions': {u'submitter_id': None}, u'project_id': None, u'related_files': {u'submitter_id': None}, u'samples': {u'submitter_id': None}, u'slides': {u'submitter_id': None}, u'state_comment': None, u'submitter_id': None, u'tags': {u'name': None}, u'type': u'file'}
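
A dictionary shaped like the json_schema('Resource') result can drive a simple type check on a record. The sketch below is a minimal, standard-library-only check of the flat properties/type subset seen in that output. It is not a full JSON Schema validator, and RESOURCE_SCHEMA is a hand-copied excerpt rather than a call into bioschemas. The names JSON_TYPES and type_errors are hypothetical.

```python
# Excerpt of the Resource schema shown in the README output (hand-copied subset).
RESOURCE_SCHEMA = {
    "properties": {
        "id": {"type": "string"},
        "name": {"type": "string"},
        "size": {"type": "integer"},
        "info": {"type": "object"},
    },
    "type": "object",
}

# JSON Schema type name -> Python type; covers only the types in the excerpt.
JSON_TYPES = {"string": str, "integer": int, "object": dict}


def type_errors(record, schema):
    """Return messages for fields whose value does not match the declared type.

    Fields absent from the schema are ignored, and missing fields are allowed.
    """
    errors = []
    properties = schema.get("properties", {})
    for field, value in record.items():
        if field not in properties:
            continue
        type_name = properties[field]["type"]
        expected = JSON_TYPES[type_name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append("%s: expected %s" % (field, type_name))
    return errors


print(type_errors({"id": "r1", "size": "129"}, RESOURCE_SCHEMA))
# ['size: expected integer']
```

For real validation, a dedicated library such as jsonschema or Cerberus would be the more robust choice.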

utility

The canonical ga4gh and bmeg schemas are maintained in protobuf. bin/custom-plugin.py processes the schemas for alternate uses (jsonschema, cerberus). The bioschemas/snapshot directory contains output from protoc. Please do not edit it by hand; change custom-plugin.py or json-to-cerberus.py instead.
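
To illustrate the kind of translation a JSON Schema to Cerberus step involves, the sketch below maps JSON Schema type names onto Cerberus type names for a flat properties map. It is an assumption-laden illustration, not the contents of json-to-cerberus.py: TYPE_MAP and to_cerberus are hypothetical names, and nested or repeated fields are not handled.

```python
# JSON Schema type name -> Cerberus type name.
TYPE_MAP = {
    "string": "string",
    "integer": "integer",
    "number": "number",
    "boolean": "boolean",
    "array": "list",
    "object": "dict",
}


def to_cerberus(json_schema):
    """Convert the flat properties of a JSON Schema object to a Cerberus-style schema."""
    return {
        field: {"type": TYPE_MAP[spec["type"]]}
        for field, spec in json_schema.get("properties", {}).items()
    }


example = {
    "properties": {"size": {"type": "integer"}, "info": {"type": "object"}},
    "type": "object",
}
print(to_cerberus(example))
# {'size': {'type': 'integer'}, 'info': {'type': 'dict'}}
```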


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.