Sage-Bionetworks/metadata-schema

Name: metadata-schema

Owner: Sage Bionetworks

Description: This repo is for the metadata schemas associated with the HCA

Forked from: HumanCellAtlas/metadata-schema

Created: 2018-03-20 21:32:23.0

Updated: 2018-03-20 21:32:25.0

Pushed: 2018-03-21 03:07:04.0

Homepage: null

Size: 19800

Language: Python

README

The Human Cell Atlas Metadata Schema

This repo contains the HCA metadata JSON schemas, example JSON files, and template metadata spreadsheets.

The metadata design principles can be read in the Metadata schema lifecycle doc.

Details on how to contribute to the metadata schema are described in the contributing.md doc.

HCA v4.6.1-to-v5.0.0 metadata schema changes overview
Primary goals of changes
  1. Move to process-based schema for handling transitions between core biomaterial and file entities
  2. Move to a module-based schema to support independent versioning and user-/domain-specific metadata fields
  3. Move to a more flexible, reusable metadata structure
Proposed organisational structure

Suggested directory structure of schemas:
/
biomaterial/biomaterial_core.json   
file/file_core.json
process/process_core.json
project/project_core.json
protocol/protocol_core.json

/
process/    
    analysis/   analysis_process.json
    biomaterial_collection/ enrichment_process.json
                            collection_process.json
                            dissociation_process.json
    imaging/    imaging_process.json
    sequencing/ library_preparation_process.json
                sequencing_process.json
protocol/  
    analysis/    analysis_protocol.json
    biomaterial/ biomaterial_collection_protocol.json
    imaging/     imaging_protocol.json
    sequencing/  sequencing_protocol.json

biomaterial/
    cell_line.json
    cell_suspension.json
    organism.json
    organoid.json
    specimen_from_organism.json
file/       
    sequence_file.json
project/    
    project.json

le/
biomaterial/
    death.json
    ...
ontology/
    body_part_ontology.json
    ...
process/
    sequencing/
        barcode.json
        ...
    imaging/
        ...
project/
    contact.json
    publication.json
    ...
Specifying version info

Each schema should be self-describing: its id field holds a URL pointing to the location of the current version of the document.

Version indicated in schema URL: https://schema.humancellatlas.org/core/biomaterial/5.0.0/biomaterial_core
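
As a rough illustration of how the version travels inside the URL, the sketch below splits a schema URL into its parts. The helper name parse_schema_url and the assumption that the path always has exactly four segments (category, entity, version, schema name) are hypothetical, inferred only from the example URL above and not taken from any HCA tooling.

```python
from urllib.parse import urlparse


def parse_schema_url(url):
    """Split a schema URL into its parts (hypothetical helper).

    Assumes the path looks like /<category>/<entity>/<version>/<name>,
    as in the example URL; other layouts raise ValueError.
    """
    segments = urlparse(url).path.strip("/").split("/")
    if len(segments) != 4:
        raise ValueError(f"unexpected schema URL layout: {url}")
    category, entity, version, name = segments
    # int() raises ValueError if a version component is not numeric,
    # and the unpacking raises ValueError if there are not three components.
    major, minor, patch = (int(part) for part in version.split("."))
    return {
        "category": category,
        "entity": entity,
        "version": (major, minor, patch),
        "name": name,
    }


info = parse_schema_url(
    "https://schema.humancellatlas.org/core/biomaterial/5.0.0/biomaterial_core"
)
print(info["version"])  # (5, 0, 0)
```

Returning the version as a tuple of integers makes comparisons such as (5, 0, 0) > (4, 6, 1) behave numerically rather than as string comparisons.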

Because instance data must also be self-describing, all types will require a property called $schema.

For example, in the donor_organism.json schema these fields look like:

"$schema": "http://json-schema.org/draft-04/schema#",
"id": "https://schema.humancellatlas.org/type/biomaterial/4.0.0/donor_organism",
"additionalProperties": false,
"properties": {
    "describedBy": {
        "description": "The URL reference to the schema.",
        "type": "string",
        "pattern": "https://schema.humancellatlas.org/type/biomaterial/[0-9]{1,}.[0-9]{1,}.[0-9]{1,}/donor_organism"
    },
    ...
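
To make the role of the describedBy pattern concrete, here is a minimal sketch that checks an instance document against it using only the Python standard library. The function name described_by_matches and the sample instances are hypothetical. The sketch also assumes that Python's re module interprets this particular pattern the same way a JSON Schema validator would, which should hold for the simple syntax used here.

```python
import re

# Pattern copied verbatim from the describedBy property above.
DESCRIBED_BY_PATTERN = (
    "https://schema.humancellatlas.org/type/biomaterial/"
    "[0-9]{1,}.[0-9]{1,}.[0-9]{1,}/donor_organism"
)


def described_by_matches(instance, pattern=DESCRIBED_BY_PATTERN):
    """Return True if instance["describedBy"] is a string matching pattern.

    JSON Schema's "pattern" keyword is not implicitly anchored, so
    re.search is used here rather than re.fullmatch.
    """
    value = instance.get("describedBy")
    return isinstance(value, str) and re.search(pattern, value) is not None


# Hypothetical instance documents, for illustration only.
good = {
    "describedBy": "https://schema.humancellatlas.org/type/biomaterial/4.0.0/donor_organism"
}
bad = {
    "describedBy": "https://schema.humancellatlas.org/type/biomaterial/latest/donor_organism"
}
print(described_by_matches(good))
print(described_by_matches(bad))
```

One detail worth noticing: the dots in the pattern are unescaped, so in regular-expression terms they match any single character rather than only a literal ".". A stricter variant would escape them as "\\.", but that would be a change to the schema rather than something the text above specifies.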


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.