DataBiosphere/commons-sample-data

Name: commons-sample-data

Owner: Data Biosphere

Description: A repo to track various TOPMed and other datasets

Created: 2018-01-19 06:03:30.0

Updated: 2018-02-07 19:47:31.0

Pushed: 2018-02-01 00:59:57.0

Homepage: null

Size: 34

Language: Python

README

commons-sample-data

A repo to track various TOPMed and other datasets.

TOPMed Open Access

This dataset (~100 WGS samples) was provided by Goncalo's team (Jonathon LeFaive). The original manifest can be found here:

gs://topmed-irc-share/public/TOPMed.public_samples.manifest.2017.11.30.txt

I then replicated this data to public buckets on AWS and GCP to make it easier to share with collaborators for testing.

AWS

See TOPMed.aws.public_samples.manifest.2017.11.30.txt for the locations of the CRAM and index files on AWS.

GCP

See TOPMed.gcp.public_samples.manifest.2017.11.30.txt for the locations of the CRAM and index files on GCP.
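
As a rough illustration of how the manifests might be consumed, the sketch below pulls CRAM and index locations out of manifest text. The column layout of the manifests is not described here, so the code does not assume one: it simply scans every whitespace-separated field for `s3://` or `gs://` URLs ending in `.cram` or `.crai`. The function name `extract_cram_locations`, the bucket name, and the sample rows are all hypothetical and exist only for this example.

```python
def extract_cram_locations(manifest_text):
    """Return (cram_urls, index_urls) found anywhere in the manifest text.

    Assumption: locations appear as whitespace-separated s3:// or gs:// URLs,
    with CRAM files ending in .cram and index files ending in .crai.
    """
    cram_urls, index_urls = [], []
    for line in manifest_text.splitlines():
        for token in line.split():
            # Skip fields that are not cloud object URLs (sample IDs, headers, ...).
            if not token.startswith(("s3://", "gs://")):
                continue
            if token.endswith(".cram"):
                cram_urls.append(token)
            elif token.endswith(".crai"):
                index_urls.append(token)
    return cram_urls, index_urls


# Hypothetical two-row manifest, for illustration only (not real data).
sample = (
    "SAMPLE_A\tgs://example-bucket/SAMPLE_A.cram\tgs://example-bucket/SAMPLE_A.cram.crai\n"
    "SAMPLE_B\tgs://example-bucket/SAMPLE_B.cram\tgs://example-bucket/SAMPLE_B.cram.crai\n"
)
crams, indexes = extract_cram_locations(sample)
print(len(crams), "CRAM files,", len(indexes), "index files")
```

To use this on a real manifest, read the downloaded file into a string and pass it in; if the actual manifest has a header row with named columns, reading it with the `csv` module and selecting those columns directly would be a sturdier choice.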


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.