Duke-GCB/DukeDSClient

Name: DukeDSClient

Owner: Duke Center for Genomic and Computational Biology

Description: Command line tool to upload a folder into a project on the duke-data-service

Created: 2016-02-18 13:31:54.0

Updated: 2016-10-25 18:16:23.0

Pushed: 2018-01-12 21:11:41.0

Homepage:

Size: 2224

Language: Python

README

DukeDSClient

Command line tool to upload and manage projects on the duke-data-service.

Requirements

The preferred Python versions are 2.7.9+ or 3.4.1+, as they have functional ssl modules by default. Older Python 2.7 releases may work by following this guide: Older-python-2.7-setup

Install or Upgrade:

pip install --upgrade DukeDSClient
Config file setup.

DukeDSClient requires a config file containing an agent_key and a user_key. DukeDSClient supports a global configuration file at /etc/ddsclient.conf and a user configuration file at ~/.ddsclient. Settings in the user configuration file override those in the global configuration. Details of all configuration options: Configuration options.
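The override rule can be illustrated with a minimal sketch. It assumes both files have already been parsed into dictionaries; `merge_settings` and the key values are hypothetical names for illustration, not part of DukeDSClient.

```python
def merge_settings(global_settings, user_settings):
    """Return one dict in which user settings win over global ones."""
    merged = dict(global_settings)  # start from the global values
    merged.update(user_settings)    # user values replace matching keys
    return merged


# Hypothetical values, as if already read from /etc/ddsclient.conf and ~/.ddsclient.
global_settings = {"agent_key": "GLOBAL_AGENT", "upload_workers": 2}
user_settings = {"user_key": "MY_USER_KEY", "upload_workers": 4}

settings = merge_settings(global_settings, user_settings)
print(settings["upload_workers"])  # prints 4: the user file overrides the global file
```

Keys that appear only in the global file, such as `agent_key` here, are kept unchanged.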

Follow these instructions to set up your user_key and agent_key:

Instructions for adding agent and user keys to the user config file.

Use:

See general help screen:

ddsclient -h

See help screen for a particular command:

ddsclient <command> -h

All commands take the form:

ddsclient <command> <arguments...>
Upload:
ddsclient upload -p <ProjectName> <Folders/Files...>

This will create a project named ProjectName in the duke data service for your user if one doesn't exist. It will then upload the Folders and their contents to that project. Any items that already exist with the same hash will not be uploaded.
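The hash check can be sketched as follows. This is an illustration only: the use of MD5 and the names `file_md5` and `needs_upload` are assumptions, since the text does not say which hash or which functions the tool uses.

```python
import hashlib


def file_md5(path, block_size=1024 * 1024):
    """Hash a file block by block so large files never sit fully in memory."""
    digest = hashlib.md5()  # assumption: MD5; the text only says "hash"
    with open(path, "rb") as infile:
        for block in iter(lambda: infile.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def needs_upload(local_path, remote_hash):
    """Return True when the remote copy is missing or differs from the local file."""
    if remote_hash is None:  # no item with this name in the project yet
        return True
    return file_md5(local_path) != remote_hash
```

A file whose local hash matches the stored one is skipped, which is what makes repeated uploads of the same folder cheap.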

Example: Upload a folder named 'results' to a new or existing project named 'Analyzed Mouse RNA':

ddsclient upload -p 'Analyzed Mouse RNA' results
Download:
ddsclient download -p <ProjectName> [Folder]

This will download the contents of ProjectName into the specified folder. Currently the folder must either be empty or not exist; it will be created if it doesn't exist. If Folder is not specified, the name of the project is used with spaces translated to '_'.
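The folder rules above can be sketched in a few lines. The function names `default_folder_name` and `can_download_into` are hypothetical and only mirror the behavior described in the text.

```python
import os


def default_folder_name(project_name):
    """Project name with spaces translated to '_'."""
    return project_name.replace(" ", "_")


def can_download_into(folder):
    """True if the folder does not exist yet, or exists as an empty directory."""
    if not os.path.exists(folder):
        return True
    return os.path.isdir(folder) and not os.listdir(folder)


print(default_folder_name("Mouse RNA"))  # prints Mouse_RNA
```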

Example: Download the contents of the project named 'Mouse RNA' into '/tmp/mouserna':

ddsclient download -p 'Mouse RNA' /tmp/mouserna
Add User To Project:
Using Duke NetID:
ddsclient add_user -p <ProjectName> --user <Username> --auth_role 'project_admin'

Example: Grant permission to user with username 'jpb123' for a project named 'Analyzed Mouse RNA' with default permissions:

ddsclient add_user -p 'Analyzed Mouse RNA' --user 'jpb123'
Using email:
ddsclient add_user -p <ProjectName> --email <Email> --auth_role 'project_admin'

Example: Grant permission to user with email 'ada.lovelace@duke.edu' for a project named 'Analyzed Mouse RNA' with default permissions:

ddsclient add_user -p 'Analyzed Mouse RNA' --email 'ada.lovelace@duke.edu'
Developer:

Install dependencies:

pip install -r devRequirements.txt

Set up the pre-commit hook:

ln pre-commit.sh .git/hooks/pre-commit

Run linter/style checker:

flake8 --ignore E501 ddsc/

Run the tests:

python setup.py test
Data Service Web Portal:

Duke Data Service Portal. This also requires a Duke NetID.

Upload Settings

The default upload settings use one worker per CPU and upload 100MB chunks. You can change this via the upload_bytes_per_chunk and upload_workers config file options. These options should be added to your ~/.ddsclient config file. upload_workers should be an integer giving the number of upload workers you want. upload_bytes_per_chunk is the size of the chunks to upload; specify it with an MB extension.

Example config file setup to use 4 workers and 200MB chunks:

upload_workers: 4
upload_bytes_per_chunk: 200MB
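To make the effect of these two options concrete, here is a rough sketch of how a value such as 200MB might translate into a chunk count. The helper names are hypothetical, and treating 1MB as 1024 * 1024 bytes is an assumption; the tool itself may count differently.

```python
import multiprocessing


def parse_chunk_size(value):
    """Convert a string such as '200MB' into a byte count."""
    text = value.strip().upper()
    if not text.endswith("MB"):
        raise ValueError("expected a size with an MB extension, got %r" % value)
    # Assumption: 1MB is treated as 1024 * 1024 bytes.
    return int(text[:-2]) * 1024 * 1024


def chunk_count(file_size, bytes_per_chunk):
    """Number of chunks needed to cover file_size bytes (integer ceiling division)."""
    return (file_size + bytes_per_chunk - 1) // bytes_per_chunk


def default_workers():
    """One worker per CPU, mirroring the default described above."""
    return multiprocessing.cpu_count()


chunk_bytes = parse_chunk_size("200MB")
print(chunk_count(500 * 1024 * 1024, chunk_bytes))  # prints 3: a 500MB file needs three 200MB chunks
```

Larger chunks mean fewer requests per file, while more workers allow more chunks to be in flight at once.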
Alternate Service:

The default url is https://api.dataservice.duke.edu/api/v1. You can customize this via the url config file option. Example config file setup to use the uatest server:

url: https://apiuatest.dataservice.duke.edu/api/v1

You can also specify an alternate url for use with ddsclient via the DUKE_DATA_SERVICE_URL environment variable. Here is how to set the environment variable so ddsclient will connect to the 'dev' url:

export DUKE_DATA_SERVICE_URL='https://apidev.dataservice.duke.edu/api/v1'

This will require using the associated portal to get valid keys.

You will need to specify an agent_key and user_key in the config file appropriate for the particular service.
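The three sources of the service url (environment variable, config file option, built-in default) could be combined as in the sketch below. The precedence shown, with the environment variable winning over the config file, is an assumption, as the text does not state the order; `resolve_url` is a hypothetical name.

```python
import os

DEFAULT_URL = "https://api.dataservice.duke.edu/api/v1"
URL_ENV_VAR = "DUKE_DATA_SERVICE_URL"


def resolve_url(config, environ=None):
    """Pick the service url: environment variable, then config file, then default."""
    if environ is None:
        environ = os.environ
    # Assumption: the environment variable takes priority over the config file.
    return environ.get(URL_ENV_VAR) or config.get("url") or DEFAULT_URL


# With no override anywhere, the default url is used.
print(resolve_url({}, environ={}))
```

Whichever url is selected, the agent_key and user_key in the config file must belong to that same service.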


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.