DataBiosphere/job-manager

Name: job-manager

Owner: Data Biosphere

Description: Job Manager API and UI for interacting with asynchronous batch jobs and workflows.

Created: 2017-08-30 18:51:22.0

Updated: 2018-01-10 20:25:04.0

Pushed: 2018-02-12 19:21:04.0

Homepage:

Size: 867

Language: Python

README

Job Manager

This product is in Alpha and not yet ready for production use. We welcome all feedback!

See the development guide below.

The Job Manager is an API and UI for monitoring and managing jobs in a backend execution engine.

The Broad, Verily, and many other organizations in the life sciences execute enormous numbers of scientific workflows and need to manage those operations. Job Manager was born out of the experiences of producing data for some of the world's largest sequencing projects, such as The Cancer Genome Atlas, Baseline, and the Thousand Genomes Project.

The Job Manager aspires to bring ease and efficiency to developing and debugging workflows while seamlessly scaling to production operations management.

Key Features
Future Features
Roadmap

The current code is a work in progress toward an alpha release and so far covers the core features: connecting to both backends (Cromwell and dsub), visualizing workflow and task status and metadata, quick access to log files, and simple filtering.

The near-term roadmap includes improvements to failure troubleshooting, creating a robust dashboard for grouping jobs and seeing status overviews, and improving handling of widely scattered workflows.

We envision a product with user-customizable views of jobs running, insights into workflow compute cost, the ability to re-launch jobs, and the potential to make custom reports about the jobs that have been run.

Architecture Overview

The Job Manager defines an API via OpenAPI. An Angular2 UI is built on the autogenerated TypeScript bindings for this API. The UI is configurable at compile time to support various deployment environments (see environment.ts), including auth, cloud projects, and label columns.

The UI must be deployed along with a backend implementation of the API; two such implementations are provided here:

Cromwell

Monitors jobs launched by the Cromwell workflow engine. The Python Flask wrapper was created using Swagger Codegen and can be configured to pull data from a specific Cromwell instance. The Job Manager currently supports Cromwell version 29.
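To make "configured to pull data from a specific Cromwell instance" more concrete, here is a minimal sketch of how a backend might derive Cromwell REST URLs from a configured base URL. The /api/workflows/v1/... paths follow Cromwell's public REST API; the CromwellEndpoints class and its method names are hypothetical and are not taken from this repository.

```python
from urllib.parse import quote, urlencode


class CromwellEndpoints:
    """Builds URLs for a single Cromwell instance (hypothetical helper)."""

    def __init__(self, base_url, api_version="v1"):
        # Strip trailing slashes so joined paths never contain "//".
        self.root = "{}/api/workflows/{}".format(base_url.rstrip("/"), api_version)

    def query(self, **filters):
        """URL for listing workflows, optionally filtered (e.g. status="Running")."""
        if not filters:
            return self.root + "/query"
        # Sort the filters so the resulting URL is deterministic.
        return self.root + "/query?" + urlencode(sorted(filters.items()))

    def metadata(self, workflow_id):
        """URL for the metadata of one workflow."""
        return "{}/{}/metadata".format(self.root, quote(workflow_id, safe=""))
```

Keeping the base URL in one place means that switching to another Cromwell instance is a single configuration change.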

dsub

Monitors jobs that were launched via the dsub CLI. The server is a thin, stateless wrapper around the dsub Python library. Deploying the UI requires authorization, which is used to communicate with the Google Genomics Pipelines API. The wrapper itself is implemented in Python Flask using Swagger Codegen models. A Dockerfile is provided for production deployment using gunicorn.

Note that a "task" in dsub nomenclature corresponds to a Job Manager API "job".
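The task-to-job correspondence can be illustrated with a small sketch. Everything here is an assumption for illustration: the input keys (job-id, task-id, job-name, status) are modeled loosely on dsub's status output, and the output dict is a hypothetical job shape, not the actual Job Manager API model.

```python
def task_to_job(task):
    """Convert one dsub task record into a Job Manager style job dict.

    Both the input keys and the output shape are illustrative assumptions.
    """
    job_id = task["job-id"]
    task_id = task.get("task-id")
    # A dsub job may fan out into many tasks; each task becomes its own job,
    # so the task id is folded into the identifier when present.
    identifier = "{}+{}".format(job_id, task_id) if task_id else job_id
    return {
        "id": identifier,
        "name": task.get("job-name", job_id),
        "status": task.get("status", "Unknown"),
    }
```

The point of the sketch is only the granularity: one dsub task yields exactly one job entry, even when several tasks share a dsub job id.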

Development
Prerequisite

The following commands assume you have symbolically linked your preferred local API backend docker compose file as docker-compose.yml, e.g.:

ln -sf dsub-local-compose.yml docker-compose.yml

Alternatively, use:

docker-compose -f dsub-google-compose.yml CMD
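For anyone scripting this setup, the two options above (a symbolic link named docker-compose.yml, or an explicit -f argument) might be sketched in Python as follows. The function names link_compose_file and compose_command are hypothetical helpers, not part of this repository.

```python
import os


def link_compose_file(backend_file, link_name="docker-compose.yml"):
    """Point link_name at backend_file, replacing any existing link."""
    # Remove a previous link (even a dangling one) before creating the new one.
    if os.path.lexists(link_name):
        os.remove(link_name)
    os.symlink(backend_file, link_name)


def compose_command(*args, compose_file=None):
    """Build a docker-compose argument list, optionally with an explicit -f file."""
    cmd = ["docker-compose"]
    if compose_file:
        cmd += ["-f", compose_file]
    return cmd + list(args)
```

The list returned by compose_command can be passed to subprocess.check_call when run from the root of the repository.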
Server Setup

For setting up development with dsub see servers/dsub.

For setting up development with cromwell see servers/cromwell.

Run Locally
  1. Run docker-compose up from the root of the repository.
  2. Navigate to http://localhost:4200.
Notes
  1. Websocket reload on code change does not work in docker-compose (see https://github.com/angular/angular-cli/issues/6349).
  2. Changes to package.json or requirements.txt require a rebuild with:
    docker-compose up --build
    
    Alternatively, rebuild a single component:
    docker-compose build ui
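The rebuild rule in note 2 can be expressed as a small check, for example inside a helper script that inspects a list of changed files. This is a sketch under the assumption that matching on the file name alone is sufficient; the needs_rebuild function and the example paths are hypothetical.

```python
import os

# Dependency manifests whose changes require rebuilding the images.
REBUILD_TRIGGERS = {"package.json", "requirements.txt"}


def needs_rebuild(changed_paths):
    """Return True if any changed file is a dependency manifest."""
    return any(os.path.basename(path) in REBUILD_TRIGGERS for path in changed_paths)
```

For instance, needs_rebuild(["ui/package.json"]) is true, while a change to an ordinary source file alone is not.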
    
Updating the API using swagger-codegen

We use swagger-codegen to generate code from the API definition in api/jobs.yaml for all servers and the UI. Whenever the API is updated, follow these steps to regenerate that code and update the server implementations:

  1. If you do not already have the jar, you can download it here:
    # Linux
    wget http://central.maven.org/maven2/io/swagger/swagger-codegen-cli/2.2.3/swagger-codegen-cli-2.2.3.jar -O swagger-codegen-cli.jar
    # macOS
    brew install swagger-codegen
    
  2. Clear out existing generated models:
    rm ui/src/app/shared/model/*
    rm servers/dsub/jobs/models/*
    rm servers/cromwell/jobs/models/*
    
  3. Regenerate both the Python and Angular definitions.
    java -jar swagger-codegen-cli.jar generate \
      -i api/jobs.yaml \
      -l typescript-angular2 \
      -o ui/src/app/shared
    java -jar swagger-codegen-cli.jar generate \
      -i api/jobs.yaml \
      -l python-flask \
      -o servers/dsub \
      -DsupportPython2=true,packageName=jobs
    java -jar swagger-codegen-cli.jar generate \
      -i api/jobs.yaml \
      -l python-flask \
      -o servers/cromwell \
      -DsupportPython2=true,packageName=jobs
    
  4. Update the server implementations to resolve any broken dependencies on old API definitions or implement additional functionality to match the new specs.
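Because the regeneration step repeats the same command for three targets, it may be convenient to script it. The following is a minimal sketch that only builds the argument lists. It assumes the jar sits in the repository root under the name swagger-codegen-cli.jar and uses swagger-codegen-cli's -i (input spec), -l (language), -o (output directory), and -D (properties) options. The codegen_commands function and the TARGETS table are hypothetical.

```python
JAR = "swagger-codegen-cli.jar"
SPEC = "api/jobs.yaml"

# (language, output directory, extra -D properties) for each generated target.
TARGETS = [
    ("typescript-angular2", "ui/src/app/shared", None),
    ("python-flask", "servers/dsub", "supportPython2=true,packageName=jobs"),
    ("python-flask", "servers/cromwell", "supportPython2=true,packageName=jobs"),
]


def codegen_commands(jar=JAR, spec=SPEC, targets=TARGETS):
    """Return one swagger-codegen argument list per target."""
    commands = []
    for lang, out_dir, props in targets:
        cmd = ["java", "-jar", jar, "generate", "-i", spec, "-l", lang, "-o", out_dir]
        if props:
            # Properties are passed as a single -Dname=value,... argument.
            cmd.append("-D" + props)
        commands.append(cmd)
    return commands
```

Each returned list can be passed to subprocess.check_call from the repository root. Building the commands separately from running them makes it easy to print and review them first.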
Job Manager UI Server

For UI server documentation, see ui.

Job Manager dsub Server

For dsub server documentation, see servers/dsub.

Job Manager cromwell Server

For cromwell server documentation, see servers/cromwell.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.