EMBL-EBI-TSI/TESK

Name: TESK

Owner: EMBL-EBI Technology & Science Integration

Description: TES on Kubernetes

Created: 2017-09-05 09:55:18.0

Updated: 2018-05-22 08:30:22.0

Pushed: 2018-05-22 08:30:21.0

Homepage: null

Size: 589

Language: Python

README

An implementation of a task execution engine based on the TES standard running on Kubernetes. For more details on TES, see the (very) brief introduction to TES.

For organisational reasons, this project is split into 3 repositories:

If the API is running on your cluster it will pull the images from our gcr.io repository automatically.

TESK is designed to support any Kubernetes cluster. To deploy it, please refer to the deployment page; the instructions provided there can be used in heterogeneous environments with minimal configuration.

We are also providing some specific instructions for setting up and exposing the TESK service using:

The technical documentation is available in the documentation folder.

Architecture

As a diagram:

[Figure: TESK architecture]

Description: The first pod in the task lifecycle is the API pod, which runs a web server (Tomcat) and exposes the TES-specified endpoints. It consumes TES requests, validates them and translates them into Kubernetes jobs. The API pod then creates a task controller pod, or taskmaster.
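To make the translation step more concrete, here is a minimal sketch of how one TES executor could be mapped to a Kubernetes batch/v1 Job manifest. It is not TESK's actual code: the helper name executor_to_job, the naming scheme, the task-id label and the resource mapping are all assumptions made for illustration. Only the TES field names (executors, image, command, workdir, env, cpu_cores, ram_gb) and the overall Job layout come from the respective specifications.

```python
def executor_to_job(task_id, index, executor, resources=None):
    """Build a Kubernetes batch/v1 Job manifest (a plain dict) for one TES executor.

    Hypothetical helper for illustration only; not TESK's real translation code.
    """
    container = {
        "name": f"{task_id}-ex-{index:02d}",  # assumed naming scheme
        "image": executor["image"],
        "command": list(executor["command"]),
    }
    if "workdir" in executor:
        container["workingDir"] = executor["workdir"]
    if executor.get("env"):
        container["env"] = [
            {"name": key, "value": value}
            for key, value in sorted(executor["env"].items())
        ]
    if resources:
        # Assumed mapping: TES cpu_cores -> cpu request, ram_gb -> memory request.
        requests = {}
        if "cpu_cores" in resources:
            requests["cpu"] = str(resources["cpu_cores"])
        if "ram_gb" in resources:
            requests["memory"] = f"{resources['ram_gb']}G"
        if requests:
            container["resources"] = {"requests": requests}
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": container["name"], "labels": {"task-id": task_id}},
        "spec": {
            "template": {
                "spec": {"containers": [container], "restartPolicy": "Never"}
            }
        },
    }


# A small TES task with two executors (example data, not from the project).
tes_task = {
    "name": "demo",
    "resources": {"cpu_cores": 1, "ram_gb": 2},
    "executors": [
        {"image": "alpine", "command": ["echo", "hello"]},
        {
            "image": "alpine",
            "command": ["sh", "-c", "wc -l /data/in.txt"],
            "workdir": "/data",
            "env": {"LANG": "C"},
        },
    ],
}

jobs = [
    executor_to_job("task-1234", i, ex, tes_task.get("resources"))
    for i, ex in enumerate(tes_task["executors"])
]
```

The sketch builds plain dictionaries rather than calling a Kubernetes client, so it can be read and run without a cluster.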

The taskmaster consumes the executor jobs, inputs and outputs. It first creates a filer pod, which creates a persistent volume claim (PVC) to mount as scratch space. All mounts are initialized and all files are downloaded into the locations specified in the TES request; the populated PVC can then be used by each executor pod in turn. After the filer has finished, the taskmaster goes through the executors and runs them as pods, one by one. Note: Each TES task has its own taskmaster, PVC and executor pods; the only 'singleton' pod across tasks is the API pod.

After the last executor, the filer is called once more to process the outputs and push them from the PVC to remote locations. The PVC is then scrubbed and deleted, and the taskmaster ends, completing the task.
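The lifecycle described above can be sketched as a short control loop. This is a simulation under stated assumptions, not the taskmaster's real implementation: run_task and the callbacks run_pod, create_pvc and delete_pvc are hypothetical names, PVC creation is reduced to a callback rather than modelled inside the filer, and stopping at the first failed executor is an assumption. COMPLETE and EXECUTOR_ERROR are task states defined by the TES standard.

```python
def run_task(task, run_pod, create_pvc, delete_pvc):
    """Simulate one task: input filer -> executors in order -> output filer -> cleanup.

    All callbacks are hypothetical stand-ins for calls to the Kubernetes API.
    """
    pvc = create_pvc(task["id"])  # scratch space shared by every pod of this task
    try:
        # The filer downloads the inputs into the PVC before any executor starts.
        run_pod("filer-inputs", pvc, task.get("inputs", []))

        # Executors run strictly one after another on the same PVC.
        for index, executor in enumerate(task["executors"]):
            succeeded = run_pod(f"executor-{index}", pvc, executor)
            if not succeeded:
                # Assumption: stop at the first failing executor.
                return "EXECUTOR_ERROR"

        # The filer runs once more to push the outputs to remote locations.
        run_pod("filer-outputs", pvc, task.get("outputs", []))
        return "COMPLETE"
    finally:
        # The PVC is removed whether or not the task succeeded.
        delete_pvc(pvc)


# Fake callbacks that only record the order of operations.
events = []


def fake_create_pvc(task_id):
    events.append("create-pvc")
    return f"pvc-{task_id}"


def fake_delete_pvc(pvc):
    events.append("delete-pvc")


def fake_run_pod(name, pvc, payload):
    events.append(name)
    return True  # pretend every pod succeeds


demo_task = {
    "id": "task-1234",
    "inputs": [{"url": "https://example.org/in.txt", "path": "/data/in.txt"}],
    "executors": [
        {"image": "alpine", "command": ["cat", "/data/in.txt"]},
        {"image": "alpine", "command": ["wc", "-l", "/data/in.txt"]},
    ],
    "outputs": [],
}

state = run_task(demo_task, fake_run_pod, fake_create_pvc, fake_delete_pvc)
```

Recording events in a list makes the ordering guarantee easy to check: the input filer always precedes the executors, and the PVC is deleted last.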

Requirements

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.