FredHutch/ls2_tools

Name: ls2_tools

Owner: Fred Hutchinson Cancer Research Center

Description: BioInformatics Tools that do not require R or Python

Created: 2018-01-26 17:56:27.0

Updated: 2018-01-26 19:40:06.0

Pushed: 2018-01-26 19:40:04.0

Size: 12

Language: Shell

README

ls2

Life Sciences Software

Overview

The Life Sciences Software (LS2) project aims to normalize the build of software packages across multiple technologies.

Components

LS2 is a collection of open source components:

LS2 Architecture

This is the hierarchy of LS2 containers:

Name/Repo | FROM | Reason | Notes
--- | --- | --- | ---
https://github.com/FredHutch/ls2_ubuntu | ubuntu | simple 'freeze' of the public ubuntu container | OS pkgs added: bash, curl, git
https://github.com/FredHutch/ls2_easybuild | ls2_ubuntu | Adding EasyBuild and Lmod | OS pkgs added: python, lua
https://github.com/FredHutch/ls2_easybuild_foss | ls2_easybuild | Adding the 'foss' toolchain | OS pkgs added: libibverbs-dev, libc6-dev, bzip2, unzip, make, xz-utils
https://github.com/FredHutch/ls2 | ls2_easybuild_foss | This 'demo' repo | does not produce a container directly
https://github.com/FredHutch/ls2_r | ls2_easybuild_foss | Our 'R' build | OS pkgs added: awscli

Tags

In general, tagging goes: fredhutch/ls2_<package name>:<package version>[_<date>]

Ex: fredhutch/ls2_r:3.4.3 or fredhutch/ls2_ubuntu:16.04_20180118

Package versions should generally match the upstream released version; use the optional date suffix for private sub-versions.

Git tags and container tags should match.
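
The tagging scheme above can be sketched as a small shell function. This is only an illustration: `ls2_tag` is a hypothetical helper name, not something shipped in the LS2 repos, and it assumes the date suffix is simply appended after an underscore as in the examples.

```shell
# Compose an LS2-style container tag: fredhutch/ls2_<package name>:<package version>[_<date>]
# ls2_tag is a hypothetical helper for illustration, not part of the LS2 repos.
ls2_tag() {
    name=${1:-}
    version=${2:-}
    date_suffix=${3:-}
    if [ -z "$name" ] || [ -z "$version" ]; then
        echo "usage: ls2_tag <package name> <package version> [date]" >&2
        return 1
    fi
    tag="fredhutch/ls2_${name}:${version}"
    # The date part is optional and marks a private sub-version.
    if [ -n "$date_suffix" ]; then
        tag="${tag}_${date_suffix}"
    fi
    printf '%s\n' "$tag"
}

ls2_tag r 3.4.3                # prints fredhutch/ls2_r:3.4.3
ls2_tag ubuntu 16.04 20180118  # prints fredhutch/ls2_ubuntu:16.04_20180118
```

Since Git tags and container tags should match, the part after the colon is presumably what you would also use as the Git tag.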

Container Architecture

Use Cases

Create a container

The initial reason for LS2 is to create Docker containers with EasyBuilt software packages to mirror those available on our HPC systems. We realize that containerizing common software packages will be key in leveraging many new technologies like AWS Batch.

The intention is to use multiple LS2 containers in step to achieve a pipeline. Having the same software packages compiled in the same ways as we have deployed to our traditional HPC cluster enables users to focus on pipeline building and not software troubleshooting when moving to different compute methods.

EasyBuild testing

Building or updating an EasyConfig can be time-consuming. Many existing technologies help automate the Docker build process, and LS2 opens these up to EasyBuild. Even if you run EasyBuild traditionally to deploy built packages to a private directory or shared software archive, you can test those EasyConfigs in an LS2 build to ensure your production build will be successful.

Because we build in a container with minimal installed packages, it is easy to find OS dependencies that are unstated in EasyConfigs. Examples range from pkg-config (a default package in CentOS but not Ubuntu) to utilities like unzip and bzip2. Some dependencies are intentional, like OpenSSL (better to pull presumably-updated OS packages than possibly stale EasyBuild packages), and some are oversights that are easy to miss when you are building in a fully-installed OS (e.g., make is not present in the foss-n toolchains).
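
One rough way to spot such unstated dependencies before a build is to check which commands are missing from the build environment. The sketch below is an assumption about how you might do that, not part of LS2: `missing_tools` is a hypothetical helper, and the tool list simply repeats the examples named above.

```shell
# Print each named command that is not found on PATH, one per line.
# missing_tools is a hypothetical helper for illustration, not part of LS2.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Tools the text names as commonly unstated OS dependencies.
missing_tools pkg-config unzip bzip2 make
```

Run inside a minimal container, any names printed are candidates for adding as OS packages or as explicit dependencies in the EasyConfig. Libraries such as OpenSSL are not commands, so they would need a different check.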

Manage a traditional archive

We use EasyBuild to install software packages onto an NFS volume. This volume is then shared to our HPC and other systems to enable software package use on those platforms. LS2 can still be used to deploy packages in this way by mounting the NFS volume into the container and performing a build. This process isolates the EasyConfig development process from your live package archive or volume.

HOWTO

There are two sections here. The first covers building an existing or new EasyConfig, and the second covers using a built container to deploy a software package to an existing software archive or volume.

First: Copy and Edit

The steps to build a new LS2 container are fairly straightforward, but they assume some knowledge of EasyBuild, Lmod, and Docker.

Copy this repo per these instructions:

  1. create a new repo on GitHub and do not pre-populate it with a README.md - this should get you the 'Quick setup' page
  2. git clone --bare https://github.com/FredHutch/ls2.git (or git clone --bare ssh://git@github.com/FredHutch/ls2.git)
  3. cd ls2.git
  4. edit README.md
  5. git push --mirror https://github.com/<new repo URL.git>
  6. cd ..
  7. rm -rf ls2.git
  8. git clone <new_repo>
  9. cd <new_repo>
  10. git submodule init
  11. git submodule update --remote
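
The command-line steps above can be collected into a dry-run plan that prints the commands for a given new repo so you can review them before running anything. This is a sketch under assumptions: `ls2_copy_plan` is a hypothetical helper, the example URL and directory are placeholders, and creating the GitHub repo and editing README.md are left as manual steps.

```shell
# Print the copy steps for a new repo (dry run: nothing is executed).
# ls2_copy_plan and its two arguments are hypothetical names for illustration.
ls2_copy_plan() {
    new_url=${1:-}
    new_dir=${2:-}
    if [ -z "$new_url" ] || [ -z "$new_dir" ]; then
        echo "usage: ls2_copy_plan <new repo URL> <new repo directory>" >&2
        return 1
    fi
    cat <<EOF
git clone --bare https://github.com/FredHutch/ls2.git
cd ls2.git
git push --mirror $new_url
cd ..
rm -rf ls2.git
git clone $new_url
cd $new_dir
git submodule init
git submodule update --remote
EOF
}

# Placeholder URL and directory; substitute your own repo.
ls2_copy_plan https://github.com/example/my_ls2.git my_ls2
```

Printing the plan rather than executing it keeps the sketch safe to run anywhere; once the output looks right, the lines can be pasted into a shell one at a time.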

At this point, there are two options:

Second: Add to /app (FYI - Work In Progress 01.24.2018)

We keep our deployed software packages on an NFS volume that we mount at /app on our systems (can you guess why LS2 builds into /app rather than .local in the container?). To use your recently built LS2 software package container to deploy the same package into our /app NFS volume, use these steps:

  1. Complete 'Copy and Edit' steps to produce a successful container with your software package
  2. Run docker build . -f Dockerfile.deploy -t <tag>_fh_deploy once again - this will run quickly and build a second container
  3. Run that container with our package deploy location mapped in to /app like this: docker run -v /app:/app ls2_r_fh_deploy

The steps above will produce a container with EasyBuild and all the pieces necessary, with the actual EasyBuild command set as the entrypoint. Running the container will trigger the EasyBuild run, and the resulting output will be placed into the /app volume outside the container.
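
As a sketch of the build-and-run sequence, the function below prints the two Docker commands for a given base tag without executing them. `ls2_deploy_plan` is a hypothetical helper, and `ls2_r` is used only because it is the example in this section. Note that `docker run` expects options such as `-v` before the image name; anything after the image name is passed to the container's entrypoint instead.

```shell
# Print the two deploy commands for a base tag (dry run: nothing is executed).
# ls2_deploy_plan is a hypothetical helper; /app is the mount point described in the text.
ls2_deploy_plan() {
    tag=${1:-}
    if [ -z "$tag" ]; then
        echo "usage: ls2_deploy_plan <tag>" >&2
        return 1
    fi
    deploy_tag="${tag}_fh_deploy"
    # Build the second (deploy) container from Dockerfile.deploy.
    printf 'docker build . -f Dockerfile.deploy -t %s\n' "$deploy_tag"
    # Options such as -v must come before the image name.
    printf 'docker run -v /app:/app %s\n' "$deploy_tag"
}

ls2_deploy_plan ls2_r
# prints:
#   docker build . -f Dockerfile.deploy -t ls2_r_fh_deploy
#   docker run -v /app:/app ls2_r_fh_deploy
```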

Note that this overrides the Lmod in the container, so if version parity is important to you, you'll want to keep your Lmod in sync with the LS2 Lmod.


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.