puckel/docker-airflow

Name: docker-airflow

Description: Docker Apache Airflow

Created: 2015-06-10 09:19:44

Updated: 2018-01-19 09:19:59

Pushed: 2018-01-17 22:14:46

Homepage:

Size: 122

Language: Shell


README

docker-airflow


This repository contains the Dockerfile of apache-airflow for Docker's automated build, published to the public Docker Hub Registry.

Information

/!\ If you want to use Airflow with Python 2, use the 1.8.1 tag.

Installation

Pull the image from the Docker repository.

    docker pull puckel/docker-airflow

Build

For example, if you need to install extra packages, edit the Dockerfile and then rebuild the image.
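
As an illustration (the package list below is hypothetical, not taken from the repository), an extra-packages edit could append a pip install step to the Dockerfile:

    # Hypothetical Dockerfile addition: install extra Airflow subpackages
    RUN pip install "apache-airflow[postgres,celery,crypto]"

Then rebuild: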

    docker build --rm -t puckel/docker-airflow .

Usage

By default, docker-airflow runs Airflow with SequentialExecutor:

    docker run -d -p 8080:8080 puckel/docker-airflow

If you want to run another executor, use the other docker-compose.yml files provided in this repository.

For LocalExecutor:

    docker-compose -f docker-compose-LocalExecutor.yml up -d

For CeleryExecutor:

    docker-compose -f docker-compose-CeleryExecutor.yml up -d

NB: If you don't want the example DAGs loaded (default=True), you have to set the following environment variable:

LOAD_EX=n

    docker run -d -p 8080:8080 -e LOAD_EX=n puckel/docker-airflow

If you want to use Ad hoc query, make sure you've configured connections: go to Admin -> Connections, edit “postgres_default”, and set the values below (equivalent to the values in airflow.cfg/docker-compose*.yml).
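
Assuming the defaults from the repository's docker-compose files (an assumption; verify against your own configuration):

    Host: postgres
    Schema: airflow
    Login: airflow
    Password: airflow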

For encrypted connection passwords (with LocalExecutor or CeleryExecutor), you must use the same fernet_key everywhere. By default docker-airflow generates a fernet_key at startup, so you have to set an environment variable in the docker-compose file (e.g. docker-compose-LocalExecutor.yml) to use the same key across containers. To generate a fernet_key:

    python -c "from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print FERNET_KEY"

Configuring Airflow

It is possible to set any Airflow configuration value from environment variables, which take precedence over values from airflow.cfg. The general rule is that the environment variable should be named AIRFLOW__<section>__<key>; for example, AIRFLOW__CORE__SQL_ALCHEMY_CONN sets the sql_alchemy_conn config option in the [core] section.
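
For instance, overriding the [core] parallelism option at run time might look like this (the value 16 is arbitrary, chosen for illustration):

    docker run -d -p 8080:8080 -e AIRFLOW__CORE__PARALLELISM=16 puckel/docker-airflow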

Check out the Airflow documentation for more details.

You can also define connections via environment variables by prefixing them with AIRFLOW_CONN_ - for example, AIRFLOW_CONN_POSTGRES_MASTER=postgres://user:password@localhost:5432/master for a connection called “postgres_master”. The value is parsed as a URI. This works for hooks etc., but the connection won't show up in the “Ad-hoc Query” section unless an (empty) connection with the same name is also created in the DB.
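
As a sketch using the example connection above (host and credentials are placeholders):

    docker run -d -p 8080:8080 -e AIRFLOW_CONN_POSTGRES_MASTER=postgres://user:password@localhost:5432/master puckel/docker-airflow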

Custom Airflow plugins

Airflow allows for custom user-created plugins, which are typically placed in the ${AIRFLOW_HOME}/plugins folder. See the Airflow documentation for details on plugins.

To incorporate plugins into your Docker container, mount your plugins folder into the container, as sketched below.
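
A minimal sketch, assuming AIRFLOW_HOME inside the image is /usr/local/airflow (the container path is an assumption about this image):

    docker run -d -p 8080:8080 -v $(pwd)/plugins:/usr/local/airflow/plugins puckel/docker-airflow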

Install custom python package
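
A hedged sketch, assuming the image's entrypoint pip-installs a /requirements.txt file mounted into the container (an assumption about entrypoint.sh): create a requirements.txt listing the desired Python modules and mount it at startup:

    docker run -d -p 8080:8080 -v $(pwd)/requirements.txt:/requirements.txt puckel/docker-airflow
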
UI Links
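
With the port mapping used above, the Airflow web UI is reachable at localhost:8080. In the CeleryExecutor setup, Flower (Celery's monitoring UI) is conventionally exposed at localhost:5555, though that port mapping is an assumption about the compose file.
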
Scale the number of workers

Easy scaling using docker-compose:

    docker-compose scale worker=5

This can be used to scale to a multi-node setup using Docker Swarm.

Running other airflow commands

If you want to run other airflow sub-commands, such as list_dags or clear, you can do so like this:

    docker run --rm -ti puckel/docker-airflow airflow list_dags

or with your docker-compose set up like this:

    docker-compose -f docker-compose-CeleryExecutor.yml run --rm webserver airflow list_dags

You can also use this to run a bash shell or any other command in the same environment in which airflow runs:

    docker run --rm -ti puckel/docker-airflow bash
    docker run --rm -ti puckel/docker-airflow ipython

Wanna help?

Fork, improve and PR. ;-)

