Name: batchspawner
Owner: JupyterHub
Description: Custom Spawner for Jupyterhub to start servers in batch scheduled systems
Created: 2015-10-27 03:53:10.0
Updated: 2018-05-22 14:28:53.0
Pushed: 2018-05-24 13:19:36.0
Size: 117
Language: Python
This is a custom spawner for Jupyterhub that is designed for installations on clusters using batch scheduling software.
This began as a generalization of mkgilbert's batchspawner, which in turn was inspired by [Andrea Zonca's blog post](http://zonca.github.io/2015/04/jupyterhub-hpc.html 'Run jupyterhub on a Supercomputer') where he explains his implementation for a spawner that uses SSH and Torque. His GitHub repo is found [here](http://www.github.com/zonca/remotespawner 'RemoteSpawner').
This package formerly included WrapSpawner and ProfilesSpawner, which provide mechanisms for runtime configuration of spawners. These have been split out and moved to the wrapspawner
package.
From the root directory of this repo (where `setup.py` is), run `pip install -e .`

If you don't actually need an editable version, you can simply run `pip install batchspawner`.

Then add lines in `jupyterhub_config.py` for the spawner you intend to use, e.g.

```python
c = get_config()
c.JupyterHub.spawner_class = 'batchspawner.TorqueSpawner'
```

Depending on the spawner, additional configuration will likely be needed.
This package contains an abstraction layer for batch job queueing systems (`BatchSpawnerBase`), and implements JupyterHub spawners for Torque, Moab, SLURM, SGE, HTCondor, LSF, and eventually others. Common attributes of batch submission / resource manager environments include notions of queues, resource limits, and job submission and monitoring via resource manager utilities.

`BatchSpawnerBase` provides several general mechanisms, in particular configurable traits `req_foo` that are exposed as `{foo}` in job template scripts. Templates (submit scripts in particular) may also use the full power of jinja2: templates are automatically treated as jinja2 if a `{{` or `{%` is present, otherwise `str.format()` is used (a jinja2 variant of the Torque script is sketched after the example below).

Every effort has been made to accommodate highly diverse systems through configuration only. The following example consists of the (lightly edited) configuration used by the author to run Jupyter notebooks on an academic supercomputer cluster.
```python
# Select the Torque backend and increase the timeout since batch jobs may take time to start
c.JupyterHub.spawner_class = 'batchspawner.TorqueSpawner'
c.Spawner.http_timeout = 120

#------------------------------------------------------------------------------
# BatchSpawnerBase configuration
#   These are simply setting parameters used in the job script template below
#------------------------------------------------------------------------------
c.BatchSpawnerBase.req_nprocs = '2'
c.BatchSpawnerBase.req_queue = 'mesabi'
c.BatchSpawnerBase.req_host = 'mesabi.xyz.edu'
c.BatchSpawnerBase.req_runtime = '12:00:00'
c.BatchSpawnerBase.req_memory = '4gb'

#------------------------------------------------------------------------------
# TorqueSpawner configuration
#   The script below is nearly identical to the default template, but we needed
#   to add a line for our local environment. For most sites the default templates
#   should be a good starting point.
#------------------------------------------------------------------------------
c.TorqueSpawner.batch_script = '''#!/bin/sh
#PBS -q {queue}@{host}
#PBS -l walltime={runtime}
#PBS -l nodes=1:ppn={nprocs}
#PBS -l mem={memory}
#PBS -N jupyterhub-singleuser
#PBS -v {keepvars}
module load python3
{cmd}
'''
# For our site we need to munge the execution hostname returned by qstat
c.TorqueSpawner.state_exechost_exp = r'int-\1.mesabi.xyz.edu'
```
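The same Torque script could also be written as a jinja2 template, since (as noted above) batchspawner treats a template as jinja2 whenever a `{{` or `{%` appears in it. The sketch below is illustrative rather than a default shipped with the package; it assumes the same substitution variables (`queue`, `host`, `runtime`, `nprocs`, `memory`, `keepvars`, `cmd`) are available to jinja2 rendering, and the conditional around the memory request is only there to show what jinja2 syntax allows.

```python
# Hedged sketch of a jinja2-flavoured batch_script (not the package default).
# Because '{%' appears below, batchspawner should render this with jinja2
# rather than str.format().
c.TorqueSpawner.batch_script = '''#!/bin/sh
#PBS -q {{ queue }}@{{ host }}
#PBS -l walltime={{ runtime }}
#PBS -l nodes=1:ppn={{ nprocs }}
{% if memory %}#PBS -l mem={{ memory }}{% endif %}
#PBS -N jupyterhub-singleuser
#PBS -v {{ keepvars }}
module load python3
{{ cmd }}
'''
```

Plain `str.format()` templates, as in the example above, remain the simpler choice when no conditional logic is needed.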
`ProfilesSpawner`, available as part of the wrapspawner package, allows the JupyterHub administrator to define a set of different spawning configurations, both different spawners and different configurations of the same spawner. The user is then presented with a dropdown menu for choosing the most suitable configuration for their needs. This method provides an easy and safe way to provide different configurations of `BatchSpawner` to the users; see the example below.

The following is based on the author's configuration (at the same site as the example above), showing how to give users access to multiple job configurations on the batch scheduled clusters, as well as an option to run a local notebook directly on the JupyterHub server.
```python
# Same initial setup as the previous example
c.JupyterHub.spawner_class = 'wrapspawner.ProfilesSpawner'
c.Spawner.http_timeout = 120

#------------------------------------------------------------------------------
# BatchSpawnerBase configuration
#   Providing default values that we may omit in the profiles
#------------------------------------------------------------------------------
c.BatchSpawnerBase.req_host = 'mesabi.xyz.edu'
c.BatchSpawnerBase.req_runtime = '12:00:00'
c.TorqueSpawner.state_exechost_exp = r'in-\1.mesabi.xyz.edu'

#------------------------------------------------------------------------------
# ProfilesSpawner configuration
#------------------------------------------------------------------------------
# List of profiles to offer for selection. Signature is:
#   List(Tuple( Unicode, Unicode, Type(Spawner), Dict ))
# corresponding to profile display name, unique key, Spawner class,
# dictionary of spawner config options.
#
# The first three values will be exposed in the input_template as {display},
# {key} and {type}
#
c.ProfilesSpawner.profiles = [
    ( "Local server", 'local', 'jupyterhub.spawner.LocalProcessSpawner', {'ip':'0.0.0.0'} ),
    ('Mesabi - 2 cores, 4 GB, 8 hours', 'mesabi2c4g12h', 'batchspawner.TorqueSpawner',
        dict(req_nprocs='2', req_queue='mesabi', req_runtime='8:00:00', req_memory='4gb')),
    ('Mesabi - 12 cores, 128 GB, 4 hours', 'mesabi128gb', 'batchspawner.TorqueSpawner',
        dict(req_nprocs='12', req_queue='ram256g', req_runtime='4:00:00', req_memory='125gb')),
    ('Mesabi - 2 cores, 4 GB, 24 hours', 'mesabi2c4gb24h', 'batchspawner.TorqueSpawner',
        dict(req_nprocs='2', req_queue='mesabi', req_runtime='24:00:00', req_memory='4gb')),
    ('Interactive Cluster - 2 cores, 4 GB, 8 hours', 'lab', 'batchspawner.TorqueSpawner',
        dict(req_nprocs='2', req_host='labhost.xyz.edu', req_queue='lab',
            req_runtime='8:00:00', req_memory='4gb', state_exechost_exp='')),
    ]
```
The project's changelog notes, among other changes:

* `SlurmSpawner` no longer passes `--uid`, for (at least) Slurm 17.11 compatibility. If you use `sudo`, this should not be necessary, but because this is security related you should check that user management is as you expect. If your configuration does not use `sudo` then you may need to add the `--uid` option in a custom `batch_script`.
* The `req_ngpus`, `req_partition`, `req_account` and `req_options` options (an illustrative configuration snippet follows this list).
* `user_options` is merged with the template substitution vars instead of having it as a separate key.
* `LICENSE` (BSD3) and `CONTRIBUTING.md`.
* `LsfSpawner` for IBM LSF.
* `MultiSlurmSpawner`, `MoabSpawner`, `condorSpawner` and `GridEngineSpawner`.
* The `req_qos` option.
* The wrapspawner package (see above), and changes to `TorqueSpawner` and `SlurmSpawner`.
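As a hedged illustration, the Slurm-related options named above might be set like this in `jupyterhub_config.py`; the values and the exact batch-script flags each option maps to are assumptions here, not taken from the project's documentation.

```python
# Hedged sketch: illustrative values for the SlurmSpawner options named above.
# How each req_* value is rendered into the batch script (e.g. as --partition,
# --account, --qos) is an assumption; check the shipped SlurmSpawner template
# for the authoritative mapping.
c.JupyterHub.spawner_class = 'batchspawner.SlurmSpawner'
c.SlurmSpawner.req_partition = 'interactive'         # assumed: SBATCH --partition
c.SlurmSpawner.req_account = 'myproject'             # assumed: SBATCH --account
c.SlurmSpawner.req_qos = 'normal'                    # assumed: SBATCH --qos
c.SlurmSpawner.req_ngpus = '1'                       # assumed: GPU count request
c.SlurmSpawner.req_options = '--constraint=haswell'  # assumed: extra scheduler options
```

The exact set of supported options varies by spawner class and batchspawner version.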