Name: sipid
Owner: Cloud Foundry
Description: null
Created: 2017-02-28 22:58:26.0
Updated: 2018-03-10 05:22:00.0
Pushed: 2017-07-07 07:00:35.0
Homepage: null
Size: 39
Language: Go
sipid intends to give BOSH release authors an easier way to manage pidfiles. Pidfiles are used by Monit (and therefore
BOSH) to track which process should be monitored. It is the responsibility of a BOSH job to write its process ID (PID)
to the pidfile during start, and to reference the same pidfile to find the process to kill during stop.

Correct pidfile management has a couple of potential pitfalls, since your scripts may be called multiple times, which
can result in race conditions. To make this simpler, sipid provides simple claim and kill commands that manage the
trickiest parts of pidfiles.
sipid claim --pid PID --pid-file PID_FILE will write the given process's PID to the PID_FILE. Its algorithm looks
roughly like this:

Example start script using sipid claim:

    #!/usr/bin/env bash
    RUN_DIR="/var/vcap/sys/run/example-job"
    PIDFILE="$RUN_DIR/web.pid"
    mkdir -p "$RUN_DIR"
    sipid claim --pid "$$" --pid-file "$PIDFILE"
    exec chpst -u vcap:vcap /var/vcap/packages/example-job/bin/web

Example start script using start-stop-daemon:

    #!/usr/bin/env bash
    RUN_DIR="/var/vcap/sys/run/example-job"
    PIDFILE="$RUN_DIR/web.pid"
    mkdir -p "$RUN_DIR"
    start-stop-daemon \
      --pidfile "$PIDFILE" \
      --make-pidfile \
      --chuid vcap:vcap \
      --start \
      --exec /var/vcap/packages/example-job/bin/web \
      -- \
        --extra arguments \
        --to-your process
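
To make the idea behind claim more concrete, here is a minimal bash sketch of what claiming a pidfile could look like. It is an illustration only, not sipid's actual implementation: the function name claim_pidfile is hypothetical, and the behavior (refuse when the recorded PID is still running, overwrite a stale pidfile otherwise) is an assumption inferred from the description above. Unlike a real tool, this sketch makes no attempt to be race-free.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a "claim" step; NOT sipid's actual implementation.

# claim_pidfile PID PID_FILE
# Returns 1 if PID_FILE names a process that is still running (assumed behavior);
# otherwise writes PID into PID_FILE.
claim_pidfile() {
  local pid="$1" pidfile="$2" existing

  if [ -f "$pidfile" ]; then
    existing="$(cat "$pidfile")"
    # kill -0 delivers no signal; it only checks that the process exists
    # and that we are allowed to signal it.
    if [ -n "$existing" ] && kill -0 "$existing" 2>/dev/null; then
      echo "pidfile $pidfile is already claimed by running process $existing" >&2
      return 1
    fi
  fi

  # Missing or stale pidfile: record the new PID.
  mkdir -p "$(dirname "$pidfile")"
  echo "$pid" > "$pidfile"
}
```

A start script could call claim_pidfile "$$" "$PIDFILE" and then exec the real process, so that the process keeps the PID that was just recorded.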
sipid kill --pid-file PID_FILE [--show-stacks] will kill the process given by the PID_FILE. Monit only allows a short
time to stop a process, so we must kill the process aggressively if it does not clean itself up within a 20-second
grace period. The algorithm looks roughly like this:

1. Send SIGTERM (i.e. a normal kill "$PID") to the process to give it time to clean up.
2. If the process is still running after the grace period, send SIGKILL to the process to force it to exit immediately.
3. Remove the PID_FILE to prevent claim from failing if the PID is reused by a different process later.

If the --show-stacks parameter is provided to sipid, before sending SIGKILL, it will attempt to get the process to
dump its stack traces by sending SIGQUIT (i.e. kill -3 "$PID") to aid with debugging a “stuck” process. Not all
processes respond to SIGQUIT, and if yours does not, you may wish to implement a SIGQUIT handler to make debugging
more consistent for operators.

Example stop script using sipid kill:

    #!/usr/bin/env bash
    # If a command fails, exit immediately
    set -e
    PIDFILE="/var/vcap/sys/run/example-job/web.pid"
    sipid kill --pid-file "$PIDFILE" --show-stacks

Example stop script using start-stop-daemon:

    #!/usr/bin/env bash
    # If a command fails, exit immediately
    set -e
    PIDFILE="/var/vcap/sys/run/example-job/web.pid"
    start-stop-daemon \
      --pidfile "$PIDFILE" \
      --remove-pidfile \
      --retry TERM/20/QUIT/1/KILL \
      --oknodo \
      --stop
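
The stop sequence described for sipid kill can be sketched in plain bash roughly as follows. This is a simplified illustration under stated assumptions, not sipid's actual implementation: the function name stop_with_grace and its positional arguments are hypothetical, a missing pidfile is assumed to be a no-op, and the one-second pause after SIGQUIT is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the stop sequence; NOT sipid's actual implementation.

# stop_with_grace PID_FILE [GRACE_SECONDS] [SHOW_STACKS]
stop_with_grace() {
  local pidfile="$1" grace="${2:-20}" show_stacks="${3:-false}"
  local pid waited=0

  [ -f "$pidfile" ] || return 0   # assumed: nothing to stop without a pidfile
  pid="$(cat "$pidfile")"

  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
    # SIGTERM: ask the process to clean up and exit on its own.
    kill "$pid" 2>/dev/null || true

    # Poll once per second until the process exits or the grace period ends.
    while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$grace" ]; do
      sleep 1
      waited=$((waited + 1))
    done

    if kill -0 "$pid" 2>/dev/null; then
      if [ "$show_stacks" = "true" ]; then
        # SIGQUIT: ask the process to dump its stack traces first.
        kill -QUIT "$pid" 2>/dev/null || true
        sleep 1   # assumed pause so the dump can be written
      fi
      # SIGKILL: force the process to exit immediately.
      kill -KILL "$pid" 2>/dev/null || true
    fi
  fi

  # Remove the pidfile so a later claim does not trip over a reused PID.
  rm -f "$pidfile"
}
```

For example, stop_with_grace "$PIDFILE" 20 true would follow the same TERM, wait, QUIT, KILL order as the scripts above.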
sipid wait-until-healthy --url HEALTHCHECK_URL [--timeout DURATION (default 1m)] [--polling-frequency DURATION (default 5s)]
will poll a healthcheck endpoint at the requested frequency until it returns an HTTP 200 status code or the requested
timeout elapses. If the endpoint has not reported healthy by the timeout deadline, the command exits with a non-zero status.

Example script using sipid wait-until-healthy:

    #!/usr/bin/env bash
    # If a command fails, exit immediately
    set -e
    sipid wait-until-healthy --url https://127.0.0.1:58074/healthcheck --timeout 2m --polling-frequency 1s
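
The polling behavior described above can be approximated with a short bash loop. The sketch below is only an illustration and not sipid's implementation: the function names http_status and wait_until_healthy are hypothetical, it assumes curl is installed, it takes the timeout and polling interval as whole seconds rather than durations such as 1m or 5s, and the 5-second per-request limit is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a healthcheck polling loop; NOT sipid's actual implementation.

# http_status URL
# Prints the HTTP status code of a GET request (curl prints 000 if the request fails).
http_status() {
  curl --silent --output /dev/null --max-time 5 --write-out '%{http_code}' "$1" || true
}

# wait_until_healthy URL [TIMEOUT_SECONDS] [POLL_SECONDS]
# Returns 0 once the endpoint answers 200, or 1 if the deadline would pass first.
wait_until_healthy() {
  local url="$1" timeout="${2:-60}" poll="${3:-5}"
  local deadline now
  deadline=$(( $(date +%s) + timeout ))

  while true; do
    if [ "$(http_status "$url")" = "200" ]; then
      return 0
    fi
    now="$(date +%s)"
    # Give up if the next poll would happen after the deadline.
    if [ $((now + poll)) -gt "$deadline" ]; then
      return 1
    fi
    sleep "$poll"
  done
}
```

A script running under set -e could call wait_until_healthy "$URL" 120 1 and would stop with a non-zero status if the endpoint never becomes healthy.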

To see examples of sipid in action, look at the scripts in the example/ directory.