Name: sre-ntpq-snap-plugin-py
Owner: Librato
Description: An ntpq plugin for the snap framework which implements the checks we need for our infrastructure
Created: 2017-11-24 12:41:13.0
Updated: 2018-03-22 20:55:41.0
Pushed: 2018-03-22 20:59:57.0
Homepage: null
Size: 30
Language: Python
An ntpq plugin for the snap framework which implements the checks we need for our infrastructure
Install bazel (https://bazel.build/)
bazel build snap-plugin-ntpq-query.par --build_python_zip
When done, you will have bazel-bin/snap-plugin-ntpq-query.zip, which you can run with python bazel-bin/snap-plugin-ntpq-query.zip.
Once committed to master, tag the release and build:
pacey@masonjar:~/dvcs/librato/sre-ntpq-snap-plugin-py$ ./release.sh stage
ing and bringing stage forward
Already on 'master'
Your branch is up-to-date with 'origin/master'.
From github.com:librato/sre-ntpq-snap-plugin-py
 * branch            master     -> FETCH_HEAD
Already up-to-date.
Counting objects: 1, done.
Writing objects: 100% (1/1), 167 bytes | 167.00 KiB/s, done.
Total 1 (delta 0), reused 0 (delta 0)
To github.com:librato/sre-ntpq-snap-plugin-py
 * [new tag]         v4 -> v4
Total 0 (delta 0), reused 0 (delta 0)
To github.com:librato/sre-ntpq-snap-plugin-py
c5ff95..7e0fe83 v4^{commit} -> stage
Upload the zipfile and the wrapper script to S3:
pacey@masonjar:~/dvcs/librato/sre-ntpq-snap-plugin-py$ aws s3 cp bazel-bin/snap-plugin-ntpq-query.zip s3://librato-apt-stg/par/v9/snap-plugin-ntpq-query.zip
upload: bazel-bin/snap-plugin-ntpq-query.zip to s3://librato-apt-stg/par/v9/snap-plugin-ntpq-query.zip
pacey@masonjar:~/dvcs/librato/sre-ntpq-snap-plugin-py$ aws s3 cp bin/snap-plugin-ntpq-query s3://librato-apt-stg/par/v9/snap-plugin-ntpq-query
upload: bin/snap-plugin-ntpq-query to s3://librato-apt-stg/par/v9/snap-plugin-ntpq-query
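The two uploads differ only in the local path and file name, so the destination URLs can be derived from the release directory. A minimal Python sketch, assuming the bucket and prefix shown in the commands above; the helper name upload_commands is hypothetical and not part of this repo, and it only builds the commands rather than running them:

```python
# Hypothetical helper (not part of this repo): build the two `aws s3 cp`
# commands for a given release directory such as "v9".
BUCKET_PREFIX = "s3://librato-apt-stg/par"  # taken from the commands above

# Local path -> file name used under the release directory in S3
ARTIFACTS = {
    "bazel-bin/snap-plugin-ntpq-query.zip": "snap-plugin-ntpq-query.zip",
    "bin/snap-plugin-ntpq-query": "snap-plugin-ntpq-query",
}


def upload_commands(version):
    """Return one `aws s3 cp` argument list per artifact."""
    return [
        ["aws", "s3", "cp", local, f"{BUCKET_PREFIX}/{version}/{name}"]
        for local, name in ARTIFACTS.items()
    ]


if __name__ == "__main__":
    for cmd in upload_commands("v9"):
        print(" ".join(cmd))
```

Each list could be handed to subprocess.run(cmd, check=True) to perform the upload; printing first makes it easy to confirm the version directory before anything is copied.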
Then edit the salt pillar pillar/role_salt_minion/init.sls to update the aoagent keys. I'm using this at the moment:
aoagent:
  download_plugins:
    snap-plugin-ntpq-query: "aws s3 cp s3://librato-apt-stg/par/v9/snap-plugin-ntpq-query.zip /opt/appoptics/bin"
    snap-plugin-ntpq-query-wrapper: "aws s3 cp s3://librato-apt-stg/par/v9/snap-plugin-ntpq-query /opt/appoptics/bin"
In appoptics, this is the pillar configuration that we're using:
download_plugins:
  snap-plugin-ntpq-query:
    command: "aws s3 cp s3://librato-apt-stg/par/v8/snap-plugin-ntpq-query.zip /opt/appoptics/bin"
    unless: test -f /opt/appoptics/bin/snap-plugin-ntpq-query.zip
  snap-plugin-ntpq-query-wrapper:
    command: "aws s3 cp s3://librato-apt-stg/par/v8/snap-plugin-ntpq-query /opt/appoptics/bin; chmod +x /opt/appoptics/bin/snap-plugin-ntpq-query"
    unless: "test -f /opt/appoptics/bin/snap-plugin-ntpq-query"
enabled_plugins:
  ntpq:
    collector:
      # Get standard NTP query metrics. Requires ntpq executable.
      ntpq:
        all: {}
    load:
      plugin: snap-plugin-ntpq-query
      task: task-sre-ntpq.yaml
enabled_tasks:
  sre-ntpq:
    version: 1
    schedule:
      # Run every minute
      type: cron
      interval: "0 * * * * *"
    workflow:
      collect:
        metrics:
          /ntpq/*: {}
        publish:
          - plugin_name: publisher-appoptics
            config:
              period: 60
              floor_seconds: 60
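The schedule interval "0 * * * * *" has six fields rather than the usual five. A minimal Python sketch of how such an expression can be read, assuming the field order is second, minute, hour, day of month, month, day of week; that ordering is an assumption about the scheduler's cron dialect, chosen because it agrees with the "Run every minute" comment. The matches function is illustrative only and handles just "*" and plain integers:

```python
from datetime import datetime

# Assumed field order for the six-field expression (seconds first).
FIELDS = ("second", "minute", "hour", "day", "month", "weekday")


def matches(expr, when):
    """Return True if `when` matches a six-field cron expression.

    Only "*" and plain integers are handled; ranges, lists and steps are not.
    """
    parts = expr.split()
    if len(parts) != len(FIELDS):
        raise ValueError("expected six fields")
    actual = (
        when.second,
        when.minute,
        when.hour,
        when.day,
        when.month,
        when.isoweekday() % 7,  # 0 = Sunday
    )
    return all(p == "*" or int(p) == a for p, a in zip(parts, actual))


if __name__ == "__main__":
    # Matches at second 0 of any minute, and at no other second.
    print(matches("0 * * * * *", datetime(2018, 3, 22, 20, 55, 0)))
    print(matches("0 * * * * *", datetime(2018, 3, 22, 20, 55, 41)))
```

Under that reading the task fires once per minute, at second 0.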
To alert, create a composite on the top k (say, 10) of each of the relevant metrics, and alert on the threshold of interest.
We are concerned about the number of associations falling below 2 (that is, a machine is running out of valid hosts to associate with), so use this composite:
top(s("ntpq.associations", {"@host":"*"}), {"count": "10","function": "min"})
Create the composite metric as ops.ntpq.associations_min and alert on that.
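The alert condition itself is a simple threshold on the per-host association count. A minimal Python sketch of that check, not a reproduction of how the metrics service evaluates the composite; the function name, host names and sample values are made up for illustration:

```python
ASSOCIATION_THRESHOLD = 2  # alert when a host has fewer associations than this


def hosts_below_threshold(latest, threshold=ASSOCIATION_THRESHOLD, count=10):
    """Return up to `count` (host, value) pairs that are below `threshold`.

    `latest` maps a host name to its most recent association count.
    Results are ordered lowest value first.
    """
    lowest = sorted(latest.items(), key=lambda kv: kv[1])[:count]
    return [(host, value) for host, value in lowest if value < threshold]


if __name__ == "__main__":
    # Made-up sample values
    sample = {"host-a": 4, "host-b": 1, "host-c": 2, "host-d": 0}
    for host, value in hosts_below_threshold(sample):
        print(f"{host}: {value} associations")
```

A host with exactly 2 associations does not trigger, which matches "falling below 2".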