librato/sre-ntpq-snap-plugin-py

Name: sre-ntpq-snap-plugin-py

Owner: Librato

Description: An ntpq plugin for the snap framework which implements the checks we need for our infrastructure

Created: 2017-11-24 12:41:13.0

Updated: 2018-03-22 20:55:41.0

Pushed: 2018-03-22 20:59:57.0

Homepage: null

Size: 30

Language: Python

README

sre-ntpq-snap-plugin-py

An ntpq plugin for the snap framework which implements the checks we need for our infrastructure

Building

Install Bazel (https://bazel.build/), then run:

bazel build snap-plugin-ntpq-query.par --build_python_zip

When the build finishes, you will have bazel-bin/snap-plugin-ntpq-query.zip, which you can run with python bazel-bin/snap-plugin-ntpq-query.zip.
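
Python can run a zip archive directly when the archive has a __main__.py at its top level, which is presumably what the zip built above contains. The sketch below illustrates that mechanism with a throwaway archive; the file name demo.zip and the helper run_zip_demo are made up for illustration and are not part of this repository.

```python
import subprocess
import sys
import tempfile
import zipfile
from pathlib import Path


def run_zip_demo():
    """Build a tiny zip with a top-level __main__.py and run it with Python."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "demo.zip"  # hypothetical archive name
        with zipfile.ZipFile(archive, "w") as zf:
            # The interpreter looks for __main__.py at the root of the archive.
            zf.writestr("__main__.py", "print('hello from the archive')\n")
        result = subprocess.run(
            [sys.executable, str(archive)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_zip_demo())
```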

Deploying

Once committed to master, tag the release and build:

pacey@masonjar:~/dvcs/librato/sre-ntpq-snap-plugin-py$ ./release.sh stage
Tagging and bringing stage forward
Already on 'master'
Your branch is up-to-date with 'origin/master'.
From github.com:librato/sre-ntpq-snap-plugin-py
 * branch            master     -> FETCH_HEAD
Already up-to-date.
Counting objects: 1, done.
Writing objects: 100% (1/1), 167 bytes | 167.00 KiB/s, done.
Total 1 (delta 0), reused 0 (delta 0)
To github.com:librato/sre-ntpq-snap-plugin-py
 * [new tag]         v4 -> v4
Total 0 (delta 0), reused 0 (delta 0)
To github.com:librato/sre-ntpq-snap-plugin-py
c5ff95..7e0fe83  v4^{commit} -> stage

Upload the zipfile and the wrapper script to s3:

pacey@masonjar:~/dvcs/librato/sre-ntpq-snap-plugin-py$ aws s3 cp bazel-bin/snap-plugin-ntpq-query.zip s3://librato-apt-stg/par/v9/snap-plugin-ntpq-query.zip
upload: bazel-bin/snap-plugin-ntpq-query.zip to s3://librato-apt-stg/par/v9/snap-plugin-ntpq-query.zip
pacey@masonjar:~/dvcs/librato/sre-ntpq-snap-plugin-py$ aws s3 cp bin/snap-plugin-ntpq-query s3://librato-apt-stg/par/v9/snap-plugin-ntpq-query
upload: bin/snap-plugin-ntpq-query to s3://librato-apt-stg/par/v9/snap-plugin-ntpq-query
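
Both uploads have to land under the same versioned prefix, so it can help to generate the two commands from one version string. This is only a sketch: the bucket and paths are copied from the commands above, while the helper upload_commands and the ARTIFACTS table are hypothetical and not part of this repository.

```python
# Bucket and key layout copied from the upload commands above.
BUCKET = "librato-apt-stg"

# Local build output -> object name in the release directory.
ARTIFACTS = {
    "bazel-bin/snap-plugin-ntpq-query.zip": "snap-plugin-ntpq-query.zip",
    "bin/snap-plugin-ntpq-query": "snap-plugin-ntpq-query",
}


def upload_commands(version):
    """Return one `aws s3 cp` argument list per artifact for the given version."""
    return [
        ["aws", "s3", "cp", local, f"s3://{BUCKET}/par/{version}/{remote}"]
        for local, remote in ARTIFACTS.items()
    ]


if __name__ == "__main__":
    # Print the commands instead of running them.
    for cmd in upload_commands("v9"):
        print(" ".join(cmd))
```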

Then update the aoagent keys in the Salt pillar pillar/role_salt_minion/init.sls. I'm using this at the moment:

aoagent:
  download_plugins:
    snap-plugin-ntpq-query: "aws s3 cp s3://librato-apt-stg/par/v9/snap-plugin-ntpq-query.zip /opt/appoptics/bin"
    snap-plugin-ntpq-query-wrapper: "aws s3 cp s3://librato-apt-stg/par/v9/snap-plugin-ntpq-query /opt/appoptics/bin"

A complete pillar example

In AppOptics, this is the pillar configuration that we're using:

  download_plugins:
    snap-plugin-ntpq-query:
      command: "aws s3 cp s3://librato-apt-stg/par/v8/snap-plugin-ntpq-query.zip /opt/appoptics/bin"
      unless: test -f /opt/appoptics/bin/snap-plugin-ntpq-query.zip
    snap-plugin-ntpq-query-wrapper:
      command: "aws s3 cp s3://librato-apt-stg/par/v8/snap-plugin-ntpq-query /opt/appoptics/bin; chmod +x /opt/appoptics/bin/snap-plugin-ntpq-query"
      unless: "test -f /opt/appoptics/bin/snap-plugin-ntpq-query"
  enabled_plugins:
    ntpq:
      collector:
        # Get standard NTP query metrics. Requires ntpq executable.
        ntpq:
          all: {}
      load:
        plugin: snap-plugin-ntpq-query
        task: task-sre-ntpq.yaml
  enabled_tasks:
    sre-ntpq:
      version: 1
      schedule:
        # Run every minute
        type: cron
        interval: "0 * * * * *"
      workflow:
        collect:
          metrics:
            /ntpq/*: {}
          publish:
            - plugin_name: publisher-appoptics
              config:
                period: 60
                floor_seconds: 60

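
The interval in the task above has six fields rather than the usual five. Assuming the scheduler reads the first field as seconds (followed by minute, hour, day of month, month, and day of week), "0 * * * * *" fires at second 0 of every minute, which lines up with the 60-second publish period. The sketch below checks a timestamp against such an expression; the matches helper is hypothetical and only understands "*" and plain integers.

```python
from datetime import datetime

# Assumed field order: second, minute, hour, day of month, month, day of week.
FIELDS = ("second", "minute", "hour", "day", "month")


def matches(expr, when):
    """Return True if `when` satisfies a six-field cron expression.

    Only '*' and plain integers are handled; day of week uses Sunday = 0.
    """
    parts = expr.split()
    if len(parts) != 6:
        raise ValueError("expected six fields")
    values = [getattr(when, name) for name in FIELDS]
    values.append(when.isoweekday() % 7)  # map Sunday (7) to 0
    return all(p == "*" or int(p) == v for p, v in zip(parts, values))


if __name__ == "__main__":
    print(matches("0 * * * * *", datetime(2018, 3, 22, 20, 55, 0)))   # True
    print(matches("0 * * * * *", datetime(2018, 3, 22, 20, 55, 30)))  # False
```
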
Alerting

To alert, create a composite over the top k (say, 10) series of each relevant metric, and alert on the threshold of interest.

Associations

We are concerned about the number of associations falling below 2 (that is, a host is running out of valid peers to associate with).

So use this composite:

top(s("ntpq.associations", {"@host":"*"}), {"count": "10","function": "min"})

Then save the composite metric as ops.ntpq.associations_min and alert on that.
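
The alert condition itself is simple enough to sketch offline. The snippet below is not how the composite is evaluated; it only illustrates the threshold described above, and the host names and counts are made up.

```python
MIN_ASSOCIATIONS = 2  # threshold from the text above


def hosts_below_threshold(latest, threshold=MIN_ASSOCIATIONS):
    """Return the hosts whose latest association count is below the threshold."""
    return sorted(host for host, count in latest.items() if count < threshold)


if __name__ == "__main__":
    # Made-up sample: latest ntpq.associations value per host.
    sample = {"ntp-a": 4, "ntp-b": 1, "ntp-c": 2}
    print(hosts_below_threshold(sample))  # ['ntp-b']
```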

