metabrainz/mbstats

Name: mbstats

Owner: MetaBrainz Foundation

Description: Work in progress

Created: 2016-11-16 18:47:50.0

Updated: 2016-11-22 10:47:22.0

Pushed: 2016-11-29 12:15:03.0

Homepage: null

Size: 108

Language: Python

README

usage: stats.parser.py [-h] [-f FILE] [-c FILE] [-d DATACENTER] [-H HOSTNAME]
                       [-l LOG_DIR] [-n NAME] [-m MAX_LINES] [-w WORKDIR] [-y]
                       [-q] [--influx-host INFLUX_HOST]
                       [--influx-port INFLUX_PORT]
                       [--influx-username INFLUX_USERNAME]
                       [--influx-password INFLUX_PASSWORD]
                       [--influx-database INFLUX_DATABASE]
                       [--influx-timeout INFLUX_TIMEOUT]
                       [--influx-batch-size INFLUX_BATCH_SIZE] [-D]
                       [--influx-drop-database] [--locker {fcntl,portalocker}]
                       [--lookback-factor LOOKBACK_FACTOR] [--startover]
                       [--do-not-skip-to-end]
                       [--bucket-duration BUCKET_DURATION]
                       [--log-conf LOG_CONF] [--dump-config] [--syslog]
                       [--send-failure-fifo-size SEND_FAILURE_FIFO_SIZE]
                       [--simulate-send-failure]

Tail and parse a formatted nginx log file, sending results to InfluxDB.

optional arguments:
  -h, --help            show this help message and exit

required arguments:
  -f FILE, --file FILE  log file to process

common arguments:
  -c FILE, --config FILE
                        Specify json config file(s)
  -d DATACENTER, --datacenter DATACENTER
                        string to use as 'dc' tag
  -H HOSTNAME, --hostname HOSTNAME
                        string to use as 'host' tag
  -l LOG_DIR, --log-dir LOG_DIR
                        Where to store the stats.parser logfile. Default
                        location is workdir
  -n NAME, --name NAME  string to use as 'name' tag
  -m MAX_LINES, --max-lines MAX_LINES
                        maximum number of lines to process
  -w WORKDIR, --workdir WORKDIR
                        directory where offset/status are stored
  -y, --dry-run         Parse the log file but send stats to standard output
  -q, --quiet           Reduce verbosity / quiet mode

influxdb arguments:
  --influx-host INFLUX_HOST
                        influxdb host
  --influx-port INFLUX_PORT
                        influxdb port
  --influx-username INFLUX_USERNAME
                        influxdb username
  --influx-password INFLUX_PASSWORD
                        influxdb password
  --influx-database INFLUX_DATABASE
                        influxdb database
  --influx-timeout INFLUX_TIMEOUT
                        influxdb timeout
  --influx-batch-size INFLUX_BATCH_SIZE
                        number of points to send per batch

expert arguments:
  -D, --debug           Enable debug mode
  --influx-drop-database
                        drop existing InfluxDB database, use with care
  --locker {fcntl,portalocker}
                        type of lock to use
  --lookback-factor LOOKBACK_FACTOR
                        number of buckets to wait before sending any data
  --startover           ignore all status/offset, like a first run
  --do-not-skip-to-end  do not skip to end on first run
  --bucket-duration BUCKET_DURATION
                        duration for each bucket in seconds
  --log-conf LOG_CONF   Logging configuration file. None by default
  --dump-config         dump config as json to stdout
  --syslog              Log to syslog
  --send-failure-fifo-size SEND_FAILURE_FIFO_SIZE
                        Number of failed sends to backup
  --simulate-send-failure
                        Simulate send failure for testing purposes
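
As a rough illustration of the options listed in the help output, the sketch below assembles a dry-run command line in Python. Only flags that appear in the help text are used; the helper name build_command, the tag value and the assumption that stats.parser.py sits in the current directory are hypothetical.

```python
import sys


def build_command(log_file, name=None, max_lines=None, dry_run=True,
                  script="stats.parser.py"):
    """Build an argv list for stats.parser.py from flags shown in its help text.

    The script location is an assumption; adjust it to your checkout.
    """
    cmd = [sys.executable, script, "--file", log_file]
    if name is not None:
        cmd += ["--name", name]            # value used as the 'name' tag
    if max_lines is not None:
        cmd += ["--max-lines", str(max_lines)]
    if dry_run:
        # --dry-run parses the log but sends stats to standard output.
        cmd.append("--dry-run")
    return cmd


if __name__ == "__main__":
    # Hypothetical log path, matching the access_log example in this README.
    cmd = build_command("/var/log/nginx/my.stats.log",
                        name="example", max_lines=1000)
    print(" ".join(cmd))
```

To actually run the parser, the resulting list could be passed to subprocess.run(cmd, check=True). Starting with --dry-run and a small --max-lines value is a cautious way to check the log format before anything is sent to InfluxDB.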

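The --bucket-duration and --lookback-factor options suggest that requests are grouped into fixed-size time buckets and that a bucket is held back for a while before it is sent. The sketch below shows one plausible reading of that behaviour; it is not taken from the mbstats source, and the function names bucket_start and ready_buckets are hypothetical.

```python
from collections import Counter


def bucket_start(msec, bucket_duration):
    """Return the start (in seconds) of the bucket containing timestamp msec."""
    return int(msec // bucket_duration) * bucket_duration


def ready_buckets(counts, newest_bucket, bucket_duration, lookback_factor):
    """Return bucket starts that are at least lookback_factor buckets older
    than the newest bucket seen (an assumed interpretation of the option)."""
    cutoff = newest_bucket - lookback_factor * bucket_duration
    return sorted(b for b in counts if b <= cutoff)


if __name__ == "__main__":
    bucket_duration = 60   # seconds, as in --bucket-duration
    lookback_factor = 2    # buckets, as in --lookback-factor
    # Made-up $msec values, in seconds since the epoch.
    timestamps = [5.1, 30.5, 65.0, 130.9, 190.0, 191.2]
    counts = Counter(bucket_start(t, bucket_duration) for t in timestamps)
    newest = max(counts)
    for start in ready_buckets(counts, newest, bucket_duration, lookback_factor):
        print(start, counts[start])
```

Holding back the most recent buckets gives late or buffered log lines (for example, those delayed by the access_log buffer) a chance to arrive before a bucket is reported.
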
Please add the following to the http section of your nginx configuration:

log_format stats
'1|'
'$msec|'
'$host|'
'$statproto|'
'$loctag|'
'$status|'
'$bytes_sent|'
'$gzip_ratio|'
'$request_length|'
'$request_time|'
'$upstream_addr|'
'$upstream_status|'
'$upstream_response_time|'
'$upstream_connect_time|'
'$upstream_header_time';

map $host $loctag {
    default '-';
}

map $https $statproto {
    default '-';
    on 's';
}

You can use $loctag to tag a specific location:
set $loctag "ws";

In addition to your usual access log, add something like:
access_log /var/log/nginx/my.stats.log stats buffer=256k flush=10s;

Note: the first field in the stats format declaration is a format version; it should be set to 1.
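
To make the stats log format concrete, here is a minimal Python sketch that splits one pipe-delimited line into named fields and checks the version field. It is only an illustration based on the log_format declaration in this README, not the parser used by mbstats; the function name parse_stats_line and the sample line are hypothetical.

```python
# Field order follows the log_format declaration in this README.
FIELDS = [
    "version", "msec", "host", "statproto", "loctag", "status",
    "bytes_sent", "gzip_ratio", "request_length", "request_time",
    "upstream_addr", "upstream_status", "upstream_response_time",
    "upstream_connect_time", "upstream_header_time",
]


def parse_stats_line(line):
    """Parse one line written with the 'stats' log_format into a dict."""
    parts = line.rstrip("\n").split("|")
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    if parts[0] != "1":
        raise ValueError("unsupported format version: %r" % parts[0])
    record = dict(zip(FIELDS, parts))
    record["msec"] = float(record["msec"])        # seconds, millisecond resolution
    record["status"] = int(record["status"])
    record["bytes_sent"] = int(record["bytes_sent"])
    record["request_length"] = int(record["request_length"])
    record["request_time"] = float(record["request_time"])
    # nginx logs '-' for an unset variable, e.g. when the response is not gzipped.
    ratio = record["gzip_ratio"]
    record["gzip_ratio"] = None if ratio == "-" else float(ratio)
    # Upstream fields stay as strings: nginx may log '-' or several
    # comma-separated values when more than one upstream was contacted.
    return record


if __name__ == "__main__":
    # Made-up sample line for illustration only.
    sample = ("1|1480000000.123|example.org|s|ws|200|1234|2.50|456|0.012|"
              "10.0.0.1:8080|200|0.010|0.001|0.009")
    for key, value in parse_stats_line(sample).items():
        print(key, value)
```

Rejecting lines whose first field is not 1 keeps a consumer from silently misreading logs written with a different field layout.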

This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.