voxmedia/carbon-relay-ng

Name: carbon-relay-ng

Owner: Vox Media

Description: Fast carbon relay+aggregator with admin interfaces for making changes online - production ready

Created: 2015-09-01 17:44:19.0

Updated: 2015-09-01 17:44:19.0

Pushed: 2015-09-01 12:12:41.0

Homepage:

Size: 2076

Language: Go

README

carbon-relay-ng

A relay for carbon streams, written in Go. Like carbon-relay from the graphite project, except it:

This makes it easy to fan out to other tools that feed on the metrics, to balance or split load, to provide redundancy, to partition the data, and so on. This pattern allows alerting and event processing systems to act on the data as it is received, which is much better than repeatedly reading it back from your storage.

screenshot

Future work aka what's missing

Releases & versions

see https://github.com/graphite-ng/carbon-relay-ng/releases

Instrumentation

grafana dashboard

Building

Requires Go 1.4 or higher. We use https://github.com/mjibson/party to manage vendoring of third-party libraries.

export GOPATH=/some/path/
export PATH="$PATH:$GOPATH/bin"
go get -d github.com/graphite-ng/carbon-relay-ng
go get github.com/jteeuwen/go-bindata/...
cd "$GOPATH/src/github.com/graphite-ng/carbon-relay-ng"
# optional: check out an older version: git checkout v0.5
make

Installation

You only need the compiled binary and a config file. Put them wherever you want.

Usage

carbon-relay-ng [-cpuprofile cpuprofile-file] config-file

Concepts

You have one master routing table. This table contains 0-N routes. Each route can contain 0-M destinations (TCP endpoints).

First: “matching”: you can match metrics on one or more of: prefix, substring, or regex. All 3 default to “” (empty string, i.e. allow all). The conditions are AND-ed. Regexes are more resource intensive than the other two, so they should be avoided where possible, which is often the case.
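The AND-ed matching rule can be illustrated with a short Go sketch. This is not carbon-relay-ng's own code: the `Matcher` type and `NewMatcher` function are hypothetical names chosen for the example. It assumes only what is stated above, namely that an empty condition allows everything and that all conditions must hold.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Matcher is a hypothetical type for illustration only.
// Empty Prefix/Sub and a nil Regex mean "allow all".
type Matcher struct {
	Prefix string
	Sub    string
	Regex  *regexp.Regexp
}

// NewMatcher compiles the regex once, up front.
// An empty pattern leaves Regex nil, so no regex work is done per metric.
func NewMatcher(prefix, sub, pattern string) (*Matcher, error) {
	m := &Matcher{Prefix: prefix, Sub: sub}
	if pattern != "" {
		re, err := regexp.Compile(pattern)
		if err != nil {
			return nil, err
		}
		m.Regex = re
	}
	return m, nil
}

// Match reports whether every configured condition holds (AND semantics).
// The cheap string checks run first; the regex runs last.
func (m *Matcher) Match(metric string) bool {
	if !strings.HasPrefix(metric, m.Prefix) {
		return false
	}
	if !strings.Contains(metric, m.Sub) {
		return false
	}
	if m.Regex != nil && !m.Regex.MatchString(metric) {
		return false
	}
	return true
}

func main() {
	m, err := NewMatcher("servers.", "cpu", "")
	if err != nil {
		panic(err)
	}
	fmt.Println(m.Match("servers.web1.cpu.user")) // prefix and substring both hold
	fmt.Println(m.Match("servers.web1.mem.free")) // substring "cpu" is missing
}
```

Ordering the checks from cheapest to most expensive means a metric that fails the prefix or substring test never reaches the regex.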

carbon-relay-ng (for now) focuses on staying up and not consuming many resources.

If the connection is up but slow, we drop the data.
If the connection is down and spooling is enabled, we try to spool, but if spooling is slow we drop the data.
If the connection is down and spooling is disabled, we drop the data.
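One common way to get this "never block, drop instead" behaviour in Go is a non-blocking channel send. The sketch below is an illustration under assumptions, not the project's implementation: the `Dest` type, its fields, and `trySend` are hypothetical, and buffered channels stand in for the connection writer and the disk spooler.

```go
package main

import "fmt"

// Dest is a hypothetical destination used only for this illustration.
type Dest struct {
	online  bool        // whether the connection is currently up
	spoolOn bool        // whether spooling is enabled for this destination
	out     chan string // buffered; would feed the connection writer
	spool   chan string // buffered; would feed the disk spooler
	dropped int         // number of lines dropped so far
}

// trySend attempts a non-blocking send and reports whether it succeeded.
// A full buffer means the consumer is slow, so the caller drops the line.
func trySend(ch chan string, line string) bool {
	select {
	case ch <- line:
		return true
	default:
		return false
	}
}

// Handle applies the three cases described above without ever blocking.
func (d *Dest) Handle(line string) {
	switch {
	case d.online:
		if !trySend(d.out, line) {
			d.dropped++ // connection up but slow
		}
	case d.spoolOn:
		if !trySend(d.spool, line) {
			d.dropped++ // connection down, spool too slow
		}
	default:
		d.dropped++ // connection down, spooling disabled
	}
}

func main() {
	d := &Dest{online: true, out: make(chan string, 1), spool: make(chan string, 1)}
	d.Handle("servers.web1.cpu 1 1441129459") // fits in the buffer
	d.Handle("servers.web1.cpu 2 1441129460") // buffer full, nobody reading: dropped
	fmt.Println("dropped:", d.dropped)
}
```

The trade-off is deliberate: the relay stays responsive and memory use stays bounded by the buffer sizes, at the cost of losing data when a destination cannot keep up.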

Validation

All incoming metrics undergo some basic sanity checks before the metrics go into the routing table. We check that the metric:

If we detect that the metric is in metrics2.0 format, we also check that it is properly formatted and that unit and target_type are set.

Invalid metrics are dropped and can be seen at /badMetrics/timespec.json where timespec is something like 30s, 10m, 24h, etc. (the counters are also exported. See instrumentation section)

Aggregation

As discussed in concepts above, we can combine, at each point in time, the points of multiple series into a new series. Note:

Configuration

Look at the included carbon-relay-ng.ini; it should be self-describing. In the init option you can create routes, populate the blacklist, etc. using the same commands as the telnet interface, detailed below. This mechanism was chosen so we can reuse the code instead of writing a lot of configuration boilerplate that would have to execute on a declarative specification. Since the init option only sets up the initial state, the same imperative commands work here.

TCP interface

commands:

help                                         show this menu
view                                         view full current routing table

addBlack <prefix|sub|regex> <substring>      blacklist (drops matching metrics as soon as they are received)

addAgg <func> <regex> <fmt> <interval> <wait>  add a new aggregation rule.
         <func>:                             aggregation function to use
           sum
           avg
         <regex>                             regex to match incoming metrics. supports groups (numbered, see fmt)
         <fmt>                               format of output metric. you can use $1, $2, etc to refer to numbered groups
         <interval>                          align odd timestamps of metrics into buckets by this interval in seconds.
         <wait>                              amount of seconds to wait for "late" metric messages before computing and flushing final result.


addRoute <type> <key> [opts]   <dest>  [<dest>[...]] add a new route. note 2 spaces to separate destinations
         <type>:
           sendAllMatch                      send metrics in the route to all destinations
           sendFirstMatch                    send metrics in the route to the first one that matches it
           consistentHashing                 distribute metrics between destinations using a hash algorithm
         <opts>:
           prefix=<str>                      only take in metrics that have this prefix
           sub=<str>                         only take in metrics that match this substring
           regex=<regex>                     only take in metrics that match this regex (expensive!)
         <dest>: <addr> <opts>
           <addr>                            a tcp endpoint. i.e. ip:port or hostname:port
                                             for consistentHashing routes, an instance identifier can also be present:
                                             hostname:port:instance
                                             The instance is used to disambiguate multiple endpoints on the same host, as the Carbon-compatible consistent hashing algorithm does not take the port into account.
           <opts>:
               prefix=<str>                  only take in metrics that have this prefix
               sub=<str>                     only take in metrics that match this substring
               regex=<regex>                 only take in metrics that match this regex (expensive!)
               flush=<int>                   flush interval in ms
               reconn=<int>                  reconnection interval in ms
               pickle={true,false}           pickle output format instead of the default text protocol
               spool={true,false}            enable spooling for this endpoint

addDest <routeKey> <dest>                    not implemented yet

modDest <routeKey> <dest> <opts>:            modify dest by updating one or more space separated option strings
               addr=<addr>                   new tcp address
               prefix=<str>                  new matcher prefix
               sub=<str>                     new matcher substring
               regex=<regex>                 new matcher regex

modRoute <routeKey> <opts>:                  modify route by updating one or more space separated option strings
               prefix=<str>                  new matcher prefix
               sub=<str>                     new matcher substring
               regex=<regex>                 new matcher regex
