Name: conveyer
Owner: racker
Description: Connector between logstash and Rackspace's Cloud Monitoring Agent plugin.
Forked from: sam-falvo/conveyer
Created: 2015-10-18 08:22:56.0
Updated: 2015-10-18 08:22:56.0
Pushed: 2015-11-18 01:34:13.0
Homepage: null
Size: 24
Language: Python
At least as of this commit, Logstash does not seem to rotate (high-load) logs written to a file on-demand. This precludes using external tools, like Rackspace Cloud Monitoring Agent plugins, to monitor metrics emitted by Logstash in a finite amount of disk space. Using Conveyer, we take Logstash-generated events and write them to a file. Unlike Logstash's file output configuration, Conveyer provides the ability to rotate the log file it writes to on-demand. Through this guaranteed atomic mechanism, logs may be consumed by tooling while providing a fresh log file for Logstash's use. Meanwhile, Logstash is completely ignorant of what's happening.
Typically, Conveyer would be installed using an orchestration tool such as Chef. However, you should know how to install it manually. The following steps are what I use to get a working deployment on a completely fresh Ubuntu virtual machine.
```sh
apt-get update
apt-get install python-pip
pip install --upgrade pip
apt-get remove python-pip
pip install --upgrade virtualenv
git clone git@github.com:sam-falvo/conveyer
cd conveyer
virtualenv .ve
. .ve/bin/activate
pip install -e .
```
The first apt-get invocation updates the package database, while the second installs Ubuntu's default pip version. I then use pip to replace itself with the latest version, and remove Ubuntu's copy, for it is now irrelevant. I then install virtualenv if it's not already installed. After cloning this repository, I create a virtualenv for Conveyer, and install Conveyer itself therein.
At this point, Conveyer is now properly installed. Understanding these steps should give you the background you need to properly configure the orchestration tool of your choice.
If your host has the Rackspace Cloud Monitoring Agent installed, you can install the Conveyer Plugin fairly trivially:
```sh
export RMA=/usr/lib/rackspace-monitoring-agent/plugins
cp plugins/conveyer-plugin.py $RMA
chown --reference=$RMA $RMA/conveyer-plugin.py
cp plugins/conveyer-plugin.json /etc
chown --reference=$RMA /etc/conveyer-plugin.json
```
You can launch the Conveyer daemon using the following command:
```sh
python conveyer/conveyer.py
```
You may optionally configure how it runs with a set of three environment variables:
|Variable|Default|Purpose|
|:-------|:-----:|:------|
|CONVEYER_PORT|10100|This determines which TCP/IP port Conveyer will listen for POST requests on.|
|CONVEYER_HOST|localhost|This determines the interface(s) Conveyer will listen on.|
|CONVEYER_LOGS|/tmp/logs|This determines the temporary repository of logs that Conveyer will store events in. This file is what's rotated upon receiving a /rotate request.|
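As a sketch of how these three variables and their documented defaults might be consumed (the function name and dictionary keys below are illustrative, not Conveyer's actual code):

```python
import os

def load_config(env=None):
    # Illustrative only: read the three documented variables,
    # falling back to the documented defaults.
    env = os.environ if env is None else env
    return {
        "host": env.get("CONVEYER_HOST", "localhost"),
        "port": int(env.get("CONVEYER_PORT", "10100")),
        "logs": env.get("CONVEYER_LOGS", "/tmp/logs"),
    }
```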
For example, a fully customized invocation of Conveyer might look like this:

```sh
CONVEYER_PORT=8080 CONVEYER_HOST=::1 CONVEYER_LOGS=/home/myself/mylogs python conveyer/conveyer.py
```
Assuming you installed the plugin correctly, it should run automatically whenever Rackspace Cloud Monitoring polls your server. However, you'll need to configure cloud monitoring itself to poll the agent with a custom check.
Create an agent.plugin check named, say, foo-check that runs python conveyer-plugin.py. After a while, you should start to see graphs appear under Visualize » Custom Graphs.
Below is a sample configuration, which may be found by default in /etc/conveyer-plugin.json, as understood by the Conveyer Plugin:
```json
{
    "state_file_name": "/tmp/plugin-state-file.json",
    "conveyer_url": "http://localhost:10100/rotate",
    "metrics": [
        "failure.server.launch.count",
        "failure.authentication.identity.count"
    ]
}
```
The settings have the following meaning:
|Setting|Meaning|
|:------|:------|
|state_file_name|When the plugin runs, it must record running information in order to translate Logstash-provided counts into the gauge measurements that Cloud Monitoring expects. This running data must be persisted from run to run; this setting establishes where it is stored.|
|conveyer_url|This URL will be POSTed to in order to ask the Conveyer daemon to rotate the logs.|
|metrics|An array of metrics as gathered by Logstash. You'll probably want to tailor these to your specific needs.|
If you wish to store the configuration elsewhere, you may launch the plugin with the environment variable CONVEYER_AGENT_PLUGIN_CONFIG set to a different location, like so:

```sh
CONVEYER_AGENT_PLUGIN_CONFIG=/var/conveyer/config.json python conveyer-plugin.py
```
You can test Conveyer with a local deployment of Logstash. I use the following Logstash configuration for this purpose:
```
input {
    generator {
        message => "Launching server failed: UpstreamError('identity error: 401 - Unable to authenticate user with credentials provided.',)"
    }
}

filter {
    if [message] =~ /.*Launching server failed.*/ {
        metrics {
            meter => ["failure.server.launch"]
            add_tag => ["metric"]
            flush_interval => 1
        }
    }

    if [message] =~ /identity error: 401/ {
        metrics {
            meter => ["failure.authentication.identity"]
            add_tag => ["metric"]
            flush_interval => 1
        }
    }
}

output {
    if "metric" in [tags] {
        http {
            http_method => "post"
            url => "http://localhost:10100/log"
        }
    }

    file {
        path => "/tmp/ls.log"
    }
}
```
Save this configuration file inside Logstash's directory, say, as test.config, and invoke it as follows:

```sh
bin/logstash agent -f test.config
```
You should see Logstash produce a ton of log entries in /tmp/ls.log. Periodically, though, it should generate metrics and forward them to Conveyer. You should see Conveyer deposit events in /tmp/conveyer-logs approximately once every second. The flush rate is determined by the flush_interval setting in test.config.
Finally, if you execute:

```sh
curl -X POST http://localhost:10100/rotate
```

you should see the name of the rotated logfile, and /tmp/conveyer-logs should start to fill with all-new content.
Conveyer basically consists of two parts. The daemon is intended to sit on a Logstash instance, which has an HTTP output configured to POST to the conveyer daemon's /log endpoint. The conveyer daemon writes any received data to a log file. So far, this sounds just like what you'd expect from a file output configuration.

When an external tool wants safe access to the logs, it executes a POST request against the conveyer daemon's /rotate endpoint. Conveyer will then close the log file it's currently using, rename it to a new temporary name, (lazily) create a new log file for subsequent Logstash requests, then return the name of the old file to the client. The log consumer is responsible for deleting the log file it's given when it's done using it.
NOTE: Right now, Conveyer statically determines the name of the renamed logfile. This means that rotation will overwrite any previous rotation. Please consider this an implementation detail, and try not to depend on this behavior. Always use the filename returned from Conveyer to gain access to the rotated logs.
We needed to monitor and alert on some aggregated metrics. We configured Logstash to do this for us, but we found that our monitoring agent could not work with this data directly. We needed to write a custom plug-in to make everything work together.
I first thought of using a daemon to keep everything in memory; however, I came to the realization that if the process died for any reason, we'd lose more data than we were comfortable with. Writing data to a file seemed a natural choice to minimize data loss.
At this point, I looked for existing log-rotation solutions. After a day or two of research, I wasn't happy with anything I had found. Knowing I could write my own in about as much time, I decided to solve the problem myself.