Name: puppet-prometheus
Owner: Vox Pupuli
Description: Puppet module for prometheus
Forked from: bastelfreak/puppet-prometheus2
Created: 2016-02-29 15:20:39.0
Updated: 2017-11-30 22:05:00.0
Pushed: 2018-01-17 16:50:32.0
Homepage: null
Size: 289
Language: Puppet
| Prometheus Version | Recommended Puppet Module Version |
| ------------------ | --------------------------------- |
| >= 0.16.2          | latest                            |

node_exporter >= 0.15.0
consul_exporter >= 0.3.0
This module automates the installation and configuration of the Prometheus monitoring tool (see the Prometheus web site).
To set up a prometheus daemon:

On the server (for prometheus version < 1.0.0):

    class { '::prometheus':
      global_config  => { 'scrape_interval' => '15s', 'evaluation_interval' => '15s', 'external_labels' => { 'monitor' => 'master' } },
      rule_files     => [ '/etc/prometheus/alert.rules' ],
      scrape_configs => [
        { 'job_name'        => 'prometheus',
          'scrape_interval' => '10s',
          'scrape_timeout'  => '10s',
          'target_groups'   => [
            { 'targets' => [ 'localhost:9090' ],
              'labels'  => { 'alias' => 'Prometheus' }
            }
          ]
        }
      ],
    }

On the server (for prometheus version >= 1.0.0):

    class { 'prometheus':
      version        => '1.0.0',
      scrape_configs => [ { 'job_name' => 'prometheus', 'scrape_interval' => '30s', 'scrape_timeout' => '30s', 'static_configs' => [ { 'targets' => [ 'localhost:9090' ], 'labels' => { 'alias' => 'Prometheus' } } ] } ],
      extra_options  => '-alertmanager.url http://localhost:9093 -web.console.templates=/opt/prometheus-1.0.0.linux-amd64/consoles -web.console.libraries=/opt/prometheus-1.0.0.linux-amd64/console_libraries',
      localstorage   => '/prometheus/prometheus',
    }

On the server (for prometheus version >= 2.0.0):

    class { '::prometheus':
      version        => '2.0.0',
      alerts         => { 'groups' => [ { 'name' => 'alert.rules', 'rules' => [ { 'alert' => 'InstanceDown', 'expr' => 'up == 0', 'for' => '5m', 'labels' => { 'severity' => 'page' }, 'annotations' => { 'summary' => 'Instance {{ $labels.instance }} down', 'description' => '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.' } } ] } ] },
      scrape_configs => [
        { 'job_name'        => 'prometheus',
          'scrape_interval' => '10s',
          'scrape_timeout'  => '10s',
          'static_configs'  => [
            { 'targets' => [ 'localhost:9090' ],
              'labels'  => { 'alias' => 'Prometheus' }
            }
          ]
        }
      ],
    }

or simply:

    include ::prometheus

To add alert rules when using prometheus < 2.0, add the following parameter to the prometheus class:
alerts => [{ 'name' => 'InstanceDown', 'condition' => 'up == 0', 'timeduration' => '5m', labels => [{ 'name' => 'severity', 'content' => 'page'}], 'annotations' => [{ 'name' => 'summary', content => 'Instance {{ $labels.instance }} down'}, {'name' => 'description', content => '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.' }]}]
or in hiera:

    alertrules:
      -
        name: 'InstanceDown'
        condition: 'up == 0'
        timeduration: '5m'
        labels:
          -
            name: 'severity'
            content: 'critical'
        annotations:
          -
            name: 'summary'
            content: 'Instance {{ $labels.instance }} down'
          -
            name: 'description'
            content: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'

With prometheus >= 2.0, alert rules use the new yaml format (https://prometheus.io/docs/prometheus/2.0/migration/#recording-rules-and-alerts):
alerts => { 'groups' => [{ 'name' => 'alert.rules', 'rules' => [{ 'alert' => 'InstanceDown', 'expr' => 'up == 0', 'for' => '5m', 'labels' => { 'severity' => 'page', }, 'annotations' => { 'summary' => 'Instance {{ $labels.instance }} down', 'description' => '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.' } }]}]},
or in hiera:

    alerts:
      groups:
        - name: alert.rules
          rules:
            - alert: 'InstanceDown'
              expr: 'up == 0'
              for: '5m'
              labels:
                'severity': 'page'
              annotations:
                'summary': 'Instance {{ $labels.instance }} down'
                'description': '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'

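The one-line `alerts` hash is hard to read and extend. As a sketch only, the same rule can be held in a variable and wrapped in the `groups` structure; the variable name `$instance_down` is an illustrative assumption, while every key and value is taken from the example above. Single quotes keep Puppet from interpolating `$labels`.

```puppet
# Hypothetical variable name; the rule content matches the example above.
$instance_down = {
  'alert'       => 'InstanceDown',
  'expr'        => 'up == 0',
  'for'         => '5m',
  'labels'      => { 'severity' => 'page' },
  'annotations' => {
    # Single-quoted so that {{ $labels.* }} reaches Prometheus untouched.
    'summary'     => 'Instance {{ $labels.instance }} down',
    'description' => '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.',
  },
}

class { '::prometheus':
  version => '2.0.0',
  alerts  => {
    'groups' => [
      { 'name' => 'alert.rules', 'rules' => [ $instance_down ] },
    ],
  },
}
```

Further rules would then be added as more hashes in the `rules` array.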
On the monitored nodes:

    include prometheus::node_exporter

or:

    class { 'prometheus::node_exporter':
      version            => '0.12.0',
      collectors_disable => [ 'loadavg', 'mdadm' ],
      extra_options      => '--collector.ntp.server ntp1.orange.intra',
    }

For more information about class parameters, see the docstring of each class.
Real Prometheus >=2.0.0 setup example including alertmanager and slack_configs.

    include profiles::prometheus

    class { '::prometheus':
      version              => '2.0.0',
      alerts               => { 'groups' => [ { 'name' => 'alert.rules', 'rules' => [ { 'alert' => 'InstanceDown', 'expr' => 'up == 0', 'for' => '5m', 'labels' => { 'severity' => 'page' }, 'annotations' => { 'summary' => 'Instance {{ $labels.instance }} down', 'description' => '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.' } } ] } ] },
      scrape_configs       => [
        { 'job_name'        => 'prometheus',
          'scrape_interval' => '10s',
          'scrape_timeout'  => '10s',
          'static_configs'  => [
            { 'targets' => [ 'localhost:9090' ],
              'labels'  => { 'alias' => 'Prometheus' }
            }
          ]
        },
        { 'job_name'        => 'node',
          'scrape_interval' => '5s',
          'scrape_timeout'  => '5s',
          'static_configs'  => [
            { 'targets' => [ 'nodexporter.domain.com:9100' ],
              'labels'  => { 'alias' => 'Node' }
            }
          ]
        }
      ],
      alertmanagers_config => [ { 'static_configs' => [ { 'targets' => [ 'localhost:9093' ] } ] } ],
    }

    class { '::prometheus::alertmanager':
      version   => '0.13.0',
      route     => { 'group_by' => [ 'alertname', 'cluster', 'service' ], 'group_wait' => '30s', 'group_interval' => '5m', 'repeat_interval' => '3h', 'receiver' => 'slack' },
      receivers => [ { 'name' => 'slack', 'slack_configs' => [ { 'api_url' => 'https://hooks.slack.com/services/ABCDEFG123456', 'channel' => '#channel', 'send_resolved' => true, 'username' => 'username' } ] } ],
    }

The same in hiera:

    prometheus::version: '2.0.0'
    prometheus::scrape_configs:
      - job_name: 'nodexporter'
        scrape_interval: '10s'
        scrape_timeout: '10s'
        static_configs:
          - targets:
              - nodexporter.domain.com:9100
            labels:
              alias: 'nodexporter'
      - job_name: prometheus
        scrape_interval: 10s
        scrape_timeout: 10s
        static_configs:
          - targets:
              - localhost:9090
            labels:
              alias: Prometheus
    prometheus::alerts:
      groups:
        - name: alert.rules
          rules:
            - alert: 'InstanceDown'
              expr: 'up == 0'
              for: '5m'
              labels:
                'severity': 'page'
              annotations:
                'summary': 'Instance {{ $labels.instance }} down'
                'description': '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
    prometheus::alertmanagers_config:
      - static_configs:
          - targets:
              - localhost:9093
    prometheus::alertmanager::version: '0.13.0'
    prometheus::alertmanager::route:
      group_by:
        - alertname
        - cluster
        - service
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 3h
      receiver: slack
    prometheus::alertmanager::receivers:
      - name: slack
        slack_configs:
          - api_url: https://hooks.slack.com/services/ABCDEFG123456
            channel: "#channel"
            send_resolved: true
            username: username

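When many nodes run the node exporter, listing every target by hand becomes tedious. A minimal sketch, assuming Puppet 4 or later (for the `map` function) and a hypothetical host list `$node_hosts`, builds the `targets` array for the `node` job. Port 9100 is the one used in the example above.

```puppet
# Hypothetical inventory of monitored hosts; replace with your own source.
$node_hosts = [ 'node1.domain.com', 'node2.domain.com' ]

# Append the node_exporter port from the example above to each host name.
$node_targets = $node_hosts.map |$host| { "${host}:9100" }

class { '::prometheus':
  version        => '2.0.0',
  scrape_configs => [
    { 'job_name'        => 'node',
      'scrape_interval' => '5s',
      'scrape_timeout'  => '5s',
      'static_configs'  => [
        { 'targets' => $node_targets,
          'labels'  => { 'alias' => 'Node' },
        },
      ],
    },
  ],
}
```

The host list could equally come from hiera or a PuppetDB query; that choice is outside what this README covers.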
Test your commits with Vagrant: https://github.com/kalinux/vagrant-puppet-prometheus.git
In version 0.1.14 of this module the alertmanager was configured to run as the service alert_manager. This has been changed in version 0.2.00 to be alertmanager.
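On nodes first provisioned with 0.1.14, the old service may still be running after an upgrade. The following is only a sketch, assuming the old service definition is still installed under the name `alert_manager` and that nothing else in the catalog declares it. It uses Puppet's built-in `service` resource, not anything provided by this module.

```puppet
# Assumption: the pre-0.2.00 service is still present as 'alert_manager'.
# Stop it and keep it from starting at boot so only 'alertmanager' runs.
service { 'alert_manager':
  ensure => stopped,
  enable => false,
}
```

Once every node has converged, the resource can be removed from the manifest.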
Do not use version 1.0.0 of Prometheus (see https://groups.google.com/forum/#!topic/prometheus-developers/vuSIxxUDff8); it breaks compatibility with this module!
Although the module has templates for several Linux distributions, only RH family distributions were tested.
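One way to keep that release out of a catalog is to check the version in a wrapper class before declaring the module. This is a sketch under stated assumptions: the class name `profiles::prometheus_guard` and its parameter are hypothetical, and `fail()` is Puppet's built-in function, so the check is not a feature of this module.

```puppet
# Hypothetical wrapper class; only the 'version' parameter comes from this module.
class profiles::prometheus_guard (
  String $prometheus_version = '2.0.0',
) {
  # Abort catalog compilation for the release this README warns about.
  if $prometheus_version == '1.0.0' {
    fail('Prometheus 1.0.0 is not compatible with this module; choose another release.')
  }

  class { '::prometheus':
    version => $prometheus_version,
  }
}
```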