Name: prometheus-on-PCF
Owner: Pivotal Cloud Foundry
Description: This is a how-to for deploying https://github.com/cloudfoundry-community/prometheus-boshrelease to monitor Pivotal Cloud Foundry.
Created: 2017-05-10 18:40:01.0
Updated: 2018-01-18 12:06:04.0
Pushed: 2017-12-19 18:46:27.0
Size: 158
Language: Shell
DEPRECATED
For PCF 1.12+ please use the new pipeline available here: https://github.com/pivotal-cf/pcf-prometheus-pipeline. This repo will remain here for the time being but please migrate to the new pipeline as soon as possible.
This how-to has been tested on PCF 1.8-1.11. The manifest file is appropriate for cloud-config enabled environments.
The manifest example is split into the main part which should not require any customization (at least initially) and the local configuration which has to be adjusted. To merge those files we are using the new BOSH CLI. Documentation is available here.
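The local part is a standard BOSH vars file consumed via `-l`. A minimal sketch of what local.yml might contain (every key and value below is an illustrative assumption — the authoritative variable names are the `((...))` placeholders in prometheus.yml):

```yaml
# Illustrative local.yml vars file; key names are assumptions.
# Match them to the ((variable)) placeholders used in prometheus.yml.
system_domain: sys.example.com
bosh_url: https://10.0.0.6:25555
firehose_exporter_uaa_client_secret: prometheus-client-secret
bosh_exporter_uaa_client_secret: prometheus-client-secret
```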
This is a high-level overview of monitoring Cloud Foundry with Prometheus
Notes:
If you have not targeted this environment before, do the following:
bosh -e <YOUR_BOSH_HOST> --ca-cert root_ca_certificate alias-env myenv
bosh -e myenv login
Now you can upload the releases:
bosh -e myenv upload-release https://bosh.io/d/github.com/bosh-prometheus/prometheus-boshrelease
bosh -e myenv upload-release https://github.com/bosh-prometheus/node-exporter-boshrelease/releases/download/v3.0.0/node-exporter-3.0.0.tgz
You can find the root_ca_certificate file on the OpsManager VM in `/var/tempest/workspaces/default/root_ca_certificate`.
Key components of this BOSH release are firehose_exporter, bosh_exporter and cf_exporter, which retrieve the data (from the CF firehose, the BOSH Director and the Cloud Controller API respectively) and present it in the Prometheus format. Each of those exporters requires credentials to access its data source. IMPORTANT: these users have to be created in two different UAA instances. For the firehose and CF credentials, you use the main UAA instance of the Cloud Foundry deployment (where you would normally create users/clients, such as those for any other nozzles). For bosh_exporter, however, you need to use the UAA that is colocated with the BOSH Director.
This process is explained here: https://github.com/bosh-prometheus/firehose_exporter
uaac target https://uaa.SYSTEM_DOMAIN --skip-ssl-validation
uaac token client get admin -s <YOUR ADMIN CLIENT SECRET>
uaac client add prometheus-firehose \
  --name prometheus-firehose \
  --secret prometheus-client-secret \
  --authorized_grant_types client_credentials,refresh_token \
  --authorities doppler.firehose
uaac client add prometheus-cf \
  --name prometheus-cf \
  --secret prometheus-client-secret \
  --authorized_grant_types client_credentials,refresh_token \
  --authorities cloud_controller.admin
Edit name and secret values. You will need to put them in the manifest later.
uaac target https://BOSH_DIRECTOR:8443 --skip-ssl-validation
uaac token owner get login -s Uaa-Login-Client-Credentials
User name: admin
Password: Uaa-Admin-User-Credentials
uaac client add prometheus-bosh \
  --name prometheus-bosh \
  --secret prometheus-client-secret \
  --authorized_grant_types client_credentials,refresh_token \
  --authorities bosh.read \
  --scope bosh.read
Edit name and secret values. You will need to put them in the manifest later.
Given that PCF uses MySQL internally, you should also monitor it. To do that, create a MySQL user and configure it in local.yml later.
bosh -e myenv -d cf-........ ssh mysql/0
mysql -u root -p
Enter password: (OpsManager -> ERT -> Credentials -> Mysql Admin Credentials)
CREATE USER 'exporter' IDENTIFIED BY 'CHANGE_ME';
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'exporter' WITH MAX_USER_CONNECTIONS 3;
More information about mysqld_exporter is available here.
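mysqld_exporter uses the standard Go MySQL DSN format (user:password@(host:port)/), so the value to configure would be built from the user created above. A sketch, where the key name and the IP address are assumptions — check the mysqld_exporter job spec in the release for the real property name:

```yaml
# Illustrative fragment for local.yml; key name and address are assumptions.
mysql_exporter_datasource: "exporter:CHANGE_ME@(10.0.16.5:3306)/"
```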
Since prometheus.yml changes often to add more functionality (or to adjust to changes in the BOSH release itself), you don't have to edit it. The local configuration that needs to be adjusted is in the local.yml file. Edit URLs, credentials and everything else you need, then merge it with prometheus.yml. The steps are:
bosh interpolate prometheus.yml -l local.yml > manifest.yml
bosh -e myenv -d prometheus deploy manifest.yml
To generate VM passwords you can use: ruby -e 'require "securerandom"; require "unix_crypt"; printf("%s\n", UnixCrypt::SHA512.build(SecureRandom.hex(16), SecureRandom.hex(8)))'
or (change MY_PASSWORD to the password you want):
pip install passlib
python -c 'from passlib.hash import sha512_crypt as sc; print(sc.encrypt("MY_PASSWORD", salt="random", relaxed=True))'
or (requires the whois package installed on a Linux machine):
mkpasswd -s -m sha-512
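If OpenSSL 1.1.1 or newer is installed, its passwd subcommand can also produce the SHA-512 crypt hashes BOSH expects (the salt here is only an example — use a random one):

```shell
# Prints a $6$... SHA-512 crypt hash of MY_PASSWORD.
openssl passwd -6 -salt examplesalt 'MY_PASSWORD'
```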
bosh -e myenv update-runtime-config runtime.yml
From now on, any VM (re)created by BOSH will be running node_exporter. The manifest is already prepared to consume that data. If the deployment was successful, use `bosh vms` to find out the IP address of your nginx server. Then connect:
There are a number of ready-to-use dashboards that should install automatically. You can edit them in Grafana or create your own. They come from prometheus-boshrelease/src.
The prometheus-boshrelease includes predefined alerts for Cloud Foundry as well as for BOSH. You can find the alert definitions in prometheus-boshrelease/src; check the *.alerts rule files in the corresponding folders. If you create new alerts, make sure to add them to prometheus.yml - the path to the alert rule file, as well as a job release for additional new exporters.
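For reference, the *.alerts files in this era of the release use the Prometheus 1.x rule syntax. An illustrative rule built on a bosh_exporter metric (the rule name, threshold and annotations are assumptions, not copied from the release):

```
ALERT BoshJobUnhealthy
  IF bosh_job_healthy == 0
  FOR 5m
  LABELS { severity = "critical" }
  ANNOTATIONS {
    summary = "BOSH job unhealthy",
    description = "BOSH job {{ $labels.bosh_job_name }} has been unhealthy for 5 minutes",
  }
```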
Access the AlertManager to see active alerts or silence them:
All configured rules as well as their current state can be viewed by accessing Prometheus:
Below is an example config for prometheus.yml to send alerts to Slack:
- name: alertmanager
  release: prometheus
  properties:
    alertmanager:
      receivers:
        - name: default-receiver
          slack_configs:
            - api_url: https://hooks.slack.com/services/....
              channel: 'slack-channel'
              send_resolved: true
              pretext: "text before the actual alert message"
              text: "{{ .CommonAnnotations.description }}"
      route:
        receiver: default-receiver
To check your AlertManager configuration you can execute:
curl -H "Content-Type: application/json" -d '[{"labels":{"alertname":"TestAlert1"}}]' <alertmanager>:9093/api/v1/alerts
This should trigger a test alert.