Name: pg_octopus
Owner: Citus Data
Description: A health checker for PostgreSQL
Created: 2016-02-01 11:24:01.0
Updated: 2017-10-12 13:05:01.0
Pushed: 2016-04-02 15:11:25.0
Homepage: null
Size: 15
Language: C
pg_octopus is an extension for PostgreSQL that health checks a cluster of PostgreSQL nodes in the background.
The easiest way to install pg_octopus is to build the sources from GitHub.
git clone https://github.com/citusdata/pg_octopus.git
cd pg_octopus
PATH=/usr/local/pgsql/bin/:$PATH make
sudo PATH=/usr/local/pgsql/bin/:$PATH make install
After installing the extension, run the following in psql:
CREATE EXTENSION pg_octopus;
pg_octopus uses a background worker to perform health checks. To activate the background worker, add pg_octopus to shared_preload_libraries in postgresql.conf and restart PostgreSQL.
# in postgresql.conf
shared_preload_libraries = 'pg_octopus'
To create a health check for a server, simply add its address to the octopus.nodes table.
postgres=# INSERT INTO octopus.nodes VALUES ('10.192.0.247', 5432);
INSERT 0 1
postgres=# SELECT * FROM octopus.nodes;
  node_name   | node_port | health_status
--------------+-----------+---------------
 10.192.0.246 |      5432 |             0
 10.192.0.247 |      5432 |             1
(2 rows)
In the health_status column, 1 means healthy, 0 means unhealthy, and -1 means unknown.
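As an illustration, a client that reads octopus.nodes could translate these codes into labels. The following is a minimal sketch in Python: the helper name describe_health is hypothetical, and the rows are copied from the sample output above rather than fetched from a live server.

```python
# Labels for the health_status codes documented above.
HEALTH_LABELS = {1: "healthy", 0: "unhealthy", -1: "unknown"}


def describe_health(status):
    """Return a readable label for a health_status value (hypothetical helper)."""
    return HEALTH_LABELS.get(status, "unrecognized")


# Rows copied from the sample octopus.nodes output, not queried from a server.
rows = [("10.192.0.246", 5432, 0), ("10.192.0.247", 5432, 1)]
for node_name, node_port, health_status in rows:
    print(f"{node_name}:{node_port} is {describe_health(health_status)}")
```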
pg_octopus performs health checks in rounds of 'health_check_period' by trying to connect to each node using libpq. If it fails to connect within 'health_check_timeout', it tries again after 'health_check_retry_delay', up to 'health_check_max_retries' times. The default configuration values are shown below.
pg_octopus.health_check_period = 10000 # round duration (in ms)
pg_octopus.health_check_timeout = 2000 # connection timeout (in ms)
pg_octopus.health_check_max_retries = 2 # maximum number of retries
pg_octopus.health_check_retry_delay = 1000 # time between consecutive retries (in ms)
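The retry behaviour described above can be sketched roughly as follows. This is not the extension's C code, only a Python approximation under stated assumptions: try_connect is a hypothetical callable standing in for one libpq connection attempt bounded by health_check_timeout, one initial attempt is followed by up to health_check_max_retries retries, and a node that never connects is reported as 0.

```python
import time


def check_node(try_connect, max_retries=2, retry_delay_ms=1000, sleep=time.sleep):
    """Return 1 if any connection attempt succeeds, otherwise 0.

    try_connect is a hypothetical stand-in for a single connection attempt
    that gives up after health_check_timeout; it returns True on success.
    """
    attempts = 1 + max_retries  # assumed: one initial attempt plus the retries
    for attempt in range(attempts):
        if try_connect():
            return 1
        if attempt < attempts - 1:
            # Wait health_check_retry_delay before the next retry.
            sleep(retry_delay_ms / 1000.0)
    return 0


# Simulated node that fails twice and then accepts the connection.
outcomes = iter([False, False, True])
print(check_node(lambda: next(outcomes), sleep=lambda seconds: None))
```

Passing the sleep function as a parameter keeps the sketch easy to run without real delays.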
Note that health_check_timeout + health_check_max_retries * (health_check_retry_delay + health_check_timeout) should be smaller than health_check_period.
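To check a configuration against this rule, the left-hand side of the inequality can be computed directly. The sketch below is in Python and uses hypothetical function names; the numbers are the default values listed above.

```python
def worst_case_check_ms(timeout_ms, max_retries, retry_delay_ms):
    """Evaluate timeout + max_retries * (retry_delay + timeout), in ms."""
    return timeout_ms + max_retries * (retry_delay_ms + timeout_ms)


def fits_in_period(period_ms, timeout_ms, max_retries, retry_delay_ms):
    """True when the worst-case duration is smaller than the round period."""
    return worst_case_check_ms(timeout_ms, max_retries, retry_delay_ms) < period_ms


# Defaults: 2000 + 2 * (1000 + 2000) = 8000 ms, below the 10000 ms period.
print(worst_case_check_ms(2000, 2, 1000))    # 8000
print(fits_in_period(10000, 2000, 2, 1000))  # True
```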