yahoo/viper

Name: viper

Owner: Yahoo Inc.

Description: Viper is a utility that returns a live host from a list of host names.

Created: 2016-07-28 21:58:22.0

Updated: 2017-09-25 13:39:57.0

Pushed: 2017-09-27 05:02:19.0

Homepage:

Size: 102

Language: Java


README


Viper

Viper is a utility that returns a live host from a list of host names. Viper continuously monitors each host name and reports when any host becomes unresponsive. Several policies are available for picking one live host from the list, such as the round-robin policy used in the usage example below.

Usage
import java.util.Arrays;
import java.util.List;

import com.yahoo.viper.*;

// Create list of hosts to monitor
List<HostInfo> hosts = Arrays.asList(
    new HostInfo("yahoo.com", 80),
    new HostInfo("http://google.com"),
    new HostInfo("mysql.db", 3365));

// Ping the hosts every 5 seconds and return live hosts in round robin fashion
int checkPeriodMs = 5000;
int retries = 0;
HostMonitor hmonitor = new HostMonitor("Name", LoadBalancingPolicy.ROUND_ROBIN, checkPeriodMs, retries);

// Get a live host
HostInfo host = hmonitor.liveHost();
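
To make the round-robin policy concrete, here is a minimal, self-contained sketch of the idea. It is not Viper's implementation: the class `RoundRobinSketch`, its nested `Host` type, and the `live` flag are hypothetical names chosen for illustration, and returning `null` when no host is live is an assumption of this sketch only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of round-robin selection over live hosts.
// None of these names come from the Viper library.
class RoundRobinSketch {
    /** A host name plus a liveness flag that a background checker would update. */
    static final class Host {
        final String name;
        volatile boolean live = true;

        Host(String name) {
            this.name = name;
        }
    }

    private final List<Host> hosts;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSketch(List<Host> hosts) {
        this.hosts = new ArrayList<>(hosts);
    }

    /** Returns the next live host in rotation, or null if no host is live (assumption). */
    Host liveHost() {
        int n = hosts.size();
        // Try each host at most once, starting from the rotating cursor.
        for (int i = 0; i < n; i++) {
            int idx = Math.floorMod(next.getAndIncrement(), n);
            Host h = hosts.get(idx);
            if (h.live) {
                return h;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Host a = new Host("a.example");
        Host b = new Host("b.example");
        RoundRobinSketch rr = new RoundRobinSketch(Arrays.asList(a, b));
        System.out.println(rr.liveHost().name);
        System.out.println(rr.liveHost().name);
        b.live = false; // simulate the checker marking a host as unresponsive
        System.out.println(rr.liveHost().name);
    }
}
```

In this sketch, hosts that are marked as not live are skipped, so callers keep receiving the remaining live hosts in rotation.
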
Logging

The logging output has been carefully crafted to provide useful information with as little noise as possible.

The server Tool

The server tool is used to simulate server conditions in order to test the monitoring library. Start the tool using the following command.

bin/server
Commands:
  <port> up - handle requests to the port
  <port> down - stop handling requests to the port
  <port> hang - hang requests to the port
  <port> error - fail requests to the port

To create two servers listening on ports 2000 and 2001, type

2000 up
2001 up

To cause one to hang and another to error, type

2000 hang
2001 error

To restore one of the servers, type

2000 up

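
The command syntax listed above (`<port>` followed by `up`, `down`, `hang`, or `error`) is simple enough to parse in a few lines. The following is only a sketch of how such a line could be parsed; `ServerCommandSketch` and its `Action` enum are hypothetical names and do not describe how the server tool itself is written.

```java
import java.util.Locale;

// Hypothetical parser for lines of the form "<port> <action>".
// This is an illustration, not code from the server tool.
class ServerCommandSketch {
    enum Action { UP, DOWN, HANG, ERROR }

    final int port;
    final Action action;

    ServerCommandSketch(int port, Action action) {
        this.port = port;
        this.action = action;
    }

    /** Parses a command line; throws IllegalArgumentException on malformed input. */
    static ServerCommandSketch parse(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected: <port> <action>");
        }
        // NumberFormatException is a subclass of IllegalArgumentException.
        int port = Integer.parseInt(parts[0]);
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        // valueOf throws IllegalArgumentException for an unknown action name.
        Action action = Action.valueOf(parts[1].toUpperCase(Locale.ROOT));
        return new ServerCommandSketch(port, action);
    }

    public static void main(String[] args) {
        ServerCommandSketch cmd = parse("2000 hang");
        System.out.println(cmd.port + " -> " + cmd.action);
    }
}
```
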
The monitor Tool

The monitor tool is a convenient way to test the monitoring library on one or more servers. It can be used on the simulated servers created by the server tool or on real server instances (or both).

The monitor tool accepts three types of host specifications:

Here's an example of using the tool to monitor two hosts. The output is annotated with comments, which are prefixed with #.

bin/monitor http://localhost:2000 2001

# All is well. This message is printed at least once a minute so you know the library is running fine.
-02-12 23:49:25,256 [INFO] All hosts are up. (period=500ms). (120 checks)

# The monitor tool fetches a live host and prints what it fetched
-02-12 23:49:26,635 [INFO] monitor: host localhost/127.0.0.1:2001 is live

# Notice that the next live host fetched is the other host in the list
-02-12 23:49:36,635 [INFO] monitor: host http://localhost:2000 is live

# In the server tool, the command "2000 hang" was issued. The following status will be printed
# at least once a minute while the situation remains
-02-12 23:49:41,840 [WARN] 1 out of 2 hosts are not live: http://localhost:2000(hung)

# The monitor tool registered for status updates from the library. This is the notification
# message (along with other information as well; see docs).
-02-12 23:49:41,840 [WARN] monitor: 1 out of 2 hosts are not live: http://localhost:2000(hung)
-02-12 23:49:46,640 [INFO] monitor: host localhost/127.0.0.1:2001 is live
-02-12 23:49:56,641 [INFO] monitor: host localhost/127.0.0.1:2001 is live
-02-12 23:50:01,949 [INFO] localhost/127.0.0.1:2001: Connection refused

# In the server tool, the command "2001 down" was issued. Notice that the logging level is ERROR
# because all hosts are unavailable.
-02-12 23:50:02,453 [ERROR] All 2 hosts are not live: http://localhost:2000(hung) localhost/127.0.0.1:2001
-02-12 23:50:02,453 [ERROR] monitor: All 2 hosts are not live: http://localhost:2000(hung) localhost/127.0.0.1:2001
-02-12 23:50:06,644 [INFO] monitor: no live hosts
-02-12 23:50:16,649 [INFO] monitor: no live hosts

# This is the error being emitted by localhost:2001. Notice that the errors are suppressed
# if they are the same, to avoid polluting the logs. The number of suppressions is printed.
-02-12 23:50:20,563 [INFO] localhost/127.0.0.1:2001: Connection refused [logged 36 times]

# In the server tool, the command "2001 up" was issued
-02-12 23:50:20,563 [INFO] Thread[Checker-9,5,main] localhost/127.0.0.1:2001: is now live
-02-12 23:50:21,065 [WARN] 1 out of 2 hosts are not live: http://localhost:2000(hung)
-02-12 23:50:21,066 [WARN] monitor: 1 out of 2 hosts are not live: http://localhost:2000(hung)
-02-12 23:50:26,652 [INFO] monitor: host localhost/127.0.0.1:2001 is live
-02-12 23:50:32,498 [INFO] http://localhost:2000: Unexpected end of file from server

# In the server tool, the command "2000 up" was issued
-02-12 23:50:32,644 [INFO] Thread[Checker-9,5,main] http://localhost:2000: is now live

# All is well again
-02-12 23:50:33,146 [INFO] All hosts are up. (period=500ms). (24 checks)
-02-12 23:50:33,147 [INFO] monitor: All hosts are up. (period=500ms)
-02-12 23:50:36,655 [INFO] monitor: host http://localhost:2000 is live
-02-12 23:50:46,655 [INFO] monitor: host localhost/127.0.0.1:2001 is live
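
The annotated output above shows repeated identical errors being collapsed into a single line with a count (for example, `Connection refused [logged 36 times]`). A minimal sketch of that kind of suppression follows. It is an assumption about the general technique, not Viper's logging code: `SuppressingLoggerSketch` and its methods are hypothetical, and messages are collected in a list rather than sent to a real logger.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of suppressing consecutive duplicate log messages.
// The first occurrence is emitted immediately; repeats are only counted and
// summarized once a different message arrives or flush() is called.
class SuppressingLoggerSketch {
    private final List<String> out = new ArrayList<>();
    private String last;
    private int repeats;

    void log(String message) {
        if (message.equals(last)) {
            repeats++; // same as the previous message: count it, emit nothing
            return;
        }
        flush();
        out.add(message);
        last = message;
        repeats = 1;
    }

    /** Emits a summary line if the last message was logged more than once. */
    void flush() {
        if (last != null && repeats > 1) {
            out.add(last + " [logged " + repeats + " times]");
        }
        last = null;
        repeats = 0;
    }

    List<String> lines() {
        return out;
    }

    public static void main(String[] args) {
        SuppressingLoggerSketch log = new SuppressingLoggerSketch();
        for (int i = 0; i < 3; i++) {
            log.log("Connection refused");
        }
        log.log("is now live");
        for (String line : log.lines()) {
            System.out.println(line);
        }
    }
}
```
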
