reddit/cassandra-reaper

Name: cassandra-reaper

Owner: Reddit

Description: Automated Repair Awesomeness for Apache Cassandra

Forked from: thelastpickle/cassandra-reaper

Created: 2017-02-07 18:15:44.0

Updated: 2017-04-08 19:15:26.0

Pushed: 2017-04-12 23:28:57.0

Homepage:

Size: 1853

Language: Java

README

Reaper for Apache Cassandra

Note: This repo is a fork of the original Reaper project, created by the awesome folks at Spotify. The WebUI has been merged in, and support for incremental repairs has been added.

Reaper is a centralized, stateful, and highly configurable tool for running Apache Cassandra repairs against single or multi-site clusters.

The current version supports running Apache Cassandra cluster repairs in a segmented manner, opportunistically running multiple parallel repairs at the same time on different nodes within the cluster. Basic repair scheduling functionality is also supported.
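To make "segmented" concrete, here is a minimal sketch of splitting one token range into equal-width segments that could then be repaired one at a time. It is an illustration only and not Reaper's actual segment generator; the class name SegmentSketch, the method split, and the choice of equal-width segments are all assumptions.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not taken from the Reaper code base.
final class SegmentSketch {

    /** One repair segment covering the token range (start, end]. */
    static final class Segment {
        final BigInteger start;
        final BigInteger end;

        Segment(BigInteger start, BigInteger end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "(" + start + ", " + end + "]";
        }
    }

    /** Splits (start, end] into `count` contiguous segments of near-equal width. */
    static List<Segment> split(BigInteger start, BigInteger end, int count) {
        if (count < 1 || start.compareTo(end) >= 0) {
            throw new IllegalArgumentException("need count >= 1 and start < end");
        }
        BigInteger width = end.subtract(start);
        BigInteger n = BigInteger.valueOf(count);
        if (n.compareTo(width) > 0) {
            throw new IllegalArgumentException("more segments than tokens in the range");
        }
        List<Segment> segments = new ArrayList<>();
        BigInteger previous = start;
        for (int i = 1; i <= count; i++) {
            // Derive each boundary from the total width so rounding does not accumulate.
            BigInteger next = start.add(width.multiply(BigInteger.valueOf(i)).divide(n));
            segments.add(new Segment(previous, next));
            previous = next;
        }
        return segments;
    }
}
```

For example, SegmentSketch.split(BigInteger.ZERO, BigInteger.valueOf(100), 4) yields four segments whose boundaries fall at 25, 50, 75 and 100. BigInteger is used so that a full 64-bit token range can be split without overflow.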

Reaper comes with a GUI which, if you're running in local mode, can be accessed at http://localhost:8080/webui/

Please see the Issues section for more information on planned development and known issues.

System Overview

Reaper consists of a database containing the full state of the system, a RESTful API, and a CLI tool called spreaper that provides an alternative way to issue commands to a running Reaper instance. Communication with Cassandra nodes in registered clusters is handled through JMX.

Reaper does not use internal caches for state changes regarding running repairs and registered clusters, so any change made to the storage is reflected in the running system dynamically.

You can also run Reaper with memory storage, which is not persistent and is meant to be used for testing purposes only.

This project is built on top of Dropwizard: http://dropwizard.io/

Usage

To run Cassandra Reaper, build a project package using Maven, then execute the resulting Java jar file with the path to the system configuration file as the first and only argument. You can also use the provided bin/cassandra-reaper script to run the service.

When using database-based storage, you must set up a PostgreSQL database yourself and configure Reaper to use it, or use an embedded H2 database (set the appropriate configuration in the yaml file). The schema is initialized or upgraded on startup by Flyway.

When using Cassandra-based storage, you must set up an Apache Cassandra database yourself and configure Reaper to use it. You need to create a keyspace and configure Reaper to use it in the yaml file. The schema will be created by Reaper on the first run.

Reaper uses the dropwizard-cassandra bundle; the full configuration reference is available at: https://github.com/composable-systems/dropwizard-cassandra

Find more information on how to use each storage backend, and on the other available configuration options, in the Configuration section below.

You can call the service directly through the REST API using a tool like curl. You can also use the provided CLI tool in bin/spreaper to call the service.

Run the tool with the -h or --help option to see usage instructions.
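As an alternative to curl, the REST API can also be called from Java. The sketch below builds and sends a single GET request using the JDK's built-in HTTP client (Java 11 or later). It assumes that the Ping Resource is served at /ping and that Reaper listens on http://localhost:8080; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical helper; not part of Reaper or spreaper.
final class ReaperPingSketch {

    // Assumption: the Ping Resource is exposed at "/ping" under the base URL.
    static final String PING_PATH = "/ping";

    /** Builds a GET request for the ping endpoint without sending it. */
    static HttpRequest buildPingRequest(String baseUrl) {
        // Drop a trailing slash so the path is not doubled up.
        String trimmed = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return HttpRequest.newBuilder(URI.create(trimmed + PING_PATH))
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:8080";
        HttpRequest request = buildPingRequest(baseUrl);
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(request.uri() + " -> HTTP " + response.statusCode());
        } catch (IOException e) {
            // Reaper is not running, or is not reachable at this address.
            System.out.println("Could not reach " + request.uri() + ": " + e.getMessage());
        }
    }
}
```

Building the request separately from sending it keeps the URL handling easy to check without a running Reaper instance.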

Note that you can also build a Debian package from this project using debuild, for example: debuild -uc -us -b

Configuration

An example testing configuration YAML file can be found in this project repository: src/test/resources/cassandra-reaper.yaml

The configuration file structure is provided by Dropwizard, and help on configuring the server, database connection, or logging, can be found at: http://dropwizard.io/manual/configuration.html

Storage Backend

Cassandra Reaper can be used with either an ephemeral memory storage or a persistent database. Running Reaper with memory storage, which is not persistent, means that all the registered clusters, column families, and repair runs will be lost upon service restart. The memory-based storage is meant to be used for testing purposes only. Enable this type of storage by using the storageType: memory setting in your config file (enabled by default).

For persistent relational database storage, you must set up either PostgreSQL or H2. You also need to set storageType: database in the config file.

For persistent Apache Cassandra storage, you need to set storageType: cassandra in the config file. You'll also need to fill in the connection details for the Apache Cassandra cluster used to store the Reaper schema (reaper_db by default), in the cassandra: section of the yaml file.

A sample yaml file is available in the resource directory for each storage backend :

For configuring other aspects of the service, see the available configuration options in the later sections of this README.

Reaper Settings

The Reaper service specific configuration values are:

Note that in the server section of the configuration, if you want to bind the service to all interfaces, use the value “0.0.0.0”, or omit the bindHost line entirely. Using “*” as the bind value won't work.

Clusters with closed cross DC JMX ports

For security reasons, it is possible that Reaper will be able to access the nodes of only a single DC through JMX. The allowUnreachableNodes parameter in cassandra-reaper.yaml must then be set to true in order for Reaper to control the repair process through the reachable nodes only. Limitations of this setup are:

Leaving allowUnreachableNodes set to false will prevent all repair sessions as soon as a single node in the cluster is unreachable.
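The rule described above can be sketched as a small decision function. This models only what the text states (with the flag off, one unreachable node blocks repairs; with it on, repairs go through the reachable nodes) and is not taken from Reaper's source. The class name, the method name, and the requirement that at least one node be reachable are assumptions.

```java
import java.util.Map;

// Hypothetical model of the allowUnreachableNodes rule; not Reaper code.
final class UnreachableNodesSketch {

    /**
     * Decides whether a repair session may start.
     *
     * @param jmxReachableByNode    node address mapped to whether JMX is reachable
     * @param allowUnreachableNodes value of the allowUnreachableNodes setting
     */
    static boolean canStartRepair(Map<String, Boolean> jmxReachableByNode,
                                  boolean allowUnreachableNodes) {
        boolean anyReachable = jmxReachableByNode.containsValue(true);
        boolean anyUnreachable = jmxReachableByNode.containsValue(false);
        if (allowUnreachableNodes) {
            // Assumption: at least one reachable node is needed to drive the repair.
            return anyReachable;
        }
        // Flag off: a single unreachable node prevents all repair sessions.
        return anyReachable && !anyUnreachable;
    }
}
```

With a map such as Map.of("10.0.0.1", true, "10.1.0.1", false), the function returns false when allowUnreachableNodes is false and true when it is true.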

REST API

Source code for all the REST resources can be found in the package com.spotify.reaper.resources.

Ping Resource

Cluster Resource

Repair Run Resource

Repair Schedule Resource

Building and running Reaper

To build Reaper without rebuilding the UI, run the following command :

To only regenerate the UI (requires npm and bower):

To rebuild both the UI and Reaper :

Running Reaper

After modifying the `resource/cassandra-reaper.yaml` config file, Reaper can be started using the following command line:

Once started, the UI can be accessed at: http://127.0.0.1:8080/webui/

Reaper can also be accessed using the REST API exposed on port 8080, or using the command line tool bin/spreaper.

