elodina/ceph-mesos

Name: ceph-mesos

Owner: Elodina

Description: A Mesos framework for scaling a Ceph cluster.

Created: 2016-04-14 04:25:07

Updated: 2016-04-14 04:25:08

Pushed: 2016-03-06 07:57:23

Homepage: null

Size: 113

Language: C++

README

Ceph on Apache Mesos

A Mesos framework utilizing Docker for scaling a Ceph cluster. It aims to be a fast and reliable solution.
For now, it can only launch a basic Ceph cluster on Mesos and flex up new OSD(s) on demand (see the sections below).

Goal & Roadmap

Our goal is to make it easy to scale and monitor a large Ceph cluster in a production Mesos environment. This is a work in progress; check below for updates (your ideas are welcome).

Prerequisites
  1. A Mesos cluster with Docker installed (duh). We only support the CentOS 7 distribution at present, and at least 1 slave is required
  2. Slaves in Mesos have a network connection to download Docker images
  3. Install libmicrohttpd on all slaves
     yum -y install libmicrohttpd

Mesos Resource Configuration (Optional)

Ceph-Mesos supports the Mesos "role" setting, so you can constrain the Ceph cluster's resources through Mesos's role or resource configuration. For instance, you may want 10 slaves in the Mesos cluster to have the role "ceph" so that ceph-mesos is deployed only on them.
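For example, with the Mesosphere packages each file under /etc/mesos-slave becomes a command-line flag for mesos-slave, so one way to assign the role is the sketch below (an assumption on our part: it relies on the packaged service scripts and the --default_role slave flag, not on anything ceph-mesos specific):

 # Assumed: Mesosphere RPM packaging, which maps /etc/mesos-slave/<flag> files to --<flag> options
 echo ceph > /etc/mesos-slave/default_role
 systemctl restart mesos-slave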

NOTE: You can also set the "resources" configuration instead of "role" to reserve resources:

 "cpus(*):8;cpus(ceph):4;mem(*):16384;mem(ceph):8192" > /etc/mesos-slave/resources

For detailed configuration options, please refer to:
https://open.mesosphere.com/reference/mesos-master/
https://open.mesosphere.com/reference/mesos-slave/

Build Ceph-Mesos

Pick a host to set up the build environment. Make sure the host can access your Mesos cluster.

 sudo rpm -Uvh http://repos.mesosphere.io/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm
 sudo yum install -y epel-release
 sudo yum groupinstall -y "Development Tools"
 sudo yum install -y cmake mesos protobuf-devel boost-devel gflags-devel glog-devel yaml-cpp-devel jsoncpp-devel libmicrohttpd-devel gmock-devel gtest-devel

Then clone ceph-mesos and build.

 git clone https://github.com/elodina/ceph-mesos.git
 cd ceph-mesos
 mkdir build
 cd build
 cmake ..
 make

After that, you'll see “ceph-mesos”, “ceph-mesos-executor”, “ceph-mesos-tests” , “cephmesos.yml” , “ceph.conf” and “cephmesos.d” in the build directory.

Run Ceph-Mesos

If you want Ceph-Mesos to prepare the disks, configure cephmesos.yml and cephmesos.d/{hostname}.yml before you start.

Ceph-Mesos needs to know which disks are available for launching OSDs; this is what cephmesos.yml and cephmesos.d/{hostname}.yml are for. cephmesos.yml holds the common settings shared by all hosts, while cephmesos.d/{hostname}.yml holds the settings specific to a particular host. Once osddevs and jnldevs are populated, Ceph-Mesos will create partitions and filesystems on those disks and bind-mount them for the OSD containers to use.

For instance, assume we have 5 slaves. 4 of them show the same single disk "sdb" when running "fdisk -l", but slave5 has an additional "sdc". So we need to create a cephmesos.d/slave5.yml that adds "sdc" to the "osddevs" field. In this situation, Ceph-Mesos can use "sdc" to launch containers on slave5, while the other slaves only have "sdb".

You must populate the id, role, master, mgmtdev and datadev fields; the other fields can be left at their defaults. Sample configurations are as follows:

cephmesos.yml:

id:         myceph
role:       ceph
master:     zk://mm01:2181,mm02:2181,mm03:2181/mesos
zookeeper:  ""
restport:   8889
fileport:   8888
fileroot:   ./
mgmtdev:    "192.168.0.0/16" #public network CIDR
datadev:    "10.1.0.0/16"    #cluster network CIDR
osddevs:
  - sdb
  - sdc
jnldevs:
  - sde
jnlparts:   2                #all jnldevs will be parted to 2 partitions for OSDs to use

cephmesos.d/slave5.yml:

osddevs:
  - sdb
  - sdc
  - sdd
jnldevs:
  - sde
jnlparts:   3

The above sample configurations tell ceph-mesos to deploy a Ceph cluster on the Mesos slaves that have the role "ceph". The cluster will have public network CIDR "192.168.0.0/16" and cluster network CIDR "10.1.0.0/16". slave5 will create 3 journal partitions on sde for [sdb-d] to use when launching OSDs (for instance, one OSD's data on sdb1 with journal sde3; another OSD will use sdc1 and sde2), while the other slaves will create 2 journal partitions on sde for [sdb-c].

After finishing all these configurations, we can start ceph-mesos with the command below:

 ./ceph-mesos -config cephmesos.yml

You can now check the Mesos web console to see your Ceph cluster. After about 10 minutes (depending on your network speed), you'll see 5 active tasks running there.
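If you prefer the command line, the same information can be pulled from the Mesos master's state endpoint (a sketch; mesos_master is a placeholder for your master host, and state.json is the standard Mesos master endpoint, nothing specific to ceph-mesos):

 # List the frameworks and tasks known to the master and look for the ceph ones
 curl -s http://mesos_master:5050/master/state.json | grep -i ceph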

NOTE: Currently, if "ceph-mesos" (the scheduler) stops, all containers will be removed, and restarting "ceph-mesos" will clear your data (in the slave's "~/ceph_config_root", which is bind-mounted by the Docker containers) and start a new Ceph cluster. We will improve this in the near future.

Launch new OSD(s)

ceph-mesos accepts JSON-formatted requests and starts new OSD(s) if there are available hosts.

 curl -d '{"instances":2,"profile":"osd"}' http://ceph_scheduler_host:8889/api/cluster/flexup

Verify your Ceph cluster

You can ssh to a Mesos slave running a ceph-mesos task and execute docker commands.

 # You'll probably see an osd0 container running
 docker ps -a
 docker exec osd0 ceph -s

Now you can verify your Ceph cluster!
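Beyond "ceph -s", a couple of standard health checks can be run the same way (a sketch; "osd0" is the container name from the example above, and these are plain Ceph CLI commands rather than anything ceph-mesos specific):

 # Confirm all OSDs are up and in
 docker exec osd0 ceph osd tree
 # Check pool and capacity usage
 docker exec osd0 ceph df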

Performance

We have done some basic performance testing using fio. Compared with a physical Ceph cluster, Ceph-Mesos shows no performance loss and sometimes even performs better (we are digging into why).
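For reference, a minimal fio run of the kind such a comparison might use (an illustrative sketch only; the original test parameters are not given in this README, and /mnt/cephtest is an assumed mount point backed by the cluster):

 # 4k random writes against an assumed Ceph-backed mount point
 fio --name=randwrite --ioengine=libaio --direct=1 --rw=randwrite --bs=4k --size=1G --numjobs=4 --runtime=60 --group_reporting --directory=/mnt/cephtest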

