Deploying Ceph on my SOHO servers

I spent 15 years working as a sysadmin, so I like to manage my own infrastructure.
I have some servers hosted in a datacenter covering my internet services, but to test OpenStack installers I like to use local VMs, so I bought a headless desktop machine with a decent processor and a bunch of memory and set it up in a closet. Then I had to buy another, and another.
Now, with three machines and a decent number of terabytes of disk on each one, it occurred to me that I should cluster this disk space so I can stop keeping track of the space I'm using and be able to scale out using cheap SoC boards like a Raspberry Pi with a big disk attached.
I was about to use Swift, but a friend of mine told me I should use Ceph. At first I was skeptical because I had tested it five years ago with not very pleasant results, but after some beers and a big list of features courtesy of my friend, he convinced me and I decided to give it a try…

Configure the nodes

Installation

Prepare Ceph Repo

Save this content into /etc/yum.repos.d/ceph.repo on each node

[ceph-noarch]
name=Ceph noarch packages
baseurl=http://ceph.com/rpm-hammer/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
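
If you'd rather not edit the file by hand on every node, you can push it out from the admin node over SSH; a minimal sketch, assuming root SSH logins are allowed and using the node hostnames from the rest of this post:

[root@cloud /]# for node in cloud2.soho cloud3.soho; do
>   scp /etc/yum.repos.d/ceph.repo root@${node}:/etc/yum.repos.d/ceph.repo  # copy the repo file verbatim
> done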

Admin Node

[root@cloud /]# dnf install -y ceph ceph-deploy ceph-radosgw ceph-fuse

Other Nodes

[root@cloud2 /]# dnf install -y ceph ceph-fuse ntp ntpdate ntp-doc
[root@cloud3 /]# dnf install -y ceph ceph-fuse ntp ntpdate ntp-doc

Create Ceph user on all Nodes

[root@cloud /]# groupadd ceph && useradd -m -c 'Ceph User' -d /home/ceph -s /bin/bash -g ceph cloud-storage
[root@cloud ~]# passwd cloud-storage # set temporary password
Changing password for user cloud-storage.
New password: 
Retype new password: 
passwd: all authentication tokens updated successfully.
[root@cloud2 /]# groupadd ceph && useradd -m -c 'Ceph User' -d /home/ceph -s /bin/bash -g ceph cloud-storage
[root@cloud2 ~]# passwd cloud-storage # set temporary password
Changing password for user cloud-storage.
New password: 
Retype new password: 
passwd: all authentication tokens updated successfully.
[root@cloud3 /]# groupadd ceph && useradd -m -c 'Ceph User' -d /home/ceph -s /bin/bash -g ceph cloud-storage
[root@cloud3 ~]# passwd cloud-storage # set temporary password
Changing password for user cloud-storage.
New password: 
Retype new password: 
passwd: all authentication tokens updated successfully.

Set sudo privileges on all Nodes

[root@cloud /]# echo "cloud-storage ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/cloud-storage
cloud-storage ALL = (root) NOPASSWD:ALL
[root@cloud /]# chmod 0440 /etc/sudoers.d/cloud-storage
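
Before moving on you can check that the drop-in parses cleanly; visudo has a check mode for this:

[root@cloud /]# visudo -c -f /etc/sudoers.d/cloud-storage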

Create SSH Key for passwordless login

Log in as the cloud-storage user and create the key

[root@cloud /]# su - cloud-storage
cloud-storage@cloud .ssh$ ssh-keygen 
Generating public/private rsa key pair.
Enter file in which to save the key (/home/ceph/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/ceph/.ssh/id_rsa.
Your public key has been saved in /home/ceph/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:RYhCS3iot3eu4b/Tus3wRqQ3TZ6slGbgFceTIZryuE5 cloud-storage@cloud
The key's randomart image is:
+---[RSA 2048]----+
|  .o+.  ..o.     |
|  oo . o o  .    |
|   o. o * o..    |
|  .  . *.=oo  .  |
|      + S+  .. . |
|     E+...  .. ..|
|     +.+. ..  . o|
|    +.o.  ++.  o.|
|   o ..  +*+  .o*|
+----[SHA256]-----+

Copy the public key to the nodes

cloud-storage@cloud .ssh$ cat id_rsa.pub | ssh cloud2.soho 'mkdir .ssh; chmod 700 .ssh; cat - > .ssh/authorized_keys'
The authenticity of host 'cloud2.soho (192.168.1.10)' can't be established.
ECDSA key fingerprint is SHA256:PyRn3Jdkdh46dktuS6YgdkK7OKFpzpZdDJiTY46gsbF.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'cloud2.soho,192.168.1.10' (ECDSA) to the list of known hosts.
cloud-storage@cloud2.soho's password: 
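
Do the same for the third node, just pointing the one-liner at cloud3.soho:

cloud-storage@cloud .ssh$ cat id_rsa.pub | ssh cloud3.soho 'mkdir .ssh; chmod 700 .ssh; cat - > .ssh/authorized_keys'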

Open Firewall port on the Nodes

[root@cloud2 ~]# firewall-cmd --zone=public --add-port=6789/tcp --permanent
success
[root@cloud2 ~]# firewall-cmd --zone=public --add-port=6800-7300/tcp --permanent
success
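
The --permanent rules only take effect on the next reload, so reload firewalld (or repeat the commands without --permanent) to open the ports right away:

[root@cloud2 ~]# firewall-cmd --reload
success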

Disable SELinux

[root@cloud2 ~]# setenforce 0
[root@cloud2 ~]# sed -i 's/SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config # persistent config

Create the Cluster

You have to be logged in as the cloud-storage user

Create the cluster. I want my admin node to also be a regular node:

cloud-storage@cloud ~$ ceph-deploy new cloud.soho

Change the default number of replicas so Ceph can achieve an active + clean state with just two OSDs:

cloud-storage@cloud ~$ echo "osd pool default size = 2" >> ceph.conf

Ensure the cluster uses our LAN as the public network:

cloud-storage@cloud ~$ echo "public network = 192.168.1.0/255.255.255.0" >> ceph.conf

Install Ceph

I'm using Fedora 22, and the radosgw package was renamed to ceph-radosgw, so I had to create this little patch:

--- install.py  2015-06-14 17:41:18.646794848 -0600
+++ /usr/lib/python2.7/site-packages/ceph_deploy/hosts/fedora/install.py        2015-06-14 17:22:35.056856530 -0600
@@ -82,6 +82,6 @@
             '-q',
             'install',
             'ceph',
-            'radosgw',
+            'ceph-radosgw',
         ],
     )

Use the --no-adjust-repos flag in case you run into repository problems like I did:

cloud-storage@cloud ~$ sudo ceph-deploy install --no-adjust-repos  cloud.soho cloud2.soho cloud3.soho
cloud-storage@cloud ~$ ceph-deploy mon create-initial

Repeat for all the nodes and block devices you're planning to use

cloud-storage@cloud ~$ ceph-deploy disk zap cloud2.soho:sdb
cloud-storage@cloud ~$ ceph-deploy disk zap cloud2.soho:sdc
cloud-storage@cloud ~$ ceph-deploy osd prepare cloud2.soho:sdb:/dev/sdb
cloud-storage@cloud ~$ ceph-deploy osd prepare cloud2.soho:sdc:/dev/sdc
cloud-storage@cloud ~$ ceph-deploy osd activate cloud2.soho:sdb1
cloud-storage@cloud ~$ ceph-deploy osd activate cloud2.soho:sdc1

Set directories as osds on the first node

cloud-storage@cloud ~$ sudo mkdir -p /space/ceph/osd0
cloud-storage@cloud ~$ ceph-deploy osd prepare cloud.soho:/space/ceph/osd0
cloud-storage@cloud ~$ ceph-deploy osd activate  cloud.soho:/space/ceph/osd0
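
At this point it's worth confirming that every OSD registered and is up and in; the same commands show up again in the health checklist further down:

cloud-storage@cloud ~$ ceph osd stat
cloud-storage@cloud ~$ ceph osd tree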

The create option is a shortcut for prepare and activate…

ceph-deploy osd create osdserver1:sdb:/dev/ssd1

Create a metadata server

cloud-storage@cloud ~$ ceph-deploy mds create cloud3.soho

Create the filesystem
This was particularly painful because the quick start documentation does not detail this step; I had to dig through the documentation and mailing lists to figure out that I had to do this.

ceph mds newfs 0 0 --yes-i-really-mean-it
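
For reference, newer releases expose this through ceph fs new instead of mds newfs; a rough equivalent would be something like the following, where the pool names and PG count are placeholders I picked rather than anything from my setup:

ceph osd pool create cephfs_data 64
ceph osd pool create cephfs_metadata 64
ceph fs new cephfs cephfs_metadata cephfs_data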


Ceph Object Gateway

Ceph comes with an object gateway compatible with S3 and Swift; they call it the RADOS Gateway (RGW).

To enable this feature you have to create a new instance of RGW:
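
With a recent enough ceph-deploy (the rgw subcommand showed up around the Hammer timeframe) this should be roughly a one-liner; I'm using the admin node as the gateway host here, but any node would do:

cloud-storage@cloud ~$ ceph-deploy rgw create cloud.soho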



Check the cluster

Check the Ceph status:
You should see HEALTH_OK; if you don't, check that:

  • The OSDs are up: ceph osd stat, ceph osd tree
  • The monitor server is up: ceph mon stat
  • The metadata server is up and active: ceph mds stat
cloud-storage@cloud ~$ ceph health
HEALTH_OK
cloud-storage@cloud ~$ ceph -w
    cluster d7e6610b-07e6-49fa-8121-d05d72e85a0b
     health HEALTH_OK
     monmap e1: 1 mons at {cloud=192.168.1.68:6789/0}
            election epoch 1, quorum 0 cloud
     mdsmap e5: 1/1/1 up {0=cloud3.soho=up:active}
     osdmap e38434: 5 osds: 5 up, 5 in
      pgmap v58991: 64 pgs, 1 pools, 314 MB data, 100 objects
            271 GB used, 7931 GB / 8205 GB avail
                  64 active+clean

2015-06-21 13:32:04.919306 mon.0 [INF] pgmap v58991: 64 pgs: 64 active+clean; 314 MB data, 271 GB used, 7931 GB / 8205 GB avail

I was confused when I ran ceph -w because I didn't have the metadata server active (without it you can't mount the cluster) and it didn't issue any warning; the MDS line just didn't show up. This might seem obvious to a seasoned Ceph user, but for a newbie it was a hard problem to track down.

Mount your new cluster :)

# mount -t ceph 192.168.1.68:/ /mnt
# mount | grep ceph
192.168.1.68:/ on /mnt type ceph (rw,relatime,nodcache,nofsc,acl)
#  df -h | grep mnt
192.168.1.68:/                   8.1T   274G  7.8T   4% /mnt
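
If you'd rather not depend on the kernel client, the ceph-fuse package installed earlier gives you a userspace alternative; something along these lines should work, pointing at the same monitor address used above:

# ceph-fuse -m 192.168.1.68:6789 /mnt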

That's it. At first this appears to be very easy, but if you are like me and like to poke under the hood while learning, you might hit some brick walls. I had problems with the metadata server, the keyring and the version of Fedora I was using, so I hope these notes help others avoid the same problems I had.

Troubleshooting

There are times when PGs have problems; most of the time they are caused by lack of connectivity between the nodes. In case you get stale, inactive and/or unclean PGs, you can run this little bash script I put together to list them:

#!/bin/bash
# List the placement groups stuck in each problematic state so you
# know which PGs (and which OSDs) need attention.
ceph pg dump_stuck stale
ceph pg dump_stuck inactive
ceph pg dump_stuck unclean
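
Once you know which PGs are stuck, ceph health detail and a per-PG query usually tell you why; replace the placeholder with a real PG id from the dump above:

ceph health detail
ceph pg <pg-id> query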

Benchmarking

One of the things I like about Ceph is that it's very well documented; the benchmarking guide on their wiki (https://wiki.ceph.com/Guides/How_To/Benchmark_Ceph_Cluster_Performance) is a very good starting point, with good explanations of the techniques and tools they provide for benchmarking.
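
As a small taste of what that guide covers, rados bench lets you throw synthetic load at a pool; a minimal sketch, using a throwaway pool name of your choosing:

rados bench -p <pool-name> 10 write --no-cleanup
rados bench -p <pool-name> 10 seq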
