This workshop walks users through setting up an 8-node Ceph cluster, mounting a block device, using a CephFS mount, and storing a blob object. A Jepsen test is then run against Ceph on the three OSD nodes, osd-1 through osd-3.
It follows these Ceph user guides:
- Preflight checklist
- Storage cluster quick start
- Block device quick start
- Ceph FS quick start
- Install Ceph object gateway
- Configuring Ceph object gateway
Note that after many commands, you may see something like:
Unhandled exception in thread started by
sys.excepthook is missing
lost sys.stderr
I'm not sure what this means, but everything seems to have completed successfully, and the cluster will work.
Install Vagrant and a provider such as VirtualBox.
We'll also need the vagrant-cachier and vagrant-hostmanager plugins:
$ vagrant plugin install vagrant-cachier
$ vagrant plugin install vagrant-hostmanager
Note: with vagrant-cachier the VMs can hit apt-get permission errors involving the _apt user; the Vagrantfile includes a change to work around this, following fgrehm/vagrant-cachier#175.
Since the admin machine will need the Vagrant SSH key to log into the server machines, we need to add it to our local SSH agent:
On Mac:
$ ssh-add -K ~/.vagrant.d/insecure_private_key
On *nix:
$ ssh-add -k ~/.vagrant.d/insecure_private_key
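To confirm the key really was added to the agent, list the loaded identities; this is just a sanity check, and the fingerprint shown will differ on your machine:
$ ssh-add -l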
(The SSH agent is not used for the Jepsen setup later; the gator1/jepsen notes below copy root keys directly instead.)
This instructs Vagrant to start the VMs and install ceph-deploy on the admin machine.
$ vagrant up
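Bringing up all eight VMs takes a while. To confirm everything booted and that provisioning put ceph-deploy on the admin box, a couple of quick checks from the project directory are enough:
$ vagrant status
$ vagrant ssh ceph-admin -c "which ceph-deploy"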
We'll create a simple cluster and make sure it's healthy. Then, we'll expand it.
First, we need to get an interactive shell on the admin machine:
$ vagrant ssh ceph-admin
The ceph-deploy tool will write configuration files and logs to the current directory. So, let's create a directory for the new cluster:
vagrant@ceph-admin:~$ mkdir test-cluster && cd test-cluster
Let's prepare the machines:
vagrant@ceph-admin:~/test-cluster$ ceph-deploy new mon-1 mon-2 mon-3
Now, we have to change a default setting. For our initial cluster, we are only going to have three object storage daemons. We need to tell Ceph to allow us to achieve an active + clean state with just three Ceph OSDs. Add osd pool default size = 3 to ./ceph.conf.
Because we're dealing with multiple VMs sharing the same host, we can expect to see more clock skew. We can tell Ceph that we'd like to tolerate slightly more clock skew by adding the following setting to ceph.conf:
mon_clock_drift_allowed = 1
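Clock skew warnings are also less likely if the guests' clocks are actually being kept in sync. A quick check on one of the monitor VMs (Ubuntu guests have timedatectl; whether time sync is enabled depends on how the box was provisioned):
vagrant@ceph-admin:~/test-cluster$ ssh mon-1 timedatectl status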
Important: As of the Jewel release of Ceph, the Ceph team recommends using XFS instead of ext4. Unfortunately, I couldn't find an easy and standard way in the Vagrantfile to attach an extra volume to use for Ceph. For more information, see the Ceph docs and multinode-ceph-vagrant/#15.
In the meantime, there is a workaround if we add the following to ceph.conf:
osd max object name len = 256
osd max object namespace len = 64
After these few changes, the file should look similar to:
[global]
fsid = 7acac25d-2bd8-4911-807e-e35377e741bf
mon_initial_members = mon-1, mon-2, mon-3
mon_host = 172.21.12.12,172.21.12.13,172.21.12.14
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd pool default size = 3
mon_clock_drift_allowed = 1
osd max object name len = 256
osd max object namespace len = 64
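If you prefer to make these changes from the shell rather than an editor, appending to the generated ceph.conf does the same thing. This is just a sketch of the same four settings shown above, run from the test-cluster directory after ceph-deploy new:
vagrant@ceph-admin:~/test-cluster$ cat >> ceph.conf <<'EOF'
osd pool default size = 3
mon_clock_drift_allowed = 1
osd max object name len = 256
osd max object namespace len = 64
EOF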
We're finally ready to install!
Note here that we specify the Ceph release we'd like to install, which is kraken.
vagrant@ceph-admin:~/test-cluster$ ceph-deploy install --release=kraken ceph-admin mon-1 mon-2 mon-3 osd-1 osd-2 osd-3 ceph-client
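To double-check that the requested release actually landed on a node, ask for the version (the exact version string will vary):
vagrant@ceph-admin:~/test-cluster$ ssh mon-1 ceph --version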
Next, we deploy the initial monitors and gather the keys:
vagrant@ceph-admin:~/test-cluster$ ceph-deploy mon create-initial
And our three OSDs. For these, we need to log into the server machines directly:
vagrant@ceph-admin:~/test-cluster$ ssh osd-1 "sudo mkdir /var/local/osd0 && sudo chown ceph:ceph /var/local/osd0"
vagrant@ceph-admin:~/test-cluster$ ssh osd-2 "sudo mkdir /var/local/osd1 && sudo chown ceph:ceph /var/local/osd1"
vagrant@ceph-admin:~/test-cluster$ ssh osd-3 "sudo mkdir /var/local/osd2 && sudo chown ceph:ceph /var/local/osd2"
Now we can prepare and activate the OSDs:
vagrant@ceph-admin:~/test-cluster$ ceph-deploy osd prepare osd-1:/var/local/osd0 osd-2:/var/local/osd1 osd-3:/var/local/osd2
vagrant@ceph-admin:~/test-cluster$ ceph-deploy osd activate osd-1:/var/local/osd0 osd-2:/var/local/osd1 osd-3:/var/local/osd2
We can copy our config file and admin key to all the nodes, so each one can use the ceph CLI.
vagrant@ceph-admin:~/test-cluster$ ceph-deploy admin ceph-admin mon-1 mon-2 mon-3 osd-1 osd-2 osd-3 ceph-client
We also should make sure the keyring is readable:
vagrant@ceph-admin:~/test-cluster$ sudo chmod +r /etc/ceph/ceph.client.admin.keyring
vagrant@ceph-admin:~/test-cluster$ ssh mon-1 sudo chmod +r /etc/ceph/ceph.client.admin.keyring
vagrant@ceph-admin:~/test-cluster$ ssh mon-2 sudo chmod +r /etc/ceph/ceph.client.admin.keyring
vagrant@ceph-admin:~/test-cluster$ ssh mon-3 sudo chmod +r /etc/ceph/ceph.client.admin.keyring
Finally, check on the health of the cluster:
vagrant@ceph-admin:~/test-cluster$ ceph health
You should see something similar to this once it's healthy:
vagrant@ceph-admin:~/test-cluster$ ceph health
HEALTH_OK
vagrant@ceph-admin:~/test-cluster$ ceph -s
cluster 18197927-3d77-4064-b9be-bba972b00750
health HEALTH_OK
monmap e2: 3 mons at {mon-1=172.21.12.12:6789/0,mon-2=172.21.12.13:6789/0,mon-3=172.21.12.14:6789/0}, election epoch 6, quorum 0,1,2 mon-1,mon-2,mon-3
osdmap e9: 3 osds: 3 up, 3 in
pgmap v13: 192 pgs, 3 pools, 0 bytes data, 0 objects
12485 MB used, 64692 MB / 80568 MB avail
192 active+clean
Notice that we have three OSDs (osdmap e9: 3 osds: 3 up, 3 in) and all of the placement groups (pgs) are reporting as active+clean.
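Another way to see the three OSDs laid out by host is the OSD tree:
vagrant@ceph-admin:~/test-cluster$ ceph osd tree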
Congratulations!
To more closely model a production cluster, the upstream guide expands it at this point by adding another OSD daemon and more monitors. In our setup, all three OSDs were already prepared and activated above, and all three monitors were created by ceph-deploy mon create-initial, so the only daemon left to add is a Ceph Metadata Server.
You can watch cluster activity at any time with:
vagrant@ceph-admin:~/test-cluster$ ceph -w
It should report an active+clean state with all 3 OSDs up and in, similar to:
vagrant@ceph-admin:~/test-cluster$ ceph -w
cluster 18197927-3d77-4064-b9be-bba972b00750
health HEALTH_OK
monmap e2: 3 mons at {mon-1=172.21.12.12:6789/0,mon-2=172.21.12.13:6789/0,mon-3=172.21.12.14:6789/0}, election epoch 30, quorum 0,1,2 mon-1,mon-2,mon-3
osdmap e38: 3 osds: 3 up, 3 in
pgmap v415: 192 pgs, 3 pools, 0 bytes data, 0 objects
18752 MB used, 97014 MB / 118 GB avail
192 active+clean
Let's add the metadata server on mon-1:
vagrant@ceph-admin:~/test-cluster$ ceph-deploy mds create mon-1
Monitors are already running on mon-1, mon-2, and mon-3 (ceph-deploy mon create-initial took care of that), so there are no additional monitors to add; for extra hosts you would point the same ceph-deploy mon create command at the new hostnames.
Watch the quorum status, and ensure it's happy:
vagrant@ceph-admin:~/test-cluster$ ceph quorum_status --format json-pretty
TODO
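For a shorter, one-line view of the same quorum membership:
vagrant@ceph-admin:~/test-cluster$ ceph mon stat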
Now that we have everything set up, let's actually use the cluster. We'll use the ceph-client machine for this.
$ vagrant ssh ceph-client
vagrant@ceph-client:~$ sudo rbd create foo --size 4096 -m mon-1 --image-feature layering
vagrant@ceph-client:~$ sudo rbd map foo --pool rbd --name client.admin -m mon-1
vagrant@ceph-client:~$ sudo mkfs.ext4 -m0 /dev/rbd/rbd/foo
vagrant@ceph-client:~$ sudo mkdir /mnt/ceph-block-foo
vagrant@ceph-client:~$ sudo mount /dev/rbd/rbd/foo /mnt/ceph-block-foo
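As a quick smoke test of the mounted block device, write a file and check that the filesystem reports the usage (the file name and size here are arbitrary):
vagrant@ceph-client:~$ sudo dd if=/dev/zero of=/mnt/ceph-block-foo/testfile bs=1M count=100
vagrant@ceph-client:~$ df -h /mnt/ceph-block-foo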
TODO
TODO
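Until those sections are filled in, here is a rough sketch of what the CephFS part will probably look like, following the Ceph FS quick start. The pool names, PG counts, and mount point below are assumptions, and the admin secret has to be copied out of ceph.client.admin.keyring; the metadata server we created on mon-1 is what serves the filesystem.
vagrant@ceph-client:~$ sudo ceph osd pool create cephfs_data 64
vagrant@ceph-client:~$ sudo ceph osd pool create cephfs_metadata 64
vagrant@ceph-client:~$ sudo ceph fs new cephfs cephfs_metadata cephfs_data
vagrant@ceph-client:~$ sudo mkdir /mnt/mycephfs
vagrant@ceph-client:~$ sudo mount -t ceph mon-1:6789:/ /mnt/mycephfs -o name=admin,secret=<admin key>
The object gateway piece would start from ceph-deploy rgw create on the client, per the install guide, but it is left as a TODO here.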
After the rbd steps are done, run git clone on gator1/jepsen.git so that there is a jepsen directory under /home/vagrant.
Jepsen needs root SSH access to the OSD nodes; the ceph-client node acts as the Jepsen control node.
This could be automated, but I did it manually.
First, set a root password (I don't know the VMs' root password; they are Ubuntu boxes).
On ceph-client and osd-1 through osd-3, run 'sudo passwd root' and use 'root' as the password.
We also need to enable root SSH access on each of those nodes: set PermitRootLogin yes in /etc/ssh/sshd_config, then restart the daemon:
sudo systemctl restart sshd.service
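If you'd rather script that change than edit the file by hand on each node, a sed one-liner along these lines should work with GNU sed (it rewrites any existing PermitRootLogin line; double-check the file afterwards):
sudo sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin yes/' /etc/ssh/sshd_config
sudo systemctl restart sshd.service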
On ceph-client, run su root to log in as root.
Then run ssh-keygen -t rsa -N "" to generate a key pair for root; the files id_rsa and id_rsa.pub are created in /root/.ssh.
ssh-copy-id root@osd-1
ssh-copy-id root@osd-2
ssh-copy-id root@osd-3
Then, from /root/.ssh, copy the private key to the OSD nodes:
scp id_rsa root@osd-1:/root/.ssh
scp id_rsa root@osd-2:/root/.ssh
scp id_rsa root@osd-3:/root/.ssh
Finally, rebuild known_hosts so SSH doesn't prompt about unknown host keys:
rm known_hosts
ssh-keyscan -t rsa osd-1 >> ~/.ssh/known_hosts
ssh-keyscan -t rsa osd-2 >> ~/.ssh/known_hosts
ssh-keyscan -t rsa osd-3 >> ~/.ssh/known_hosts
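Before running any tests, it's worth confirming from ceph-client (as root) that passwordless SSH to each OSD node really works; a quick loop like this should print each hostname without prompting:
for n in osd-1 osd-2 osd-3; do ssh root@$n hostname; done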
Jepsen seems to hard-code its iptables rules to eth0, but the VMs use different interface names, so those rules may need adjusting.
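To see which interface names the VMs actually use, so the iptables rules can be pointed at the right device, check one of the nodes from the control node:
ssh root@osd-1 ip -o link show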
When you're all done, tell Vagrant to destroy the VMs.
$ vagrant destroy -f