All you need to know about the CRUSH Map
We look into how to set up a good CRUSH map and handle the placement and device classes of the OSDs in your clusters. The CRUSH map describes where your data lives, on what type of device, how much each device should hold, and what your failure domain and safety requirements are.
One thing we will do a lot throughout this demonstration is view the tree of our CRUSH hierarchy, which can be done with the command
sudo ceph osd tree
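On a small hypothetical cluster with one host and two OSDs the output could look roughly like this (the IDs, names and weights below are made up purely for illustration):
ID  CLASS  WEIGHT   TYPE NAME        STATUS  REWEIGHT  PRI-AFF
-1         1.81940  root default
-3         1.81940      host host1
 0    hdd  0.90970          osd.0        up   1.00000  1.00000
 1    ssd  0.90970          osd.1        up   1.00000  1.00000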
The different bucket types available are osd, host, chassis, rack, row, pdu (power distribution unit), pod, room, datacenter, zone, region and root.
To change a running cluster we can add, move and remove buckets at the different levels of the hierarchy.
sudo ceph osd crush add-bucket rack1 rack
sudo ceph osd crush move host1 datacenter=dc1 room=room1 row=row1 rack=rack1
sudo ceph osd crush remove rack1
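If you would rather inspect the whole map at once than poke at individual buckets, the binary CRUSH map can also be exported and decompiled into a readable text file; this is just a sketch that assumes crushtool is installed and uses arbitrary file names:
sudo ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt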
Adding a line to the ceph.conf file in order to place your OSDs on creation is an easy way to manage deployments of larger clusters.
crush location hook = /var/lib/ceph/customized-ceph-crush-location
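As a minimal sketch, assuming the hook should apply to every OSD on the node, the line goes under the [osd] section (or [global]) of ceph.conf:
[osd]
crush location hook = /var/lib/ceph/customized-ceph-crush-location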
We create this file and make it executable with
sudo vi /var/lib/ceph/customized-ceph-crush-location
sudo chmod +x /var/lib/ceph/customized-ceph-crush-location
The contents of the file set the host and rack by reading the hostname and the file /etc/rack.
#!/bin/sh
echo "host=$(hostname -s) rack=$(cat /etc/rack) root=default"
Another thing you can do is move OSDs around. With the commands below you can set the exact position of an OSD in the hierarchy, reweight its importance or remove it from the CRUSH map altogether.
sudo ceph osd crush set osd.0 1.0 root=default datacenter=dc1 room=room1 row=foo rack=bar host=foo-bar-1
sudo ceph osd crush reweight osd.0 0.122
sudo ceph osd crush remove osd.0
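To see how the new weights translate into actual placement and utilization, the per-OSD weight and usage can be checked with
sudo ceph osd df tree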
Another parameter we can work with is the device class, which is more of a hardware-level property. In some cases the class is not detected yet, so we need to set it ourselves. Usually it is set on OSD startup to either hdd, ssd or nvme.
sudo ceph osd crush set-device-class nvme osd.0
sudo ceph osd crush rm-device-class osd.0
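Note that an OSD that already has a class must have it removed before a new one can be set. To list the classes present in the map and the OSDs that carry a given class (nvme is used here as the example):
sudo ceph osd crush class ls
sudo ceph osd crush class ls-osd nvme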
We can also list the rules available to apply to our CRUSH map
sudo ceph osd crush rule ls
Here we create two rules: one to split replicas over different racks, and another to put all of the content of a pool on a specific class of device, split over hosts.
sudo ceph osd crush rule create-replicated rack_split default rack
sudo ceph osd crush rule create-replicated nvme_drives default host nvme
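To inspect what a rule actually does once it has been created:
sudo ceph osd crush rule dump rack_split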
We can also remove rules we don't use with
sudo ceph osd crush rule rm [rule-name]
Applying a rule to a pool is done with this command
sudo ceph osd pool set <pool-name> crush_rule nvme_drives
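And to verify which rule a pool currently uses:
sudo ceph osd pool get <pool-name> crush_rule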
Removing an OSD from the cluster
sudo systemctl stop ceph-osd@[id]
sudo ceph osd purge [id] --yes-i-really-mean-it
sudo ceph-volume lvm zap [device_path] --destroy
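Depending on your workflow you may want to mark the OSD out first and let the data migrate away before stopping and purging it; a minimal sketch:
sudo ceph osd out [id]
# wait for rebalancing to finish, then confirm the OSD can be removed safely
sudo ceph osd safe-to-destroy [id]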
Adding an OSD to the cluster
sudo ceph-volume lvm prepare --data [device_path]
sudo ceph-volume lvm activate [id] [uuid]
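If you don't know the id and uuid of the newly prepared OSD, ceph-volume can list them; the prepare and activate steps can also be combined into one:
sudo ceph-volume lvm list
sudo ceph-volume lvm create --data [device_path]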