From 61b75d7980e9251941f3eaf735eb147615392cad Mon Sep 17 00:00:00 2001
From: Francesco Pantano
Date: Mon, 13 May 2024 17:21:13 +0200
Subject: [PATCH] Rework Ceph RBD migration documentation

This patch represents a rework of the current RBD documentation to move it
from a POC to a procedure that we can test in CI. In particular:

- the procedure is split between Ceph Mgr and Ceph Mons migration
- Ceph MGR and Mon docs are more similar to procedures that the user should follow
- the order is fixed as rbd should be last

Signed-off-by: Francesco Pantano
---
 .../assembly_migrating-ceph-rbd.adoc          |  14 +-
 docs_user/main.adoc                           |   8 +-
 ...c_migrating-mgr-from-controller-nodes.adoc | 100 +++++
 ...c_migrating-mon-from-controller-nodes.adoc | 411 ++++++++++++++++++
 4 files changed, 527 insertions(+), 6 deletions(-)
 create mode 100644 docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc
 create mode 100644 docs_user/modules/proc_migrating-mon-from-controller-nodes.adoc

diff --git a/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc b/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
index 9ddf940d7..7269d405f 100644
--- a/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
+++ b/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
@@ -11,6 +11,16 @@ To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must
 * {Ceph} is running version 6 or later and is managed by cephadm/orchestrator.
 * NFS (ganesha) is migrated from a {OpenStackPreviousInstaller}-based deployment to cephadm. For more information, see xref:creating-a-ceph-nfs-cluster_migrating-databases[Creating a NFS Ganesha cluster].
 * Both the {Ceph} public and cluster networks are propagated, with {OpenStackPreviousInstaller}, to the target nodes.
-* Ceph Monitors need to keep their IPs to avoid cold migration.
+* Ceph MDS, Ceph Monitoring stack, Ceph RGW, and other services have already been migrated to the target nodes.
+ifeval::["{build}" != "upstream"]
+* The daemon distribution follows the cardinality constraints described in the doc link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage: Supported configurations]
+endif::[]
+* The Ceph cluster is healthy, and the `ceph -s` command returns `HEALTH_OK`
+* The procedure keeps the mon IP addresses by moving them to the {Ceph} nodes
+* Drain the existing Controller nodes
+* Deploy additional monitors to the existing nodes, and promote them as
+_admin nodes that administrators can use to manage the {CephCluster} cluster and perform day 2 operations against it.
 
-include::../modules/proc_migrating-mon-and-mgr-from-controller-nodes.adoc[leveloffset=+1]
+include::../modules/proc_migrating-mgr-from-controller-nodes.adoc[leveloffset=+1]
+
+include::../modules/proc_migrating-mon-from-controller-nodes.adoc[leveloffset=+1]
diff --git a/docs_user/main.adoc b/docs_user/main.adoc
index ac3dd21c1..4c7652b04 100644
--- a/docs_user/main.adoc
+++ b/docs_user/main.adoc
@@ -22,12 +22,12 @@ include::assemblies/assembly_adopting-openstack-control-plane-services.adoc[leve
 
 include::assemblies/assembly_adopting-the-data-plane.adoc[leveloffset=+1]
 
-include::assemblies/assembly_migrating-ceph-rbd.adoc[leveloffset=+1]
+include::assemblies/assembly_migrating-the-object-storage-service.adoc[leveloffset=+1]
 
-include::assemblies/assembly_migrating-ceph-rgw.adoc[leveloffset=+1]
+include::assemblies/assembly_migrating-ceph-monitoring-stack.adoc[leveloffset=+1]
 
 include::modules/proc_migrating-ceph-mds.adoc[leveloffset=+1]
 
-include::assemblies/assembly_migrating-ceph-monitoring-stack.adoc[leveloffset=+1]
+include::assemblies/assembly_migrating-ceph-rgw.adoc[leveloffset=+1]
 
-include::assemblies/assembly_migrating-the-object-storage-service.adoc[leveloffset=+1]
+include::assemblies/assembly_migrating-ceph-rbd.adoc[leveloffset=+1]
diff --git a/docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc b/docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc
new file mode 100644
index 000000000..0941d623b
--- /dev/null
+++ b/docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc
@@ -0,0 +1,100 @@
+[id="migrating-mgr-from-controller-nodes_{context}"]
+
+= Migrating Ceph Mgr daemons to {Ceph} nodes
+
+The following section describes how to move Ceph Mgr daemons from the
+OpenStack controller nodes to a set of target nodes. Target nodes might be
+pre-existing {Ceph} nodes, or OpenStack Compute nodes if Ceph is deployed by
+{OpenStackPreviousInstaller} with an HCI topology.
+
+.Prerequisites
+
+Configure the target nodes (CephStorage or ComputeHCI) to have both `storage`
+and `storage_mgmt` networks to ensure that you can use both {Ceph} public and
+cluster networks from the same node. This step requires you to interact with
+{OpenStackPreviousInstaller}. From {rhos_prev_long} {rhos_prev_ver} and later
+you do not have to run a stack update.
+
+.Procedure
+
+This procedure assumes that cephadm and the orchestrator are the tools that
+drive the Ceph Mgr migration. As done with the other Ceph daemons (MDS,
+Monitoring and RGW), the procedure uses the Ceph spec to modify the placement
+and reschedule the daemons. Ceph Mgr runs in an active/passive fashion, and
+it is also responsible for providing many modules, including the orchestrator.
+
+. Before starting the migration, SSH into the target node and enable the firewall
+rules required to reach a Mgr service.
++
+[source,bash]
+----
+dports="6800:7300"
+ssh heat-admin@ sudo iptables -I INPUT \
+    -p tcp --match multiport --dports $dports -j ACCEPT;
+----
+
+[NOTE]
+Repeat the previous action for each target_node.
+
+. Check that the rules are properly applied and persist them:
++
+[source,bash]
+----
+sudo iptables-save
+sudo systemctl restart iptables
+----
+
+. Prepare the target node to host the new Ceph Mgr daemon, and add the `mgr`
+label to the target node:
++
+[source,bash]
+----
+ceph orch host label add mgr
+----
+
+- Replace with the hostname of the hosts listed in the {Ceph}
+  through the `ceph orch host ls` command.
+
+Repeat this action for each node that will host a Ceph Mgr daemon.
+
+Get the Ceph Mgr spec and update the `placement` section to use `label` as the
+main scheduling strategy.
+
+. Get the Ceph Mgr spec:
++
+[source,bash]
+----
+sudo cephadm shell -- ceph orch ls --export mgr > mgr.yaml
+----
+
+. Edit the retrieved spec and add the `label: mgr` section:
++
+[source,yaml]
+----
+service_type: mgr
+service_id: mgr
+placement:
+  label: mgr
+----
+
+. Save the spec in `/tmp/mgr.yaml`
+. Apply the spec with cephadm using the orchestrator:
++
+----
+sudo cephadm shell -m /tmp/mgr.yaml -- ceph orch apply -i /mnt/mgr.yaml
+----
+
+The Ceph Mgr daemon count grows to match the number of nodes that carry the
+`mgr` label.
+
+. Verify that new Ceph Mgr daemons have been created on the target nodes:
++
+----
+ceph orch ps | grep -i mgr
+ceph -s
+----
++
+[NOTE]
+The procedure does not shrink the Ceph Mgr daemons: the count is grown by the
+number of target nodes, and the xref:migrating-mon-from-controller-nodes[Ceph Mon migration procedure]
+will decommission the stand-by Ceph Mgr instances.
diff --git a/docs_user/modules/proc_migrating-mon-from-controller-nodes.adoc b/docs_user/modules/proc_migrating-mon-from-controller-nodes.adoc
new file mode 100644
index 000000000..ff768b33f
--- /dev/null
+++ b/docs_user/modules/proc_migrating-mon-from-controller-nodes.adoc
@@ -0,0 +1,411 @@
+[id="migrating-mon-from-controller-nodes_{context}"]
+
+= Migrating Ceph Monitor daemons to {Ceph} nodes
+
+The following section describes how to move Ceph Monitor daemons from the
+OpenStack controller nodes to a set of target nodes. Target nodes might be
+pre-existing {Ceph} nodes, or OpenStack Compute nodes if Ceph is deployed by
+{OpenStackPreviousInstaller} with an HCI topology.
+
+.Prerequisites
+
+Configure the target nodes (CephStorage or ComputeHCI) to have both `storage`
+and `storage_mgmt` networks to ensure that you can use both {Ceph} public and
+cluster networks from the same node. This step requires you to interact with
+{OpenStackPreviousInstaller}. From {rhos_prev_long} {rhos_prev_ver} and later
+you do not have to run a stack update. However, there are commands that you
+must perform to run `os-net-config` on the bare metal node and configure
+additional networks.
+
+.. If target nodes are `CephStorage`, ensure that the network is defined in the
+`metalsmith.yaml` for the CephStorageNodes:
++
+[source,yaml]
+----
+  - name: CephStorage
+    count: 2
+    instances:
+      - hostname: oc0-ceph-0
+        name: oc0-ceph-0
+      - hostname: oc0-ceph-1
+        name: oc0-ceph-1
+    defaults:
+      networks:
+        - network: ctlplane
+          vif: true
+        - network: storage_cloud_0
+          subnet: storage_cloud_0_subnet
+        - network: storage_mgmt_cloud_0
+          subnet: storage_mgmt_cloud_0_subnet
+      network_config:
+        template: templates/single_nic_vlans/single_nic_vlans_storage.j2
+----
+
+.. Run the following command:
++
+----
+openstack overcloud node provision \
+  -o overcloud-baremetal-deployed-0.yaml --stack overcloud-0 \
+  --network-config -y --concurrency 2 /home/stack/metalsmith-0.yam
+----
+
+.. Verify that the storage network is configured on the target nodes:
++
+----
+(undercloud) [stack@undercloud ~]$ ssh heat-admin@192.168.24.14 ip -o -4 a
+1: lo inet 127.0.0.1/8 scope host lo\ valid_lft forever preferred_lft forever
+5: br-storage inet 192.168.24.14/24 brd 192.168.24.255 scope global br-storage\ valid_lft forever preferred_lft forever
+6: vlan1 inet 192.168.24.14/24 brd 192.168.24.255 scope global vlan1\ valid_lft forever preferred_lft forever
+7: vlan11 inet 172.16.11.172/24 brd 172.16.11.255 scope global vlan11\ valid_lft forever preferred_lft forever
+8: vlan12 inet 172.16.12.46/24 brd 172.16.12.255 scope global vlan12\ valid_lft forever preferred_lft forever
+----
+
+.Procedure
+
+This procedure assumes that some of the steps are run on the source node that
+we want to decommission, while other steps are run on the target node that is
+supposed to host the redeployed daemon.
+
+. Before starting the migration, SSH into the target node and enable the firewall
+rules required to reach a Mon service:
++
+----
+for port in 3300 6789; do
+    ssh heat-admin@ sudo iptables -I INPUT \
+    -p tcp -m tcp --dport $port -m conntrack --ctstate NEW \
+    -j ACCEPT;
+done
+----
+
+- Replace with the hostname of the node that is supposed to host
+the new mon
+
+. Check that the rules are properly applied and persist them:
++
+----
+sudo iptables-save
+sudo systemctl restart iptables
+----
+
+. To migrate the existing Mons to the target {Ceph} nodes, the first step is to
+create the following {Ceph} spec from the first mon (or the first controller)
+and modify the placement based on the appropriate label.
++
+[source,yaml]
+----
+service_type: mon
+service_id: mon
+placement:
+  label: mon
+----
+
+. Save the spec in /tmp/mon.yaml
+. Apply the spec with cephadm using the orchestrator:
++
+----
+sudo cephadm shell -m /tmp/mon.yaml
+ceph orch apply -i /mnt/mon.yaml
+----
++
+[NOTE]
+The effect of applying the `mon.yaml` spec is to normalize the existing placement
+strategy to use `labels` instead of `hosts`. By doing this any node with the `mon`
+label is able to host a Ceph mon daemon.
+Run this step only once. Do not repeat it when you iterate over this procedure
+to migrate multiple Ceph Mons.
+
+. Before moving to the next step, check the status of the {CephCluster} and the
+orchestrator daemons list: make sure the three mons are in quorum and listed by
+the `ceph orch` command:
++
+----
+# ceph -s
+  cluster:
+    id:     f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
+    health: HEALTH_OK
+
+  services:
+    mon: 3 daemons, quorum oc0-controller-0,oc0-controller-1,oc0-controller-2 (age 19m)
+    mgr: oc0-controller-0.xzgtvo(active, since 32m), standbys: oc0-controller-1.mtxohd, oc0-controller-2.ahrgsk
+    osd: 8 osds: 8 up (since 12m), 8 in (since 18m); 1 remapped pgs
+
+  data:
+    pools:   1 pools, 1 pgs
+    objects: 0 objects, 0 B
+    usage:   43 MiB used, 400 GiB / 400 GiB avail
+    pgs:     1 active+clean
+----
++
+----
+[ceph: root@oc0-controller-0 /]# ceph orch host ls
+HOST              ADDR           LABELS          STATUS
+oc0-ceph-0        192.168.24.14  osd
+oc0-ceph-1        192.168.24.7   osd
+oc0-controller-0  192.168.24.15  _admin mgr mon
+oc0-controller-1  192.168.24.23  _admin mgr mon
+oc0-controller-2  192.168.24.13  _admin mgr mon
+----
+
+. On the source node, back up the `/etc/ceph/` directory. If issues occur, the
+backup lets you run cephadm and get a shell to the Ceph cluster from the
+source node:
++
+----
+mkdir -p $HOME/ceph_client_backup
+sudo cp -R /etc/ceph $HOME/ceph_client_backup
+----
+
+. Before draining the source node and relocating the IP address of the storage
+network to the target node, fail the mgr if it is active on the source node:
++
+----
+ceph mgr fail
+----
+
+. Drain the source node and start the mon migration. From the cephadm shell,
+remove the labels on the source node:
++
+----
+    for label in mon mgr _admin; do
+        ceph orch host label rm $label;
+    done
+----
+
+. Remove the running mon daemon from the source node:
++
+----
+cephadm shell -- ceph orch daemon rm mon. --force
+----
+
+. Run the drain command:
++
+----
+cephadm shell -- ceph orch host drain
+----
+
+. Remove the source_node host from the {CephCluster}:
++
+----
+cephadm shell -- ceph orch host rm --force
+----
+
+- Replace with the hostname of the source node.
+
+The source node is not part of the cluster anymore, and shouldn't appear in the
+Ceph host list when running `cephadm shell -- ceph orch host ls`.
+However, a `sudo podman ps` in the source node might list both mon and mgr still
+running.
+
+----
+[root@oc0-controller-1 ~]# sudo podman ps
+CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
+ifeval::["{build}" != "downstream"]
+5c1ad36472bc quay.io/ceph/daemon@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4 -n mon.oc0-contro... 35 minutes ago Up 35 minutes ago ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mon-oc0-controller-1
+3b14cc7bf4dd quay.io/ceph/daemon@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4 -n mgr.oc0-contro... 35 minutes ago Up 35 minutes ago ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mgr-oc0-controller-1-mtxohd
+endif::[]
+ifeval::["{build}" == "downstream"]
+5c1ad36472bc registry.redhat.io/ceph/rhceph@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4 -n mon.oc0-contro... 35 minutes ago Up 35 minutes ago ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mon-oc0-controller-1
+3b14cc7bf4dd registry.redhat.io/ceph/rhceph@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4 -n mgr.oc0-contro... 35 minutes ago Up 35 minutes ago ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mgr-oc0-controller-1-mtxohd
+endif::[]
+----
+
+To clean up the source node before moving to the next phase, remove the existing
+containers and the cephadm related data from the node.
+// fpantano: there's an automated procedure run through cephadm but it's too
+// risky. If the user doesn't perform it properly the cluster can be affected.
+// We can put a downstream comment to contact the RH support to clean the source
+// node up in case of leftovers, and open a bug for cephadm.
+
+//. ssh into one of the existing Ceph mons (usually controller-1 or controller-2)
+. Prepare the target node to host the new mon and add the `mon` label to the
+target node:
++
+----
+for label in mon mgr _admin; do
+    ceph orch host label add $label;
+done
+----
+
+- Replace with the hostname of the host listed in the {CephCluster}
+  through the `ceph orch host ls` command.
+
+At this point the cluster is running with only two mons, but a third mon appears
+and will be deployed on the target node.
+However, the third mon might be deployed on a different IP address available in
+the node, and you need to redeploy it when the IP migration is concluded.
+Even though the mon is deployed on the wrong IP address, it is useful to keep the
+quorum at three: it ensures that you do not risk losing the cluster because two
+mons go into split brain.
+
+. Confirm that the cluster has three mons and that they are in quorum:
++
+----
+cephadm shell -- ceph -s
+cephadm shell -- ceph orch ps | grep -i mon
+----
+
+It's now possible to migrate the original mon IP address to the target node and
+redeploy the existing mon on it.
+The following IP address migration procedure assumes that the target nodes have
+been originally deployed by {OpenStackPreviousInstaller} and the network configuration
+is managed by `os-net-config`.
+
+// NOTE (fpantano): we need to document the same ip address migration procedure
+// w/ an EDPM node that has already been adopted.
+
+. Get the mon IP address from the existing `/etc/ceph/ceph.conf` (check the `mon_host`
+line), for example:
++
+----
+mon_host = [v2:172.17.3.60:3300/0,v1:172.17.3.60:6789/0] [v2:172.17.3.29:3300/0,v1:172.17.3.29:6789/0] [v2:172.17.3.53:3300/0,v1:172.17.3.53:6789/0]
+----
+
+. Confirm that the mon IP address is present in the source node `os-net-config`
+configuration located in `/etc/os-net-config`:
++
+----
+
+[tripleo-admin@controller-0 ~]$ grep "172.17.3.60" /etc/os-net-config/config.yaml
+    - ip_netmask: 172.17.3.60/24
+----
+
+. Edit the config file `/etc/os-net-config/config.yaml` and remove the
+`ip_netmask` line retrieved in the previous step.
+
+. Save the file and refresh the node network configuration with the following
+command:
++
+----
+sudo os-net-config -c /etc/os-net-config/config.yaml
+----
+
+. Verify, using the `ip` command, that the IP address is not present in the source
+node anymore.
+
+SSH into the target node, for example `cephstorage-0`, and add the IP address
+that will be bound to the new mon.
+
+. On the target node, edit the config file `/etc/os-net-config/config.yaml` and
+add the `- ip_netmask: 172.17.3.60/24` line removed from the source node.
+
+. Save the file and refresh the node network configuration with the following
+command:
++
+----
+sudo os-net-config -c /etc/os-net-config/config.yaml
+----
+
+. Verify, using the `ip` command, that the IP address is present in the target
+node.
+
+Get the Ceph spec and set the mon daemons to `unmanaged`:
+
+. Get the ceph mon spec:
++
+----
+ceph orch ls --export mon > mon.yaml
+----
+
+. Edit the retrieved spec and add the `unmanaged: true` keyword:
++
+[source,yaml]
+----
+service_type: mon
+service_id: mon
+placement:
+  label: mon
+unmanaged: true
+----
+
+. Save the spec in /tmp/mon.yaml
+. Apply the spec with cephadm using the orchestrator:
++
+----
+sudo cephadm shell -m /tmp/mon.yaml
+ceph orch apply -i /mnt/mon.yaml
+----
+
+The mon daemons are marked as `unmanaged`, and it's now possible to redeploy
+the existing daemon and bind it to the migrated IP address.
+
+. Delete the existing mon on the target node:
++
+----
+$ ceph orch daemon rm mon. --force
+----
+
+. Redeploy the new mon on the target using the old IP address:
++
+----
+$ ceph orch daemon add mon :
+----
+
+- Replace with the hostname of the target node enrolled in the Ceph
+cluster
+- Replace with the ip address of the migrated address
+
+Get the Ceph spec and set the mon daemons to `unmanaged: false`:
+
+. Get the ceph mon spec:
++
+----
+ceph orch ls --export mon > mon.yaml
+----
+
+. Edit the retrieved spec and set the `unmanaged` keyword to `false`:
++
+[source,yaml]
+----
+service_type: mon
+service_id: mon
+placement:
+  label: mon
+unmanaged: false
+----
+
+. Save the spec in /tmp/mon.yaml
+. Apply the spec with cephadm using the orchestrator:
++
+----
+sudo cephadm shell -m /tmp/mon.yaml
+ceph orch apply -i /mnt/mon.yaml
+----
+
+The new mon runs on the target node with the original IP address.
+As the last step of the mon migration, you need to refresh the cephadm information
+and reconfigure the existing daemons to exchange the map with the updated mon
+references.
+
+. Identify the running `mgr`:
++
+----
+sudo cephadm shell -- ceph -s
+----
+
+. Refresh the mgr information by force-failing it:
++
+----
+ceph mgr fail
+----
+
+. Refresh the `OSD` information:
++
+----
+ceph orch reconfig osd.default_drive_group
+----
+
++
+Verify that at this point the {CephCluster} cluster is healthy:
++
+----
+[ceph: root@oc0-controller-0 specs]# ceph -s
+  cluster:
+    id:     f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
+    health: HEALTH_OK
+...
+...
+----
+
+. Repeat the procedure described in this section for any additional Controller
+node hosting a mon until you have migrated all the Ceph Mon daemons to the
+target nodes.
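The `mon_host` lookup in the Mon procedure (reading `/etc/ceph/ceph.conf` to find the address to migrate) could also be scripted. The following is a minimal sketch and is not part of this patch: the `extract_mon_ips` function name and the temporary sample file are hypothetical, and it assumes IPv4 addresses in the `[v2:IP:PORT/0,v1:IP:PORT/0]` form shown in the `mon_host` example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the patch): print the unique IPv4 monitor
# addresses found on the mon_host line of a ceph.conf-style file.
extract_mon_ips() {
    # $1: path to the configuration file to inspect
    grep -E '^[[:space:]]*mon_host[[:space:]]*=' "$1" \
        | grep -oE 'v[12]:[0-9]+(\.[0-9]+){3}' \
        | cut -d: -f2 \
        | awk '!seen[$0]++'   # drop duplicates, keep first-seen order
}

# Demo against a temporary copy of the mon_host example from the procedure,
# so the sketch does not depend on a real /etc/ceph/ceph.conf.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
mon_host = [v2:172.17.3.60:3300/0,v1:172.17.3.60:6789/0] [v2:172.17.3.29:3300/0,v1:172.17.3.29:6789/0] [v2:172.17.3.53:3300/0,v1:172.17.3.53:6789/0]
EOF
extract_mon_ips "$sample"
rm -f "$sample"
```

On a source node the function would be pointed at `/etc/ceph/ceph.conf` instead of the sample file; each printed address can then be checked against `/etc/os-net-config/config.yaml` with `grep`, as in the procedure.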