From a546c9cd96da425c44a030d2202bcd9e4117b428 Mon Sep 17 00:00:00 2001
From: Francesco Pantano
Date: Fri, 17 May 2024 22:45:03 +0200
Subject: [PATCH] Reorganize Ceph assemblies

Ceph assemblies are reorganized to follow a simple structure. A main
Ceph cluster migration assembly is included in main.adoc, and it contains a
quick intro and the ordered list of procedures (including the cardinality
section, which is critical here and will be improved in a follow-up patch).
This way the Ceph documentation is easier to access and maintain.
There are also fixes to wrong references (e.g. Horizon != Ceph Dashboard).

Signed-off-by: Francesco Pantano
---
 .../assembly_migrating-ceph-cluster.adoc      |  44 ++++
 ...embly_migrating-ceph-monitoring-stack.adoc |  32 +--
 .../assembly_migrating-ceph-rbd.adoc          |  41 +++-
 .../assembly_migrating-ceph-rgw.adoc          |   2 -
 docs_user/assemblies/ceph_migration.adoc      |  15 --
 docs_user/main.adoc                           |   8 +-
 .../modules/con_ceph-daemon-cardinality.adoc  |  26 ++-
 ...c_migrating-mgr-from-controller-nodes.adoc |  80 +++----
 ...c_migrating-mon-from-controller-nodes.adoc | 216 +++++++++---------
 9 files changed, 247 insertions(+), 217 deletions(-)
 create mode 100644 docs_user/assemblies/assembly_migrating-ceph-cluster.adoc
 delete mode 100644 docs_user/assemblies/ceph_migration.adoc

diff --git a/docs_user/assemblies/assembly_migrating-ceph-cluster.adoc b/docs_user/assemblies/assembly_migrating-ceph-cluster.adoc
new file mode 100644
index 000000000..5e5708396
--- /dev/null
+++ b/docs_user/assemblies/assembly_migrating-ceph-cluster.adoc
@@ -0,0 +1,44 @@
+ifdef::context[:parent-context: {context}]
+
+[id="ceph-migration_{context}"]
+
+= Migrating the {CephCluster} cluster
+
+:context: migrating-ceph
+
+:toc: left
+:toclevels: 3
+
+In the context of data plane adoption, where the {rhos_prev_long}
+({OpenStackShort}) services are redeployed in {OpenShift}, you migrate a
+{OpenStackPreviousInstaller}-deployed {CephCluster} cluster by using a process
+called “externalizing” the {CephCluster} cluster.
+
+There are two deployment topologies that include an internal {CephCluster}
+cluster:
+
+* {OpenStackShort} includes dedicated {CephCluster} nodes to host object
+  storage daemons (OSDs)
+
+* Hyperconverged Infrastructure (HCI), where Compute and Storage services are
+  colocated on hyperconverged nodes
+
+In either scenario, there are some {Ceph} processes that are deployed on
+{OpenStackShort} Controller nodes: {Ceph} monitors, Ceph Object Gateway (RGW),
+Rados Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS
+Ganesha. To migrate your {CephCluster} cluster, you must decommission the
+Controller nodes and move the {Ceph} daemons to a set of target nodes that are
+already part of the {CephCluster} cluster.
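+
+Before you start, it can help to see which {Ceph} daemons currently run on the
+Controller nodes. The following is a minimal sketch, not part of the formal
+procedure; `controller-0` is an example hostname:
+
+----
+# List every daemon that cephadm manages, then filter by a Controller host
+$ sudo cephadm shell -- ceph orch ps
+$ sudo cephadm shell -- ceph orch ps | grep controller-0
+----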
+
+include::../modules/con_ceph-daemon-cardinality.adoc[leveloffset=+1]
+
+include::assembly_migrating-ceph-monitoring-stack.adoc[leveloffset=+1]
+
+include::../modules/proc_migrating-ceph-mds.adoc[leveloffset=+1]
+
+include::assembly_migrating-ceph-rgw.adoc[leveloffset=+1]
+
+include::assembly_migrating-ceph-rbd.adoc[leveloffset=+1]
+
+ifdef::parent-context[:context: {parent-context}]
+ifndef::parent-context[:!context:]
diff --git a/docs_user/assemblies/assembly_migrating-ceph-monitoring-stack.adoc b/docs_user/assemblies/assembly_migrating-ceph-monitoring-stack.adoc
index fad8e444e..7ea9e697e 100644
--- a/docs_user/assemblies/assembly_migrating-ceph-monitoring-stack.adoc
+++ b/docs_user/assemblies/assembly_migrating-ceph-monitoring-stack.adoc
@@ -4,13 +4,10 @@
 
 = Migrating the monitoring stack component to new nodes within an existing {Ceph} cluster
 
-In the context of data plane adoption, where the {rhos_prev_long} ({OpenStackShort}) services are
-redeployed in {OpenShift}, a {OpenStackPreviousInstaller}-deployed {CephCluster} cluster will undergo a migration in a process we are calling “externalizing” the {CephCluster} cluster.
-There are two deployment topologies, broadly, that include an “internal” {CephCluster} cluster today: one is where {OpenStackShort} includes dedicated {CephCluster} nodes to host object storage daemons (OSDs), and the other is Hyperconverged Infrastructure (HCI) where Compute nodes
-double up as {CephCluster} nodes. In either scenario, there are some {Ceph} processes that are deployed on {OpenStackShort} Controller nodes: {Ceph} monitors, Ceph Object Gateway (RGW), Rados Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS Ganesha.
 The Ceph Dashboard module adds web-based monitoring and administration to the Ceph Manager.
-With {OpenStackPreviousInstaller}-deployed {Ceph} this component is enabled as part of the overcloud deploy and it’s composed by:
+With {OpenStackPreviousInstaller}-deployed {Ceph}, this component is enabled as
+part of the overcloud deploy and it is composed of the following:
 
 - Ceph Manager module
 - Grafana
@@ -18,16 +15,21 @@ With {OpenStackPreviousInstaller}-deployed {Ceph} this component is enabled as p
 - Alertmanager
 - Node exporter
 
-The Ceph Dashboard containers are included through `tripleo-container-image-prepare` parameters and the high availability relies on `Haproxy` and `Pacemaker` deployed on the {OpenStackShort} front.
-For an external {CephCluster} cluster, high availability is not supported.
-The goal of this procedure is to migrate and relocate the Ceph Monitoring
-components to free Controller nodes.
-
-For this procedure, we assume that we are beginning with a {OpenStackShort} based on {rhos_prev_ver} and a {Ceph} {CephRelease} deployment managed by {OpenStackPreviousInstaller}.
-We assume that:
-
-* {Ceph} has been upgraded to {CephRelease} and is managed by cephadm/orchestrator
-* Both the {Ceph} public and cluster networks are propagated, through{OpenStackPreviousInstaller}, to the target nodes
+The Ceph Dashboard containers are included through
+`tripleo-container-image-prepare` parameters, and the high availability relies
+on `Haproxy` and `Pacemaker` deployed on the {OpenStackShort} front. For an
+external {CephCluster} cluster, high availability is not supported. The goal of
+this procedure is to migrate and relocate the Ceph Monitoring components to
+free Controller nodes.
+
+For this procedure, we assume that we are beginning with a {OpenStackShort}
+deployment based on {rhos_prev_ver} and a {Ceph} {CephRelease} deployment
+managed by {OpenStackPreviousInstaller}. We assume that:
+
+* {Ceph} has been upgraded to {CephRelease} and is managed by
+  cephadm/orchestrator
+* Both the {Ceph} public and cluster networks are propagated, through
+  {OpenStackPreviousInstaller}, to the target nodes
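+
+To confirm the two assumptions above before you begin, you can check that
+cephadm manages the cluster and that all daemons run the expected release.
+This is a quick sketch, not part of the formal procedure:
+
+----
+# The orchestrator backend should report "cephadm" and be available
+$ sudo cephadm shell -- ceph orch status
+# All daemons should report the same release
+$ sudo cephadm shell -- ceph versions
+----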
 
 include::../modules/proc_completing-prerequisites-for-migrating-ceph-monitoring-stack.adoc[leveloffset=+1]
 
diff --git a/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc b/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
index 7269d405f..b273d1348 100644
--- a/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
+++ b/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
@@ -4,22 +4,39 @@
 
 = Migrating Red Hat Ceph Storage RBD to external RHEL nodes
 
-For hyperconverged infrastructure (HCI) or dedicated Storage nodes that are running {Ceph} version 6 or later, you must migrate the daemons that are included in the {rhos_prev_long} control plane into the existing external Red Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include the Compute nodes for an HCI environment or dedicated storage nodes.
+For Hyperconverged Infrastructure (HCI) or dedicated Storage nodes that are
+running {Ceph} version 6 or later, you must migrate the daemons that are
+included in the {rhos_prev_long} control plane into the existing external Red
+Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include
+the Compute nodes for an HCI environment or dedicated storage nodes.
 
-To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must meet the following requirements:
+To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must
+meet the following requirements:
 
-* {Ceph} is running version 6 or later and is managed by cephadm/orchestrator.
-* NFS (ganesha) is migrated from a {OpenStackPreviousInstaller}-based deployment to cephadm. For more information, see xref:creating-a-ceph-nfs-cluster_migrating-databases[Creating a NFS Ganesha cluster].
-* Both the {Ceph} public and cluster networks are propagated, with {OpenStackPreviousInstaller}, to the target nodes.
-* Ceph MDS, Ceph Monitoring stack, Ceph MDS, Ceph RGW and other services have been migrated already to the target nodes;
+* {Ceph} is running version 6 or later and is managed by cephadm/orchestrator.
+* NFS Ganesha is migrated from a {OpenStackPreviousInstaller}-based
+  deployment to cephadm. For more information, see
+  xref:creating-a-ceph-nfs-cluster_migrating-databases[Creating a NFS Ganesha
+  cluster].
+* Both the {Ceph} public and cluster networks are propagated, with
+  {OpenStackPreviousInstaller}, to the target nodes.
+* Ceph MDS, Ceph Monitoring stack, Ceph RGW, and other services are already
+  migrated to the target nodes.
 ifeval::["{build}" != "upstream"]
-* The daemons distribution follows the cardinality constraints described in the doc link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage: Supported configurations]
+* The daemon distribution follows the cardinality constraints that are
+  described in link:https://access.redhat.com/articles/1548993[Red Hat Ceph
+  Storage: Supported configurations]
 endif::[]
-* The Ceph cluster is healthy, and the `ceph -s` command returns `HEALTH_OK`
-* The procedure keeps the mon IP addresses by moving them to the {Ceph} nodes
-* Drain the existing Controller nodes
-* Deploy additional monitors to the existing nodes, and promote them as
-_admin nodes that administrators can use to manage the {CephCluster} cluster and perform day 2 operations against it.
+* The {Ceph} cluster is healthy, and the `ceph -s` command returns `HEALTH_OK`.
+
+During the procedure to migrate the Ceph Mon daemons, the following actions
+occur:
+
+* The mon IP addresses are moved to the target {Ceph} nodes.
+* The existing Controller nodes are drained and decommissioned.
+* Additional monitors are deployed to the target nodes, and they are promoted
+  as `_admin` nodes that can be used to manage the {CephCluster} cluster and
+  perform day 2 operations.
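+
+Because the mon IP addresses are preserved, it can help to record them before
+you start. The following is a minimal sketch, not part of the formal procedure:
+
+----
+# Record the current mon hosts and their IP addresses for later comparison
+$ sudo cephadm shell -- ceph mon dump
+----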
 
 include::../modules/proc_migrating-mgr-from-controller-nodes.adoc[leveloffset=+1]
 
diff --git a/docs_user/assemblies/assembly_migrating-ceph-rgw.adoc b/docs_user/assemblies/assembly_migrating-ceph-rgw.adoc
index 3116e242a..4b7cd198f 100644
--- a/docs_user/assemblies/assembly_migrating-ceph-rgw.adoc
+++ b/docs_user/assemblies/assembly_migrating-ceph-rgw.adoc
@@ -11,8 +11,6 @@ To migrate Ceph Object Gateway (RGW), your environment must meet the following r
 * {Ceph} is running version 6 or later and is managed by cephadm/orchestrator.
 * An undercloud is still available, and the nodes and networks are managed by {OpenStackPreviousInstaller}.
 
-include::../modules/con_ceph-daemon-cardinality.adoc[leveloffset=+1]
-
 include::../modules/proc_completing-prerequisites-for-migrating-ceph-rgw.adoc[leveloffset=+1]
 
 include::../modules/proc_migrating-the-rgw-backends.adoc[leveloffset=+1]
diff --git a/docs_user/assemblies/ceph_migration.adoc b/docs_user/assemblies/ceph_migration.adoc
deleted file mode 100644
index e2882c25c..000000000
--- a/docs_user/assemblies/ceph_migration.adoc
+++ /dev/null
@@ -1,15 +0,0 @@
-ifdef::context[:parent-context: {context}]
-
-[id="ceph-migration_{context}"]
-
-= Ceph migration
-
-:context: ceph-migration
-
-:toc: left
-:toclevels: 3
-
-include::../modules/ceph-monitoring_migration.adoc[leveloffset=+1]
-
-ifdef::parent-context[:context: {parent-context}]
-ifndef::parent-context[:!context:]
diff --git a/docs_user/main.adoc b/docs_user/main.adoc
index 4c7652b04..d633b00e2 100644
--- a/docs_user/main.adoc
+++ b/docs_user/main.adoc
@@ -24,10 +24,4 @@ include::assemblies/assembly_adopting-the-data-plane.adoc[leveloffset=+1]
 
 include::assemblies/assembly_migrating-the-object-storage-service.adoc[leveloffset=+1]
 
-include::assemblies/assembly_migrating-ceph-monitoring-stack.adoc[leveloffset=+1]
-
-include::modules/proc_migrating-ceph-mds.adoc[leveloffset=+1]
-
-include::assemblies/assembly_migrating-ceph-rgw.adoc[leveloffset=+1]
-
-include::assemblies/assembly_migrating-ceph-rbd.adoc[leveloffset=+1]
+include::assemblies/assembly_migrating-ceph-cluster.adoc[leveloffset=+1]
diff --git a/docs_user/modules/con_ceph-daemon-cardinality.adoc b/docs_user/modules/con_ceph-daemon-cardinality.adoc
index 8ed18b3ff..50aaa2afe 100644
--- a/docs_user/modules/con_ceph-daemon-cardinality.adoc
+++ b/docs_user/modules/con_ceph-daemon-cardinality.adoc
@@ -2,19 +2,19 @@
 
 = {Ceph} daemon cardinality
 
-{Ceph} 6 and later applies strict constraints in the way daemons can be colocated within the same node.
+{Ceph} 6 and later applies strict constraints on the way daemons can be
+colocated within the same node.
 ifeval::["{build}" != "upstream"]
 For more information, see link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage: Supported configurations].
 endif::[]
-The resulting topology depends on the available hardware, as well as the amount of {Ceph} services present in the Controller nodes which are going to be retired.
-ifeval::["{build}" != "upstream"]
-For more information about the procedure that is required to migrate the RGW component and keep an HA model using the Ceph ingress daemon, see link:{defaultCephURL}/object_gateway_guide/index#high-availability-for-the-ceph-object-gateway[High availability for the Ceph Object Gateway] in _Object Gateway Guide_.
-endif::[]
-ifeval::["{build}" != "downstream"]
-The following document describes the procedure required to migrate the RGW component (and keep an HA model using the https://docs.ceph.com/en/latest/cephadm/services/rgw/#high-availability-service-for-rgw[Ceph Ingress daemon] in a common {OpenStackPreviousInstaller} scenario where Controller nodes represent the
-https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_cephadm/tasks/rgw.yaml#L26-L30[spec placement] where the service is deployed.
-endif::[]
-As a general rule, the number of services that can be migrated depends on the number of available nodes in the cluster. The following diagrams cover the distribution of the {Ceph} daemons on the {Ceph} nodes where at least three nodes are required in a scenario that sees only RGW and RBD, without the {dashboard_first_ref}:
+The resulting topology depends on the available hardware, as well as the number
+of {Ceph} services present in the Controller nodes that are going to be
+retired.
+As a general rule, the number of services that can be migrated depends on the
+number of available nodes in the cluster. The following diagrams cover the
+distribution of the {Ceph} daemons on the {Ceph} nodes, where at least three
+nodes are required in a scenario that includes only RGW and RBD, without the
+{Ceph} Dashboard:
 
 ----
 |    |                     |             |
@@ -24,7 +24,8 @@ As a general rule, the number of services that can be migrated depends on the nu
 | osd | mon/mgr/crash | rgw/ingress |
 ----
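+
+In cephadm terms, each row of these diagrams maps to a service spec whose
+`placement` section expresses the cardinality. The following is a minimal
+sketch; the label and count are example values, not part of the procedure:
+
+[source,yaml]
+----
+service_type: mon
+service_id: mon
+placement:
+  # schedule by label rather than by explicit hostnames
+  label: mon
+  # cap the daemon count to the intended cardinality
+  count: 3
+----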
-ifeval::["{build}" != "upstream"] -For more information about the procedure that is required to migrate the RGW component and keep an HA model using the Ceph ingress daemon, see link:{defaultCephURL}/object_gateway_guide/index#high-availability-for-the-ceph-object-gateway[High availability for the Ceph Object Gateway] in _Object Gateway Guide_. -endif::[] -ifeval::["{build}" != "downstream"] -The following document describes the procedure required to migrate the RGW component (and keep an HA model using the https://docs.ceph.com/en/latest/cephadm/services/rgw/#high-availability-service-for-rgw[Ceph Ingress daemon] in a common {OpenStackPreviousInstaller} scenario where Controller nodes represent the -https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_cephadm/tasks/rgw.yaml#L26-L30[spec placement] where the service is deployed. -endif::[] -As a general rule, the number of services that can be migrated depends on the number of available nodes in the cluster. The following diagrams cover the distribution of the {Ceph} daemons on the {Ceph} nodes where at least three nodes are required in a scenario that sees only RGW and RBD, without the {dashboard_first_ref}: +The resulting topology depends on the available hardware, as well as the amount +of {Ceph} services present in the Controller nodes that are going to be +retired. +As a general rule, the number of services that can be migrated depends on the +number of available nodes in the cluster. The following diagrams cover the +distribution of the {Ceph} daemons on the {Ceph} nodes where at least three +nodes are required in a scenario that includes only RGW and RBD, without the +{Ceph} Dashboard: ---- | | | | @@ -24,7 +24,8 @@ As a general rule, the number of services that can be migrated depends on the nu | osd | mon/mgr/crash | rgw/ingress | ---- -With the {dashboard}, and without {rhos_component_storage_file_first_ref} at least four nodes are required. The {dashboard} has no failover: +With the {dashboard}, and without {rhos_component_storage_file_first_ref}, at +least 4 nodes are required. The {Ceph} dashboard has no failover: ---- | | | | @@ -35,7 +36,8 @@ With the {dashboard}, and without {rhos_component_storage_file_first_ref} at lea | osd | rgw/ingress | (free) | ---- -With the {dashboard} and the {rhos_component_storage_file}, 5 nodes minimum are required, and the {dashboard} has no failover: +With the {Ceph} dashboard and the {rhos_component_storage_file}, 5 nodes +minimum are required, and the {Ceph} dashboard has no failover: ---- | | | | diff --git a/docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc b/docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc index 0941d623b..bcdf85a32 100644 --- a/docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc +++ b/docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc @@ -1,11 +1,16 @@ -[id="migrating-mgr-from-controller-nodes_{context}"] += Migrating Ceph Manager daemons to {Ceph} nodes -= Migrating Ceph Mgr daemons to {Ceph} nodes +The following section describes how to move Ceph Manager daemons from the +{rhos_prev_long} Controller nodes to a set of target nodes. Target nodes might +be pre-existing {Ceph} nodes, or {OpenStackShort} Compute nodes if {Ceph} is +deployed by {OpenStackPreviousInstaller} with an HCI topology. +This procedure assumes that Cephadm and the {Ceph} Orchestrator are the tools +that drive the Ceph Manager migration. 
 
-The following section describes how to move Ceph Mgr daemons from the
-OpenStack controller nodes to a set of target nodes. Target nodes might be
-pre-existing {Ceph} nodes, or OpenStack Compute nodes if Ceph is deployed by
-{OpenStackPreviousInstaller} with an HCI topology.
 
 .Prerequisites
@@ -17,57 +22,46 @@
 you do not have to run a stack update.
 
 .Procedure
 
-This procedure assumes that cephadm and the orchestrator are the tools that
-drive the Ceph Mgr migration. As done with the other Ceph daemons (MDS,
-Monitoring and RGW), the procedure uses the Ceph spec to modify the placement
-and reschedule the daemons. Ceph Mgr is run in an active/passive fashion, and
-it's also responsible to provide many modules, including the orchestrator.
-
-. Before start the migration, ssh into the target node and enable the firewall
-rules required to reach a Mgr service.
-[source,bash]
+. SSH into the target node and enable the firewall rules that are required to
+  reach a Manager service:
++
 ----
 dports="6800:7300"
 ssh heat-admin@<target_node> sudo iptables -I INPUT \
     -p tcp --match multiport --dports $dports -j ACCEPT;
 ----
++
+Repeat this step for each `<target_node>`.
 
-[NOTE]
-Repeat the previous action for each target_node.
-
-. Check the rules are properly applied and persist them:
+. Check that the rules are properly applied and persist them:
+
-[source,bash]
 ----
-sudo iptables-save
-sudo systemctl restart iptables
+$ sudo iptables-save
+$ sudo systemctl restart iptables
 ----
 
-. Prepare the target node to host the new Ceph Mgr daemon, and add the `mgr`
+. Prepare the target node to host the new Ceph Manager daemon, and add the `mgr`
 label to the target node:
+
-[source,bash]
 ----
-ceph orch host label add <target_node> mgr
+ceph orch host label add <target_node> mgr
 ----
++
+* Replace `<target_node>` with the hostname of the hosts listed in the
+  {CephCluster} cluster through the `ceph orch host ls` command.
 
-- Replace <target_node> with the hostname of the hosts listed in the {Ceph}
-  through the `ceph orch host ls` command.
-
-Repeat this action for each node that will be host a Ceph Mgr daemon.
-
-Get the Ceph Mgr spec and update the `placement` section to use `label` as the
-main scheduling strategy.
+Repeat the actions described above for each `<target_node>` that will host a
+Ceph Manager daemon.
 
-. Get the Ceph Mgr spec:
+. Get the Ceph Manager spec:
 +
 [source,yaml]
 ----
 sudo cephadm shell -- ceph orch ls --export mgr > mgr.yaml
 ----
 
-.Edit the retrieved spec and add the `label: mgr` section:
+. Edit the retrieved spec and add the `label: mgr` section to the `placement`
+  section:
 +
 [source,yaml]
 ----
@@ -77,17 +71,17 @@ placement:
   label: mgr
 ----
 
-. Save the spec in `/tmp/mgr.yaml`
-. Apply the spec with cephadm using the orchestrator:
+. Save the spec in the `/tmp/mgr.yaml` file.
+. Apply the spec with cephadm by using the orchestrator:
 +
 ----
 sudo cephadm shell -m /tmp/mgr.yaml -- ceph orch apply -i /mnt/mgr.yaml
 ----
 
-According to the numner of nodes where the `mgr` label is added, you will see a
-Ceph Mgr daemon count that matches the number of hosts.
+As a result of this procedure, you see a Ceph Manager daemon count that matches
+the number of hosts where the `mgr` label is added.
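+
+[NOTE]
+The `-m /tmp/mgr.yaml` option mounts the local file into the container that
+`cephadm shell` starts, which is why the apply command reads it from
+`/mnt/mgr.yaml`. A quick sketch to confirm the mapping, not part of the formal
+procedure:
+
+----
+$ sudo cephadm shell -m /tmp/mgr.yaml -- ls /mnt
+----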
 
-. Verify new Ceph Mgr have been created in the target_nodes:
+. Verify that the new Ceph Manager daemons are created in the target nodes:
 +
 ----
 ceph orch ps | grep -i mgr
@@ -95,6 +89,8 @@ ceph -s
 ----
 +
 [NOTE]
-The procedure does not shrink the Ceph Mgr daemons: the count is grown by the
-number of target nodes, and the xref:migrating-mon-from-controller-nodes[Ceph Mon migration procedure]
-will decommission the stand-by Ceph Mgr instances.
+The procedure does not shrink the Ceph Manager daemons. The count is grown by
+the number of target nodes, and migrating the Ceph Monitor daemons to the
+{Ceph} nodes decommissions the stand-by Ceph Manager instances. For more
+information, see
+xref:migrating-mon-from-controller-nodes_migrating-ceph-rbd[Migrating Ceph
+Monitor daemons to {Ceph} nodes].
diff --git a/docs_user/modules/proc_migrating-mon-from-controller-nodes.adoc b/docs_user/modules/proc_migrating-mon-from-controller-nodes.adoc
index ff768b33f..fc412f954 100644
--- a/docs_user/modules/proc_migrating-mon-from-controller-nodes.adoc
+++ b/docs_user/modules/proc_migrating-mon-from-controller-nodes.adoc
@@ -3,9 +3,13 @@
 = Migrating Ceph Monitor daemons to {Ceph} nodes
 
 The following section describes how to move Ceph Monitor daemons from the
-OpenStack controller nodes to a set of target nodes. Target nodes might be
-pre-existing {Ceph} nodes, or OpenStack Compute nodes if Ceph is deployed by
-{OpenStackPreviousInstaller} with an HCI topology.
+{rhos_prev_long} Controller nodes to a set of target nodes. Target nodes might
+be pre-existing {Ceph} nodes, or {OpenStackShort} Compute nodes if {Ceph} is
+deployed by {OpenStackPreviousInstaller} with an HCI topology.
+This procedure assumes that some of the steps are run on the source node that
+you want to decommission, while other steps are run on the target node that
+will host the redeployed daemon.
 
 .Prerequisites
 
@@ -17,7 +21,7 @@ you do not have to run a stack update.
 However, there are commands that you must perform to run `os-net-config` on
 the bare metal node and configure additional networks.
 
-.. If target nodes are `CephStorage`, ensure that the network is defined in the
+. If target nodes are `CephStorage`, ensure that the network is defined in the
  `metalsmith.yaml` for the CephStorageNodes:
 +
 [source,yaml]
 ----
@@ -41,15 +45,15 @@
       template: templates/single_nic_vlans/single_nic_vlans_storage.j2
 ----
 
-.. Run the following command:
+. Run the provisioning command:
 +
 ----
-openstack overcloud node provision \
+$ openstack overcloud node provision \
   -o overcloud-baremetal-deployed-0.yaml --stack overcloud-0 \
   --network-config -y --concurrency 2 /home/stack/metalsmith-0.yam
 ----
 
-.. Verify that the storage network is configured on the target nodes:
+. Verify that the storage network is configured on the target nodes:
 +
 ----
 (undercloud) [stack@undercloud ~]$ ssh heat-admin@192.168.24.14 ip -o -4 a
@@ -62,34 +66,30 @@
 
 .Procedure
 
-This procedure assumes that some of the steps are run on the source node that
-we want to decommission, while other steps are run on the target node that is
-supposed to host the redeployed daemon.
-
-. Before start the migration, ssh into the target node and enable the firewall
-rules required to reach a Mon service:
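+
+Before you run the steps below, you can confirm from the undercloud that the
+target nodes were provisioned as expected. This is a sketch, not part of the
+formal procedure, that assumes the `overcloud-0` stack shown above:
+
+----
+(undercloud) [stack@undercloud ~]$ metalsmith list
+----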
+. SSH into the target node and enable the firewall rules that are required to
+  reach a Mon service:
++
 ----
-for port in 3300 6789; {
+$ for port in 3300 6789; {
     ssh heat-admin@<target_node> sudo iptables -I INPUT \
     -p tcp -m tcp --dport $port -m conntrack --ctstate NEW \
     -j ACCEPT; }
 ----
++
+* Replace `<target_node>` with the hostname of the node that will host the
+  new mon.
 
-- Replace <target_node> with the hostname of the node that is supposed to host
-the new mon
-
-. Check the rules are properly applied and persist them:
+. Check that the rules are properly applied and persist them:
 +
 ----
 sudo iptables-save
 sudo systemctl restart iptables
 ----
 
-. To migrate the existing Mons to the target {Ceph} nodes, the first step is to
-create the following {Ceph} spec from the first mon (or the first controller)
-and modify the placement based on the appropriate label.
+. To migrate the existing Mons to the target {Ceph} nodes, create the following
+  {Ceph} spec from the first mon (or the first Controller node) and modify the
+  placement based on the appropriate label:
 +
 [source,yaml]
 ----
@@ -99,24 +99,24 @@ placement:
   label: mon
 ----
 
-. Save the spec in /tmp/mon.yaml
-. Apply the spec with cephadm using the orchestrator:
+. Save the spec in the `/tmp/mon.yaml` file.
+. Apply the spec with cephadm by using the orchestrator:
 +
 ----
-sudo cephadm shell -m /tmp/mon.yaml
-ceph orch apply -i /mnt/mon.yaml
+$ sudo cephadm shell -m /tmp/mon.yaml
+$ ceph orch apply -i /mnt/mon.yaml
 ----
 +
 [NOTE]
-The effect of applying the `mon.yaml` spec is to normalize the existng placement
-strategy to use `labels` instead of `hosts`. By doing this any node with the `mon`
-label is able to host a Ceph mon daemon.
-The step above can be executed once, and it shouldn't be repeated in case of
-multiple iterations over this procedure when multiple Ceph Mons are migrated.
+Applying the `mon.yaml` spec normalizes the existing placement strategy to use
+`labels` instead of `hosts`. As a result, any node with the `mon` label can
+host a Ceph mon daemon.
+Execute this step only once: do not repeat it when you iterate over this
+procedure to migrate multiple Ceph Mons.
 
-. Before moving to the next step, check the status of the {CephCluster} and the
-orchestrator daemons list: make sure the three mons are in quorum and listed by
-the `ceph orch` command:
+. Check the status of the {CephCluster} and the Ceph orchestrator daemons list.
+  Make sure that the three mons are in quorum and listed by the `ceph orch`
+  command:
 +
 ----
 # ceph -s
@@ -146,55 +146,57 @@ oc0-controller-1 192.168.24.23 _admin mgr mon
 oc0-controller-2 192.168.24.13 _admin mgr mon
 ----
 
-. On the source node, backup the `/etc/ceph/` directory: this allows, in case
-of issues, to have the ability to execute cephadm and get a shell to the ceph
-cluster from the source node:
+. On the source node, back up the `/etc/ceph/` directory. The backup allows you
+  to execute cephadm and get a shell to the {Ceph} cluster from the source node:
 +
 ----
-mkdir -p $HOME/ceph_client_backup
-sudo cp -R /etc/ceph $HOME/ceph_client_backup
+$ mkdir -p $HOME/ceph_client_backup
+$ sudo cp -R /etc/ceph $HOME/ceph_client_backup
 ----
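+
+To double-check the backup before you continue, list its contents. This is a
+small sketch; `ceph.conf` and the client keyring are typical contents, but
+your file names can differ:
+
+----
+$ ls -l $HOME/ceph_client_backup/ceph/
+----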
 
 . Before draining the source node and relocating the IP address of the storage
-network to to the target node, fail the mgr if is active on the source node:
+  network to the target node, fail the ceph-mgr if it is active on the
+  source node:
 +
 ----
-ceph mgr fail
+$ ceph mgr fail
 ----
 
 . Drain the source node and start the mon migration. From the cephadm shell,
-remove the labels on source node:
+  remove the labels on the source node:
 +
 ----
-  for label in mon mgr _admin; do
-    ceph orch host rm label $label;
-  done
+for label in mon mgr _admin; do
+    ceph orch host label rm <source_node> $label;
+done
 ----
 
 . Remove the running mon daemon from the source node:
 +
 ----
-cephadm shell -- ceph orch daemon rm mon. --force"
+$ cephadm shell -- ceph orch daemon rm mon.<source_node> --force
 ----
 
 . Run the drain command:
 +
 ----
-cephadm shell -- ceph drain
+$ cephadm shell -- ceph drain <source_node>
 ----
 
-. Remove the source_node host from the {CephCluster}:
+. Remove the `<source_node>` host from the {CephCluster}:
 +
 ----
-cephadm shell -- ceph orch host rm --force"
+$ cephadm shell -- ceph orch host rm <source_node> --force
 ----
++
+* Replace `<source_node>` with the hostname of the source node.
++
-
-- Replace <source_node> with the hostname of the source node.
-
-The source node is not part of the cluster anymore, and shouldn't appear in the
-Ceph host list when running `cephadm shell -- ceph orch host ls`.
-However, a `sudo podman ps` in the source node might list both mon and mgr still
-running.
+[NOTE]
+The source node is not part of the cluster anymore, and should not appear in
+the {Ceph} host list when `cephadm shell -- ceph orch host ls` is run.
+However, a `sudo podman ps` in the `<source_node>` might list both mon and mgr
+still up and running.
 
 ----
 [root@oc0-controller-1 ~]# sudo podman ps
@@ -225,10 +227,11 @@ for label in mon mgr _admin; do
     ceph orch host label add <target_node> $label;
 done
 ----
++
+* Replace <target_node> with the hostname of the host listed in the
+  {CephCluster} through the `ceph orch host ls` command.
 
-- Replace <target_node> with the hostname of the host listed in the {CephCluster}
-  through the `ceph orch host ls` command.
 
+[NOTE]
 At this point the cluster is running with only two mons, but a third mon
 appears and will be deployed on the target node.
 However, the third mon might be deployed on a different ip address available in
@@ -237,14 +240,14 @@ Even though the mon is deployed on the wrong ip address, it's useful keep the
 quorum to three and it ensures we do not risk to lose the cluster because two
 mons go in split brain.
 
-. Confirm the cluster has three mons and they're in quorum:
+. Confirm that the cluster has three mons and they are in quorum:
 +
 ----
-cephadm shell -- ceph -s
-cephamd shell -- ceph orch ps | grep -i mon
+$ cephadm shell -- ceph -s
+$ cephadm shell -- ceph orch ps | grep -i mon
 ----
 
-It's now possible to migrate the original mon ip address to the target node and
+It is now possible to migrate the original mon IP address to the target node and
 redeploy the existing mon on it.
 The following IP address migration procedure assumes that the target nodes
 have been originally deployed by {OpenStackPreviousInstaller} and the network
 configuration
@@ -260,7 +263,7 @@ line), for example:
 mon_host = [v2:172.17.3.60:3300/0,v1:172.17.3.60:6789/0] [v2:172.17.3.29:3300/0,v1:172.17.3.29:6789/0] [v2:172.17.3.53:3300/0,v1:172.17.3.53:6789/0]
 ----
 
-. Confirm the mon ip address is present on the source node `os-net-config`
+. Confirm that the mon IP address is present on the source node `os-net-config`
 configuration located in `/etc/os-net-config`:
 +
 ----
@@ -269,44 +272,39 @@ configuration located in `/etc/os-net-config`:
 - ip_netmask: 172.17.3.60/24
 ----
 
-. Edit the config file `/etc/os-net-config/config.yaml` and remove the
-`ip_netmask` line retrieved in the previous step.
+. Edit `/etc/os-net-config/config.yaml` and remove the `ip_netmask` line that
+  you retrieved in the previous step.
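+
+To confirm the edit before you refresh the network configuration, a quick
+sketch that uses the example address from above:
+
+----
+# No output means the address was removed from the file
+$ grep -n '172.17.3.60' /etc/os-net-config/config.yaml
+----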
 
-. Save the file and refresh the node network configuration with the following
-command:
+. Save the file and refresh the node network configuration:
 +
 ----
-sudo os-net-config -c /etc/os-net-config/config.yaml
+$ sudo os-net-config -c /etc/os-net-config/config.yaml
 ----
 
 . Verify, using the `ip` command, that the IP address is not present in the
 source node anymore.
 
-ssh into the target node, for example `cephstorage-0`, and add the ip address
-that will be bound to the new mon.
+. SSH into the target node, for example `cephstorage-0`, and add the IP address
+  for the new mon.
 
-. On the target node, edit the config file `/etc/os-net-config/config.yaml` and
-add the `- ip_netmask: 172.17.3.60` line removed in the source node.
+. On the target node, edit `/etc/os-net-config/config.yaml` and
+add the `- ip_netmask: 172.17.3.60` line that you removed in the source node.
 
-. Save the file and refresh the node network configuration with the following
-command:
+. Save the file and refresh the node network configuration:
 +
 ----
-sudo os-net-config -c /etc/os-net-config/config.yaml
+$ sudo os-net-config -c /etc/os-net-config/config.yaml
 ----
 
 . Verify, using the `ip` command, that the IP address is present in the
 target node.
 
-Get the Ceph spec and set the mon daemons to `unmanaged`:
-
 . Get the ceph mon spec:
 +
 ----
 ceph orch ls --export mon > mon.yaml
 ----
 
-.Edit the retrieved spec and add the `unamanged: true` keyword:
+. Edit the retrieved spec and add the `unmanaged: true` keyword:
 +
 [source,yaml]
 ----
@@ -314,45 +312,44 @@ service_type: mon
 service_id: mon
 placement:
   label: mon
-unamanged: true
+unmanaged: true
 ----
 
-. Save the spec in /tmp/mon.yaml
-. Apply the spec with cephadm using the orchestrator:
+. Save the spec in the `/tmp/mon.yaml` file.
+. Apply the spec with cephadm by using the orchestrator:
 +
 ----
-sudo cephadm shell -m /tmp/mon.yaml
-ceph orch apply -i /mnt/mon.yaml
+$ sudo cephadm shell -m /tmp/mon.yaml
+$ ceph orch apply -i /mnt/mon.yaml
 ----
-
-The mon daemons are marked as unmanaged, and it's now possible to redeploy
-the existing daemon and bind it to the migrated ip address.
++
+The mon daemons are marked as `unmanaged`, and it is now possible to redeploy
+the existing daemon and bind it to the migrated IP address.
 
 . Delete the existing mon on the target node:
 +
 ----
-$ ceph orch daemon add rm mon. --force
+$ ceph orch daemon rm mon.<target_node> --force
 ----
-
++
 . Redeploy the new mon on the target node by using the old IP address:
 +
 ----
-$ ceph orch daemon add mon :
+$ ceph orch daemon add mon <target_node>:<ip_address>
 ----
++
+* Replace `<target_node>` with the hostname of the target node enrolled in the
+  {Ceph} cluster.
+* Replace `<ip_address>` with the migrated IP address.
-
-- Replace <target_node> with the hostname of the target node enrolled in the Ceph
-cluster
-- Replace <ip_address> with the ip address of the migrated address
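+
+For example, with the sample values used throughout this procedure
+(`cephstorage-0` as the target node and `172.17.3.60` as the migrated
+address), the redeploy command looks like the following sketch; adjust it to
+your environment:
+
+----
+$ ceph orch daemon add mon cephstorage-0:172.17.3.60
+----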
 
-Get the Ceph spec and set the mon daemons to `unmanaged: false`:
-
 . Get the ceph mon spec:
 +
 ----
-ceph orch ls --export mon > mon.yaml
+$ ceph orch ls --export mon > mon.yaml
 ----
 
-.Edit the retrieved spec and set the `unamanged` keyword to `false`:
+. Edit the retrieved spec and set the `unmanaged` keyword to `false`:
 +
 [source,yaml]
 ----
@@ -360,42 +357,38 @@ service_type: mon
 service_id: mon
 placement:
   label: mon
-unamanged: false
+unmanaged: false
 ----
 
-. Save the spec in /tmp/mon.yaml
-. Apply the spec with cephadm using the orchestrator:
+. Save the spec in the `/tmp/mon.yaml` file.
+. Apply the spec with cephadm by using the Ceph Orchestrator:
 +
 ----
-sudo cephadm shell -m /tmp/mon.yaml
-ceph orch apply -i /mnt/mon.yaml
+$ sudo cephadm shell -m /tmp/mon.yaml
+$ ceph orch apply -i /mnt/mon.yaml
 ----
++
 The new mon runs on the target node with the original IP address.
 
-As last step of the mon migration, you need to refresh the cephadm information
-and reconfigure the existing daemons to exchange the map with the updated mon
-references.
 . Identify the running `mgr`:
 +
 ----
-sudo cephadm shell -- ceph -s
+$ sudo cephadm shell -- ceph -s
 ----
++
 . Refresh the mgr information by force-failing it:
 +
 ----
-ceph mgr fail
+$ ceph mgr fail
 ----
++
 . Refresh the `OSD` information:
 +
 ----
-ceph orch reconfig osd.default_drive_group
+$ ceph orch reconfig osd.default_drive_group
 ----
 +
-Verify that at this point the {CephCluster} cluster is healthy:
+Verify that the {CephCluster} cluster is healthy:
 +
 ----
 [ceph: root@oc0-controller-0 specs]# ceph -s
@@ -406,6 +399,5 @@
 ...
 ----
 
-. Repeat the procedure described in this section for any additional Controller
-node hosting a mon until you have migrated all the Ceph Mon daemons to the
-target nodes.
+. Repeat this procedure for any additional Controller node that hosts a mon
+  until you have migrated all the Ceph Mon daemons to the target nodes.
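+
+After the last iteration, you can confirm that no mon or mgr daemons are left
+on the Controller nodes. A final sketch, assuming that the Controller
+hostnames contain `controller`:
+
+----
+# An empty result means all mon/mgr daemons were migrated to the target nodes
+$ sudo cephadm shell -- ceph orch ps | grep -iE 'mon|mgr' | grep -i controller
+----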