Reorganize Ceph assemblies
The Ceph assemblies are reorganized to follow a simple structure.
A main ceph-cluster migration assembly is included in main, and it contains
a quick intro and the ordered list of procedures (including the cardinality
section, which is critical here and will be improved in a follow-up patch).
This way the Ceph documentation is easy to access and maintain. There are also
fixes to wrong references (e.g. Horizon != Ceph dashboard).

Signed-off-by: Francesco Pantano <[email protected]>
fmount committed Jun 5, 2024
1 parent 48f764f commit 62babe2
Showing 10 changed files with 252 additions and 224 deletions.
44 changes: 44 additions & 0 deletions docs_user/assemblies/assembly_migrating-ceph-cluster.adoc
@@ -0,0 +1,44 @@
ifdef::context[:parent-context: {context}]

[id="ceph-migration_{context}"]

= Migrating the {CephCluster} cluster

:context: migrating-ceph

:toc: left
:toclevels: 3


In the context of data plane adoption, where the {rhos_prev_long}
({OpenStackShort}) services are redeployed in {OpenShift}, you migrate a
{OpenStackPreviousInstaller}-deployed {CephCluster} cluster by using a process
called “externalizing” the {CephCluster} cluster.

There are two deployment topologies that include an internal {CephCluster}
cluster:

* {OpenStackShort} includes dedicated {CephCluster} nodes to host object
storage daemons (OSDs)

* Hyperconverged Infrastructure (HCI), where Compute and Storage services are
colocated on hyperconverged nodes

In either scenario, there are some {Ceph} processes that are deployed on
{OpenStackShort} Controller nodes: {Ceph} monitors, Ceph Object Gateway (RGW),
Rados Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS
Ganesha. To migrate your {CephCluster} cluster, you must decommission the
Controller nodes and move the {Ceph} daemons to a set of target nodes that are
already part of the {CephCluster} cluster.
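
Before you start, it can help to review where the {Ceph} daemons currently run.
The following commands are a sketch only; the Controller hostname is an example:

[source,bash]
----
# List the hosts that cephadm manages and the daemons placed on each host
sudo cephadm shell -- ceph orch host ls
# Review the daemons that are colocated on a Controller node (example hostname)
sudo cephadm shell -- ceph orch ps | grep controller-0
----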

include::../modules/con_ceph-daemon-cardinality.adoc[leveloffset=+1]

include::assembly_migrating-ceph-monitoring-stack.adoc[leveloffset=+1]

include::../modules/proc_migrating-ceph-mds.adoc[leveloffset=+1]

include::assembly_migrating-ceph-rgw.adoc[leveloffset=+1]

include::assembly_migrating-ceph-rbd.adoc[leveloffset=+1]

ifdef::parent-context[:context: {parent-context}]
ifndef::parent-context[:!context:]
32 changes: 17 additions & 15 deletions docs_user/assemblies/assembly_migrating-ceph-monitoring-stack.adoc
@@ -4,30 +4,32 @@

= Migrating the monitoring stack component to new nodes within an existing {Ceph} cluster

In the context of data plane adoption, where the {rhos_prev_long} ({OpenStackShort}) services are
redeployed in {OpenShift}, a {OpenStackPreviousInstaller}-deployed {CephCluster} cluster will undergo a migration in a process we are calling “externalizing” the {CephCluster} cluster.
There are two deployment topologies, broadly, that include an “internal” {CephCluster} cluster today: one is where {OpenStackShort} includes dedicated {CephCluster} nodes to host object storage daemons (OSDs), and the other is Hyperconverged Infrastructure (HCI) where Compute nodes
double up as {CephCluster} nodes. In either scenario, there are some {Ceph} processes that are deployed on {OpenStackShort} Controller nodes: {Ceph} monitors, Ceph Object Gateway (RGW), Rados Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS Ganesha.
The Ceph Dashboard module adds web-based monitoring and administration to the
Ceph Manager.
With {OpenStackPreviousInstaller}-deployed {Ceph} this component is enabled as part of the overcloud deploy and it’s composed by:
With {OpenStackPreviousInstaller}-deployed {Ceph}, this component is enabled as
part of the overcloud deployment, and it is composed of the following:

- Ceph Manager module
- Grafana
- Prometheus
- Alertmanager
- Node exporter

The Ceph Dashboard containers are included through `tripleo-container-image-prepare` parameters and the high availability relies on `Haproxy` and `Pacemaker` deployed on the {OpenStackShort} front.
For an external {CephCluster} cluster, high availability is not supported.
The goal of this procedure is to migrate and relocate the Ceph Monitoring
components to free Controller nodes.

For this procedure, we assume that we are beginning with a {OpenStackShort} based on {rhos_prev_ver} and a {Ceph} {CephRelease} deployment managed by {OpenStackPreviousInstaller}.
We assume that:

* {Ceph} has been upgraded to {CephRelease} and is managed by cephadm/orchestrator
* Both the {Ceph} public and cluster networks are propagated, through {OpenStackPreviousInstaller}, to the target nodes
The Ceph Dashboard containers are included through
`tripleo-container-image-prepare` parameters, and high availability relies
on `Haproxy` and `Pacemaker`, which are deployed on the {OpenStackShort} side.
For an external {CephCluster} cluster, high availability is not supported. The
goal of this procedure is to migrate and relocate the Ceph Monitoring
components to the target nodes, which frees the Controller nodes.

This procedure assumes that you begin with a {OpenStackShort} environment based
on {rhos_prev_ver} and a {Ceph} {CephRelease} deployment that is managed by
{OpenStackPreviousInstaller}. The procedure also assumes the following:

* {Ceph} has been upgraded to {CephRelease} and is managed by
cephadm/orchestrator
* Both the {Ceph} public and cluster networks are propagated,
through {OpenStackPreviousInstaller}, to the target nodes
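
As a sketch of how you can confirm these assumptions, the following commands
assume access to a node that holds the {Ceph} admin keyring:

[source,bash]
----
# Confirm that the cluster is managed by cephadm and the orchestrator backend is active
sudo cephadm shell -- ceph orch status
# Review where the monitoring stack daemons currently run
sudo cephadm shell -- ceph orch ps | grep -iE 'grafana|prometheus|alertmanager|node-exporter'
----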

include::../modules/proc_completing-prerequisites-for-migrating-ceph-monitoring-stack.adoc[leveloffset=+1]

41 changes: 29 additions & 12 deletions docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
@@ -4,22 +4,39 @@

= Migrating Red Hat Ceph Storage RBD to external RHEL nodes

For hyperconverged infrastructure (HCI) or dedicated Storage nodes that are running {Ceph} version 6 or later, you must migrate the daemons that are included in the {rhos_prev_long} control plane into the existing external Red Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include the Compute nodes for an HCI environment or dedicated storage nodes.
For Hyperconverged Infrastructure (HCI) or dedicated Storage nodes that are
running {Ceph} version 6 or later, you must migrate the daemons that are
included in the {rhos_prev_long} control plane into the existing external Red
Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include
the Compute nodes for an HCI environment or dedicated storage nodes.

To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must meet the following requirements:
To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must
meet the following requirements:

* {Ceph} is running version 6 or later and is managed by cephadm/orchestrator.
* NFS (ganesha) is migrated from a {OpenStackPreviousInstaller}-based deployment to cephadm. For more information, see xref:creating-a-ceph-nfs-cluster_migrating-databases[Creating a NFS Ganesha cluster].
* Both the {Ceph} public and cluster networks are propagated, with {OpenStackPreviousInstaller}, to the target nodes.
* Ceph MDS, Ceph Monitoring stack, Ceph MDS, Ceph RGW and other services have been migrated already to the target nodes;
* {Ceph} is running version 6 or later and is managed by cephadm.
* NFS Ganesha is migrated from a {OpenStackPreviousInstaller}-based
deployment to cephadm. For more information, see
xref:creating-a-ceph-nfs-cluster_migrating-databases[Creating a NFS Ganesha
cluster].
* Both the {Ceph} public and cluster networks are propagated, with
{OpenStackPreviousInstaller}, to the target nodes.
* Ceph MDS, the Ceph Monitoring stack, Ceph RGW, and other services are already
migrated to the target nodes.
ifeval::["{build}" != "upstream"]
* The daemons distribution follows the cardinality constraints described in the doc link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage: Supported configurations]
* The daemons distribution follows the cardinality constraints that are
described in link:https://access.redhat.com/articles/1548993[Red Hat Ceph
Storage: Supported configurations].
endif::[]
* The Ceph cluster is healthy, and the `ceph -s` command returns `HEALTH_OK`
* The procedure keeps the mon IP addresses by moving them to the {Ceph} nodes
* Drain the existing Controller nodes
* Deploy additional monitors to the existing nodes, and promote them as
_admin nodes that administrators can use to manage the {CephCluster} cluster and perform day 2 operations against it.
* The {Ceph} cluster is healthy, and the `ceph -s` command returns `HEALTH_OK`.
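
For example, a quick health check before you start might look like the
following sketch, run from a node that has the admin keyring:

[source,bash]
----
# The cluster must report HEALTH_OK before you start the migration
sudo cephadm shell -- ceph -s
# Show details if the cluster reports warnings or errors
sudo cephadm shell -- ceph health detail
----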

During the procedure to migrate the Ceph Mon daemons, the following actions
occur:

* The mon IP addresses are moved to the target {Ceph} nodes.
* The existing Controller nodes are drained and decommissioned.
* Additional monitors are deployed to the target nodes, and they are promoted
as `_admin` nodes that can be used to manage the {CephCluster} cluster and
perform day 2 operations.

include::../modules/proc_migrating-mgr-from-controller-nodes.adoc[leveloffset=+1]

2 changes: 0 additions & 2 deletions docs_user/assemblies/assembly_migrating-ceph-rgw.adoc
@@ -11,8 +11,6 @@ To migrate Ceph Object Gateway (RGW), your environment must meet the following r
* {Ceph} is running version 6 or later and is managed by cephadm/orchestrator.
* An undercloud is still available, and the nodes and networks are managed by {OpenStackPreviousInstaller}.
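
As a sketch, you can confirm the {Ceph} release and the orchestrator backend as
follows:

[source,bash]
----
# Report the version that each daemon runs
sudo cephadm shell -- ceph versions
# Confirm that the cephadm orchestrator backend is active
sudo cephadm shell -- ceph orch status
----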

include::../modules/con_ceph-daemon-cardinality.adoc[leveloffset=+1]

include::../modules/proc_completing-prerequisites-for-migrating-ceph-rgw.adoc[leveloffset=+1]

include::../modules/proc_migrating-the-rgw-backends.adoc[leveloffset=+1]
15 changes: 0 additions & 15 deletions docs_user/assemblies/ceph_migration.adoc

This file was deleted.

8 changes: 1 addition & 7 deletions docs_user/main.adoc
@@ -24,10 +24,4 @@ include::assemblies/assembly_adopting-the-data-plane.adoc[leveloffset=+1]

include::assemblies/assembly_migrating-the-object-storage-service.adoc[leveloffset=+1]

include::assemblies/assembly_migrating-ceph-monitoring-stack.adoc[leveloffset=+1]

include::modules/proc_migrating-ceph-mds.adoc[leveloffset=+1]

include::assemblies/assembly_migrating-ceph-rgw.adoc[leveloffset=+1]

include::assemblies/assembly_migrating-ceph-rbd.adoc[leveloffset=+1]
include::assemblies/assembly_migrating-ceph-cluster.adoc[leveloffset=+1]
26 changes: 14 additions & 12 deletions docs_user/modules/con_ceph-daemon-cardinality.adoc
@@ -2,19 +2,19 @@

= {Ceph} daemon cardinality

{Ceph} 6 and later applies strict constraints in the way daemons can be colocated within the same node.
{Ceph} 6 and later applies strict constraints in the way daemons can be
colocated within the same node.
ifeval::["{build}" != "upstream"]
For more information, see link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage: Supported configurations].
endif::[]
The resulting topology depends on the available hardware, as well as the amount of {Ceph} services present in the Controller nodes which are going to be retired.
ifeval::["{build}" != "upstream"]
For more information about the procedure that is required to migrate the RGW component and keep an HA model using the Ceph ingress daemon, see link:{defaultCephURL}/object_gateway_guide/index#high-availability-for-the-ceph-object-gateway[High availability for the Ceph Object Gateway] in _Object Gateway Guide_.
endif::[]
ifeval::["{build}" != "downstream"]
The following document describes the procedure required to migrate the RGW component (and keep an HA model using the https://docs.ceph.com/en/latest/cephadm/services/rgw/#high-availability-service-for-rgw[Ceph Ingress daemon] in a common {OpenStackPreviousInstaller} scenario where Controller nodes represent the
https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_cephadm/tasks/rgw.yaml#L26-L30[spec placement] where the service is deployed.
endif::[]
As a general rule, the number of services that can be migrated depends on the number of available nodes in the cluster. The following diagrams cover the distribution of the {Ceph} daemons on the {Ceph} nodes where at least three nodes are required in a scenario that sees only RGW and RBD, without the {dashboard_first_ref}:
The resulting topology depends on the available hardware, as well as the number
of {Ceph} services present in the Controller nodes that are going to be
retired.
As a general rule, the number of services that can be migrated depends on the
number of available nodes in the cluster. The following diagrams show the
distribution of the {Ceph} daemons on the {Ceph} nodes. At least three nodes
are required in a scenario that includes only RGW and RBD, without the
{Ceph} Dashboard:

----
| | | |
@@ -24,7 +24,8 @@ As a general rule, the number of services that can be migrated depends on the nu
| osd | mon/mgr/crash | rgw/ingress |
----
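
As a sketch, you can inspect how the placement of each service is currently
expressed, and compare it with the hosts and labels that are available:

[source,bash]
----
# Export the current service specifications to review their placement sections
sudo cephadm shell -- ceph orch ls --export
# List the hosts and labels that are available for placement
sudo cephadm shell -- ceph orch host ls
----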

With the {Ceph} Dashboard, and without {rhos_component_storage_file_first_ref},
at least four nodes are required. The {Ceph} Dashboard has no failover:

----
| | | |
@@ -35,7 +36,8 @@ With the {dashboard}, and without {rhos_component_storage_file_first_ref} at lea
| osd | rgw/ingress | (free) |
----

With the {dashboard} and the {rhos_component_storage_file}, 5 nodes minimum are required, and the {dashboard} has no failover:
With the {Ceph} Dashboard and the {rhos_component_storage_file}, at least five
nodes are required, and the {Ceph} Dashboard has no failover:

----
| | | |
85 changes: 40 additions & 45 deletions docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc
@@ -1,73 +1,66 @@
[id="migrating-mgr-from-controller-nodes_{context}"]
= Migrating Ceph Manager daemons to {Ceph} nodes

= Migrating Ceph Mgr daemons to {Ceph} nodes
The following section describes how to move Ceph Manager daemons from the
{rhos_prev_long} Controller nodes to a set of target nodes. Target nodes might
be pre-existing {Ceph} nodes, or {OpenStackShort} Compute nodes if {Ceph} is
deployed by {OpenStackPreviousInstaller} with an HCI topology.
This procedure assumes that Cephadm and the {Ceph} Orchestrator are the tools
that drive the Ceph Manager migration. As is done with the other Ceph daemons
(MDS, Monitoring, and RGW), the procedure uses the Ceph spec to modify the
placement and reschedule the daemons. Ceph Manager is run in an active/passive
fashion, and it also provides many modules, including the Ceph orchestrator.

The following section describes how to move Ceph Mgr daemons from the
OpenStack controller nodes to a set of target nodes. Target nodes might be
pre-existing {Ceph} nodes, or OpenStack Compute nodes if Ceph is deployed by
{OpenStackPreviousInstaller} with an HCI topology.

.Prerequisites

Configure the target nodes (CephStorage or ComputeHCI) to have both `storage`
* Configure the target nodes (CephStorage or ComputeHCI) to have both `storage`
and `storage_mgmt` networks to ensure that you can use both the {Ceph} public
and cluster networks from the same node. This step requires you to interact
with {OpenStackPreviousInstaller}. From {rhos_prev_long} {rhos_prev_ver} and
later, you do not have to run a stack update.
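
A quick way to check the networks on a target node might look like the
following sketch; interface and network names vary by deployment:

[source,bash]
----
# Both the storage and storage_mgmt networks must be configured on the target node
ssh heat-admin@<target_node> ip -brief addr
----
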
.Procedure

This procedure assumes that cephadm and the orchestrator are the tools that
drive the Ceph Mgr migration. As done with the other Ceph daemons (MDS,
Monitoring and RGW), the procedure uses the Ceph spec to modify the placement
and reschedule the daemons. Ceph Mgr is run in an active/passive fashion, and
it's also responsible to provide many modules, including the orchestrator.

. Before start the migration, ssh into the target node and enable the firewall
rules required to reach a Mgr service.
[source,bash]
. SSH into the target node and enable the firewall rules that are required to
reach a Ceph Manager service:
+
----
dports="6800:7300"
ssh heat-admin@<target_node> sudo iptables -I INPUT \
-p tcp --match multiport --dports $dports -j ACCEPT;
----
+
Repeat this step for each `<target_node>`.
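+
Optionally, as a sketch, you can confirm that the rule is in place on a target
node before you continue:
+
----
# The Ceph Manager ports (6800-7300/tcp) must be accepted on the target node
ssh heat-admin@<target_node> sudo iptables -L INPUT -n | grep '6800:7300'
----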

[NOTE]
Repeat the previous action for each target_node.

. Check the rules are properly applied and persist them:
. Check that the rules are properly applied and persist them:
+
[source,bash]
----
sudo iptables-save
sudo systemctl restart iptables
$ sudo iptables-save
$ sudo systemctl restart iptables
----

. Prepare the target node to host the new Ceph Mgr daemon, and add the `mgr`
+
. Prepare the target node to host the new Ceph Manager daemon, and add the `mgr`
label to the target node:
+
[source,bash]
----
ceph orch host label add <target_node> mgr
----
+
* Replace `<target_node>` with the hostname of a host that is listed in the
{Ceph} cluster, as reported by the `ceph orch host ls` command.
+
Repeat this action for each `<target_node>` that will host a Ceph Manager
daemon.
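+
As an illustrative sketch, you can label all of the target nodes in one pass;
the hostnames are examples:
+
[source,bash]
----
# Example only: label every target node that will host a Ceph Manager daemon
for node in ceph-0 ceph-1 ceph-2; do
    sudo cephadm shell -- ceph orch host label add "$node" mgr
done
----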
- Replace <target_node> with the hostname of the hosts listed in the {Ceph}
through the `ceph orch host ls` command.

Repeat this action for each node that will be host a Ceph Mgr daemon.

Get the Ceph Mgr spec and update the `placement` section to use `label` as the
main scheduling strategy.

. Get the Ceph Mgr spec:
. Get the Ceph Manager spec:
+
[source,bash]
----
sudo cephadm shell -- ceph orch ls --export mgr > mgr.yaml
----

.Edit the retrieved spec and add the `label: mgr` section:
. Edit the retrieved spec and add `label: mgr` to the `placement` section:
+
[source,yaml]
----
@@ -77,24 +70,26 @@ placement:
label: mgr
----

. Save the spec in `/tmp/mgr.yaml`
. Apply the spec with cephadm using the orchestrator:
. Save the spec in the `/tmp/mgr.yaml` file.
. Apply the spec with cephadm by using the orchestrator:
+
----
sudo cephadm shell -m /tmp/mgr.yaml -- ceph orch apply -i /mnt/mgr.yaml
----
+
As a result of this procedure, you see a Ceph Manager daemon count that matches
the number of hosts where the `mgr` label is added.

According to the numner of nodes where the `mgr` label is added, you will see a
Ceph Mgr daemon count that matches the number of hosts.

. Verify new Ceph Mgr have been created in the target_nodes:
. Verify that the new Ceph Manager daemons are created in the target nodes:
+
----
ceph orch ps | grep -i mgr
ceph -s
----
+
[NOTE]
The procedure does not shrink the Ceph Mgr daemons: the count is grown by the
number of target nodes, and the xref:migrating-mon-from-controller-nodes[Ceph Mon migration procedure]
will decommission the stand-by Ceph Mgr instances.
The procedure does not shrink the number of Ceph Manager daemons. The count
grows by the number of target nodes, and the procedure to migrate Ceph Monitor
daemons to {Ceph} nodes decommissions the standby Ceph Manager instances. For
more information, see
xref:migrating-mon-from-controller-nodes_migrating-ceph-rbd[Migrating Ceph Monitor
daemons to {Ceph} nodes].
