From 97458fee23c1268f0bbe562617da31251b0ebdf6 Mon Sep 17 00:00:00 2001
From: Francesco Pantano
Date: Fri, 17 May 2024 22:45:03 +0200
Subject: [PATCH] Reorganize Ceph assemblies

Ceph assemblies are now reorganized to follow a simple rule/structure.
A main ceph-cluster migration assembly is included in main, and it contains
a quick intro and the ordered list of procedures, including the cardinality
section, which is critical here and will be improved in a follow-up patch.
This way the Ceph documentation is easier to access and maintain.
There are also fixes for wrong references (e.g. horizon != Ceph dashboard).

Signed-off-by: Francesco Pantano
---
 .../assembly_migrating-ceph-cluster.adoc      | 35 ++++++++++++++++++
 ...embly_migrating-ceph-monitoring-stack.adoc |  4 --
 .../assembly_migrating-ceph-rbd.adoc          | 37 ++++++++++++++-----
 .../assembly_migrating-ceph-rgw.adoc          |  2 -
 docs_user/assemblies/ceph_migration.adoc      | 15 --------
 docs_user/main.adoc                           |  8 +---
 .../modules/con_ceph-daemon-cardinality.adoc  | 26 +++++++------
 7 files changed, 77 insertions(+), 50 deletions(-)
 create mode 100644 docs_user/assemblies/assembly_migrating-ceph-cluster.adoc
 delete mode 100644 docs_user/assemblies/ceph_migration.adoc

diff --git a/docs_user/assemblies/assembly_migrating-ceph-cluster.adoc b/docs_user/assemblies/assembly_migrating-ceph-cluster.adoc
new file mode 100644
index 000000000..cd3f42973
--- /dev/null
+++ b/docs_user/assemblies/assembly_migrating-ceph-cluster.adoc
@@ -0,0 +1,35 @@
+ifdef::context[:parent-context: {context}]
+
+[id="ceph-migration_{context}"]
+
+= Migrating the Red Hat Ceph Storage cluster
+
+:context: migrating-ceph
+
+:toc: left
+:toclevels: 3
+
+ifdef::parent-context[:context: {parent-context}]
+ifndef::parent-context[:!context:]
+
+In the context of data plane adoption, where the {rhos_prev_long} ({OpenStackShort}) services are
+redeployed in {OpenShift}, a {OpenStackPreviousInstaller}-deployed {CephCluster} cluster undergoes
+a migration in a process referred to as “externalizing” the {CephCluster} cluster.
+There are, broadly, two deployment topologies that include an “internal” {CephCluster} cluster today:
+one where {OpenStackShort} includes dedicated {CephCluster} nodes to host object storage daemons
+(OSDs), and Hyperconverged Infrastructure (HCI), where Compute nodes double up as
+{CephCluster} nodes. In either scenario, some {Ceph} processes are deployed on {OpenStackShort}
+Controller nodes: {Ceph} monitors, Ceph Object Gateway (RGW), Rados Block Device (RBD), Ceph Metadata
+Server (MDS), Ceph Dashboard, and NFS Ganesha.
+This section describes the procedure to decommission the Controller nodes and move the Ceph daemons to a
+set of target nodes that are already part of the {CephCluster} cluster.
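+
+For example, before you start the migration, you can review where the {Ceph}
+daemons currently run and which target nodes are available. The following
+commands are only an illustrative check, and assume a cephadm-managed cluster
+and a node that has admin access to the {CephCluster} cluster:
+
+----
+# List the hosts that are part of the Ceph cluster and their labels
+ceph orch host ls
+# List the Ceph daemons and the hosts where they currently run
+ceph orch ps
+----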
+
+include::../modules/con_ceph-daemon-cardinality.adoc[leveloffset=+1]
+
+include::assembly_migrating-ceph-monitoring-stack.adoc[leveloffset=+1]
+
+include::../modules/proc_migrating-ceph-mds.adoc[leveloffset=+1]
+
+include::assembly_migrating-ceph-rgw.adoc[leveloffset=+1]
+
+include::assembly_migrating-ceph-rbd.adoc[leveloffset=+1]
diff --git a/docs_user/assemblies/assembly_migrating-ceph-monitoring-stack.adoc b/docs_user/assemblies/assembly_migrating-ceph-monitoring-stack.adoc
index fad8e444e..5261eee3d 100644
--- a/docs_user/assemblies/assembly_migrating-ceph-monitoring-stack.adoc
+++ b/docs_user/assemblies/assembly_migrating-ceph-monitoring-stack.adoc
@@ -4,10 +4,6 @@
 
 = Migrating the monitoring stack component to new nodes within an existing {Ceph} cluster
 
-In the context of data plane adoption, where the {rhos_prev_long} ({OpenStackShort}) services are
-redeployed in {OpenShift}, a {OpenStackPreviousInstaller}-deployed {CephCluster} cluster will undergo a migration in a process we are calling “externalizing” the {CephCluster} cluster.
-There are two deployment topologies, broadly, that include an “internal” {CephCluster} cluster today: one is where {OpenStackShort} includes dedicated {CephCluster} nodes to host object storage daemons (OSDs), and the other is Hyperconverged Infrastructure (HCI) where Compute nodes
-double up as {CephCluster} nodes. In either scenario, there are some {Ceph} processes that are deployed on {OpenStackShort} Controller nodes: {Ceph} monitors, Ceph Object Gateway (RGW), Rados Block Device (RBD), Ceph Metadata Server (MDS), Ceph Dashboard, and NFS Ganesha.
 
 The Ceph Dashboard module adds web-based monitoring and administration to the Ceph Manager.
 With {OpenStackPreviousInstaller}-deployed {Ceph} this component is enabled as part of the overcloud deploy and it’s composed by:
diff --git a/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc b/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
index 7269d405f..6e39993ea 100644
--- a/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
+++ b/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
@@ -4,22 +4,39 @@
 
 = Migrating Red Hat Ceph Storage RBD to external RHEL nodes
 
-For hyperconverged infrastructure (HCI) or dedicated Storage nodes that are running {Ceph} version 6 or later, you must migrate the daemons that are included in the {rhos_prev_long} control plane into the existing external Red Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include the Compute nodes for an HCI environment or dedicated storage nodes.
+For hyperconverged infrastructure (HCI) or dedicated Storage nodes that are
+running {Ceph} version 6 or later, you must migrate the daemons that are
+included in the {rhos_prev_long} control plane into the existing external Red
+Hat Enterprise Linux (RHEL) nodes. The external RHEL nodes typically include
+the Compute nodes for an HCI environment or dedicated storage nodes.
 
-To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must meet the following requirements:
+To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must
+meet the following requirements:
 
 * {Ceph} is running version 6 or later and is managed by cephadm/orchestrator.
-* NFS (ganesha) is migrated from a {OpenStackPreviousInstaller}-based deployment to cephadm. For more information, see xref:creating-a-ceph-nfs-cluster_migrating-databases[Creating a NFS Ganesha cluster].
-* Both the {Ceph} public and cluster networks are propagated, with {OpenStackPreviousInstaller}, to the target nodes.
-* Ceph MDS, Ceph Monitoring stack, Ceph MDS, Ceph RGW and other services have been migrated already to the target nodes;
+* NFS (ganesha) is migrated from a {OpenStackPreviousInstaller}-based
+  deployment to cephadm. For more information, see
+  xref:creating-a-ceph-nfs-cluster_migrating-databases[Creating a NFS Ganesha
+  cluster].
+* Both the {Ceph} public and cluster networks are propagated, with
+  {OpenStackPreviousInstaller}, to the target nodes.
+* Ceph MDS, the Ceph Monitoring stack, Ceph RGW, and other services have
+  already been migrated to the target nodes.
 ifeval::["{build}" != "upstream"]
-* The daemons distribution follows the cardinality constraints described in the doc link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage: Supported configurations]
+* The daemon distribution follows the cardinality constraints described in
+  link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage:
+  Supported configurations].
 endif::[]
 * The Ceph cluster is healthy, and the `ceph -s` command returns `HEALTH_OK`
-* The procedure keeps the mon IP addresses by moving them to the {Ceph} nodes
-* Drain the existing Controller nodes
-* Deploy additional monitors to the existing nodes, and promote them as
-_admin nodes that administrators can use to manage the {CephCluster} cluster and perform day 2 operations against it.
+
+The high-level procedure that migrates the Ceph Monitor daemons is based on the
+following assumptions:
+
+* It keeps the Ceph Monitor IP addresses by moving them to the target {Ceph} nodes.
+* It drains the existing Controller nodes that are going to be decommissioned.
+* It deploys additional monitors to the target nodes and promotes them as
+`_admin` nodes that administrators can use to manage the {CephCluster} cluster
+and perform day 2 operations.
 
 include::../modules/proc_migrating-mgr-from-controller-nodes.adoc[leveloffset=+1]
 
diff --git a/docs_user/assemblies/assembly_migrating-ceph-rgw.adoc b/docs_user/assemblies/assembly_migrating-ceph-rgw.adoc
index 3116e242a..4b7cd198f 100644
--- a/docs_user/assemblies/assembly_migrating-ceph-rgw.adoc
+++ b/docs_user/assemblies/assembly_migrating-ceph-rgw.adoc
@@ -11,8 +11,6 @@ To migrate Ceph Object Gateway (RGW), your environment must meet the following r
 * {Ceph} is running version 6 or later and is managed by cephadm/orchestrator.
 * An undercloud is still available, and the nodes and networks are managed by {OpenStackPreviousInstaller}.
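+
+For example, you can confirm that the {CephCluster} cluster meets these
+requirements before you start. The following commands are only an illustrative
+check, and assume that you run them from a node that has admin access to the
+{CephCluster} cluster:
+
+----
+# Confirm that the cluster is managed by the cephadm orchestrator
+ceph orch status
+# Confirm that the overall cluster health is HEALTH_OK
+ceph -s
+----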
 
-include::../modules/con_ceph-daemon-cardinality.adoc[leveloffset=+1]
-
 include::../modules/proc_completing-prerequisites-for-migrating-ceph-rgw.adoc[leveloffset=+1]
 
 include::../modules/proc_migrating-the-rgw-backends.adoc[leveloffset=+1]
diff --git a/docs_user/assemblies/ceph_migration.adoc b/docs_user/assemblies/ceph_migration.adoc
deleted file mode 100644
index e2882c25c..000000000
--- a/docs_user/assemblies/ceph_migration.adoc
+++ /dev/null
@@ -1,15 +0,0 @@
-ifdef::context[:parent-context: {context}]
-
-[id="ceph-migration_{context}"]
-
-= Ceph migration
-
-:context: ceph-migration
-
-:toc: left
-:toclevels: 3
-
-include::../modules/ceph-monitoring_migration.adoc[leveloffset=+1]
-
-ifdef::parent-context[:context: {parent-context}]
-ifndef::parent-context[:!context:]
diff --git a/docs_user/main.adoc b/docs_user/main.adoc
index 4c7652b04..d633b00e2 100644
--- a/docs_user/main.adoc
+++ b/docs_user/main.adoc
@@ -24,10 +24,4 @@ include::assemblies/assembly_adopting-the-data-plane.adoc[leveloffset=+1]
 
 include::assemblies/assembly_migrating-the-object-storage-service.adoc[leveloffset=+1]
 
-include::assemblies/assembly_migrating-ceph-monitoring-stack.adoc[leveloffset=+1]
-
-include::modules/proc_migrating-ceph-mds.adoc[leveloffset=+1]
-
-include::assemblies/assembly_migrating-ceph-rgw.adoc[leveloffset=+1]
-
-include::assemblies/assembly_migrating-ceph-rbd.adoc[leveloffset=+1]
+include::assemblies/assembly_migrating-ceph-cluster.adoc[leveloffset=+1]
diff --git a/docs_user/modules/con_ceph-daemon-cardinality.adoc b/docs_user/modules/con_ceph-daemon-cardinality.adoc
index 8ed18b3ff..192fb9ebb 100644
--- a/docs_user/modules/con_ceph-daemon-cardinality.adoc
+++ b/docs_user/modules/con_ceph-daemon-cardinality.adoc
@@ -2,19 +2,19 @@
 
 = {Ceph} daemon cardinality
 
-{Ceph} 6 and later applies strict constraints in the way daemons can be colocated within the same node.
+{Ceph} 6 and later applies strict constraints in the way daemons can be
+colocated within the same node.
 ifeval::["{build}" != "upstream"]
 For more information, see link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage: Supported configurations].
 endif::[]
-The resulting topology depends on the available hardware, as well as the amount of {Ceph} services present in the Controller nodes which are going to be retired.
-ifeval::["{build}" != "upstream"]
-For more information about the procedure that is required to migrate the RGW component and keep an HA model using the Ceph ingress daemon, see link:{defaultCephURL}/object_gateway_guide/index#high-availability-for-the-ceph-object-gateway[High availability for the Ceph Object Gateway] in _Object Gateway Guide_.
-endif::[]
-ifeval::["{build}" != "downstream"]
-The following document describes the procedure required to migrate the RGW component (and keep an HA model using the https://docs.ceph.com/en/latest/cephadm/services/rgw/#high-availability-service-for-rgw[Ceph Ingress daemon] in a common {OpenStackPreviousInstaller} scenario where Controller nodes represent the
-https://github.com/openstack/tripleo-ansible/blob/master/tripleo_ansible/roles/tripleo_cephadm/tasks/rgw.yaml#L26-L30[spec placement] where the service is deployed.
-endif::[]
-As a general rule, the number of services that can be migrated depends on the number of available nodes in the cluster. The following diagrams cover the distribution of the {Ceph} daemons on the {Ceph} nodes where at least three nodes are required in a scenario that sees only RGW and RBD, without the {dashboard_first_ref}:
+The resulting topology depends on the available hardware, as well as the number
+of {Ceph} services present in the Controller nodes that are going to be
+retired.
+As a general rule, the number of services that can be migrated depends on the
+number of available nodes in the cluster. The following diagrams cover the
+distribution of the {Ceph} daemons on the {Ceph} nodes, where at least three
+nodes are required in a scenario with only RGW and RBD, without the
+{Ceph} Dashboard:
 
 ----
 |   |   |   |
@@ -24,7 +24,8 @@ As a general rule, the number of services that can be migrated depends on the nu
 | osd | mon/mgr/crash | rgw/ingress |
 ----
 
-With the {dashboard}, and without {rhos_component_storage_file_first_ref} at least four nodes are required. The {dashboard} has no failover:
+With the {Ceph} Dashboard, and without {rhos_component_storage_file_first_ref},
+at least four nodes are required. The {Ceph} Dashboard has no failover:
 
 ----
 |   |   |   |
@@ -35,7 +36,8 @@ With the {dashboard}, and without {rhos_component_storage_file_first_ref} at lea
 | osd | rgw/ingress | (free) |
 ----
 
-With the {dashboard} and the {rhos_component_storage_file}, 5 nodes minimum are required, and the {dashboard} has no failover:
+With the {Ceph} Dashboard and the {rhos_component_storage_file}, a minimum of
+five nodes is required, and the {Ceph} Dashboard has no failover:
 
 ----
 |   |   |   |