From a915f2a6d80e449e4db3a83db35f712454ced945 Mon Sep 17 00:00:00 2001
From: Francesco Pantano
Date: Mon, 13 May 2024 17:21:13 +0200
Subject: [PATCH] Rework Ceph RBD migration documentation

This patch represents a rework of the current RBD documentation to move
it from a POC to a procedure that we can test in CI.
In particular:

- Split the RBD procedure between the Ceph Mgr and the Ceph Mon migration
- Rework both the Mgr and Mon docs so that they read as procedures that
  the user can follow

Signed-off-by: Francesco Pantano
---
 .../assembly_migrating-ceph-rbd.adoc          |  13 +-
 ...c_migrating-mgr-from-controller-nodes.adoc |  95 ++++
 ...ing-mon-and-mgr-from-controller-nodes.adoc | 376 ----------------
 ...c_migrating-mon-from-controller-nodes.adoc | 411 ++++++++++++++++++
 4 files changed, 517 insertions(+), 378 deletions(-)
 create mode 100644 docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc
 delete mode 100644 docs_user/modules/proc_migrating-mon-and-mgr-from-controller-nodes.adoc
 create mode 100644 docs_user/modules/proc_migrating-mon-from-controller-nodes.adoc

diff --git a/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc b/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
index 1b5cffd46..823e3b01d 100644
--- a/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
+++ b/docs_user/assemblies/assembly_migrating-ceph-rbd.adoc
@@ -11,7 +11,16 @@ To migrate Red Hat Ceph Storage Rados Block Device (RBD), your environment must
 * {Ceph} is running version 6 or later and is managed by cephadm/orchestrator.
 * NFS (ganesha) is migrated from a {OpenStackPreviousInstaller}-based deployment to cephadm. For more information, see xref:creating-a-ceph-nfs-cluster_migrating-databases[Creating a NFS Ganesha cluster].
 * Both the {Ceph} public and cluster networks are propagated, with {OpenStackPreviousInstaller}, to the target nodes.
-* Ceph Monitors need to keep their IPs to avoid cold migration.
+* Ceph MDS, the Ceph Monitoring stack, Ceph RGW, and other services have already been migrated to the target nodes.
+ifeval::["{build}" != "downstream"]
+* The daemon distribution follows the cardinality constraints described in link:https://access.redhat.com/articles/1548993[Red Hat Ceph Storage: Supported configurations].
+endif::[]
+* The Ceph cluster is healthy, and the `ceph -s` command returns `HEALTH_OK`.
+* The procedure keeps the mon IP addresses by moving them to the {Ceph} nodes.
+* The existing Controller nodes are drained.
+* Additional monitors are deployed to the existing nodes and promoted as
+_admin nodes that administrators can use to manage the {CephCluster} cluster and perform day 2 operations against it.
 
-include::../modules/proc_migrating-mon-and-mgr-from-controller-nodes.adoc[leveloffset=+1]
+include::../modules/proc_migrating-mgr-from-controller-nodes.adoc[leveloffset=+1]
+include::../modules/proc_migrating-mon-from-controller-nodes.adoc[leveloffset=+1]
diff --git a/docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc b/docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc
new file mode 100644
index 000000000..8578f4204
--- /dev/null
+++ b/docs_user/modules/proc_migrating-mgr-from-controller-nodes.adoc
@@ -0,0 +1,95 @@
+[id="migrating-mgr-from-controller-nodes_{context}"]
+
+= Migrating Ceph Mgr daemons to {Ceph} nodes
+
+The following section describes how to move Ceph Mgr daemons from the
+OpenStack Controller nodes to a set of target nodes.
+Target nodes might be pre-existing {Ceph} nodes, or OpenStack Compute nodes if
+Ceph is deployed by {OpenStackPreviousInstaller} with an HCI topology.
+
+.Prerequisites
+
+Configure the target nodes (CephStorage or ComputeHCI) to have both `storage`
+and `storage_mgmt` networks to ensure that you can use both {Ceph} public and
+cluster networks from the same node. This step requires you to interact with
+{OpenStackPreviousInstaller}. From {rhos_prev_long} {rhos_prev_ver} and later,
+you do not have to run a stack update.
+
+.Procedure
+
+This procedure assumes that cephadm and the orchestrator are the tools that
+drive the Ceph Mgr migration. As done with the other Ceph daemons (MDS,
+Monitoring, and RGW), the procedure uses the Ceph spec to modify the placement
+and reschedule the daemons. Ceph Mgr runs in an active/passive fashion, and it
+is also responsible for providing many modules, including the orchestrator.
+
+. Before starting the migration, SSH into the target node and enable the
+firewall rules that are required to reach a Mgr service:
++
+----
+dports="6800:7300"
+ssh heat-admin@<target_node> sudo iptables -I INPUT \
+    -p tcp --match multiport --dports $dports -j ACCEPT;
+----
+
+[NOTE]
+Repeat the previous action for each target node.
+
+. Check that the rules are properly applied and persist them:
++
+----
+sudo iptables-save
+sudo systemctl restart iptables
+----
+
+. Prepare the target node to host the new Ceph Mgr daemon, and add the `mgr`
+label to the target node:
++
+----
+ceph orch host label add <target_node> mgr
+----
+
+- Replace `<target_node>` with the hostname of the host listed in the Ceph
+cluster through the `ceph orch host ls` command.
+
+Repeat this action for each node that will host a Ceph Mgr daemon.
+
+Get the Ceph Mgr spec and update the `placement` section to use `label` as the
+main scheduling strategy.
+
+. Get the Ceph Mgr spec:
++
+----
+sudo cephadm shell -- ceph orch ls --export mgr > mgr.yaml
+----
+
+. Edit the retrieved spec and add the `label: mgr` section:
++
+[source,yaml]
+----
+service_type: mgr
+service_id: mgr
+placement:
+  label: mgr
+----
+
+. Save the spec in `/tmp/mgr.yaml`.
+. Apply the spec with cephadm using the orchestrator:
++
+----
+sudo cephadm shell -m /tmp/mgr.yaml -- ceph orch apply -i /mnt/mgr.yaml
+----
+
+According to the number of nodes where the `mgr` label is added, you will see a
+Ceph Mgr daemon count that matches the number of labeled hosts.
+
+. Verify that the new Ceph Mgr daemons have been created on the target nodes:
++
+----
+ceph orch ps | grep -i mgr
+----
++
+[NOTE]
+This procedure does not shrink the number of Ceph Mgr daemons: the count grows
+by the number of target nodes, and the xref:migrating-mon-from-controller-nodes[Ceph Mon migration procedure]
+decommissions the stand-by Ceph Mgr instances.
diff --git a/docs_user/modules/proc_migrating-mon-and-mgr-from-controller-nodes.adoc b/docs_user/modules/proc_migrating-mon-and-mgr-from-controller-nodes.adoc
deleted file mode 100644
index b2827a58a..000000000
--- a/docs_user/modules/proc_migrating-mon-and-mgr-from-controller-nodes.adoc
+++ /dev/null
@@ -1,376 +0,0 @@
-[id="migrating-mon-and-mgr-from-controller-nodes_{context}"]
-
-= Migrating Ceph Monitor and Ceph Manager daemons to {Ceph} nodes
-//kgilliga: This procedure needs to be revisited. It should not be a POC.
-Migrate your Ceph Monitor daemons, Ceph Manager daemons, and object storage daemons (OSDs) from your {rhos_prev_long} Controller nodes to existing {Ceph} nodes.
During the migration, ensure that you can do the following actions: - -* Keep the mon IP addresses by moving them to the {Ceph} nodes. -* Drain the existing Controller nodes and shut them down. -* Deploy additional monitors to the existing nodes, and promote them as -_admin nodes that administrators can use to manage the {CephCluster} cluster and perform day 2 operations against it. -* Keep the {CephCluster} cluster operational during the migration. - -The following procedure shows an example migration from a Controller node (`oc0-controller-1`) and a {Ceph} node (`oc0-ceph-0`). Use the names of the nodes in your environment. - -.Prerequisites - -* Configure the Storage nodes to have both storage and storage_mgmt -network to ensure that you can use both {Ceph} public and cluster networks. This step requires you to interact with {OpenStackPreviousInstaller}. From {rhos_prev_long} {rhos_prev_ver} and later you do not have to run a stack update. However, there are commands that you must perform to run `os-net-config` on the bare metal node and configure additional networks. - -.. Ensure that the network is defined in the `metalsmith.yaml` for the CephStorageNodes: -+ -[source,yaml] ----- - - name: CephStorage - count: 2 - instances: - - hostname: oc0-ceph-0 - name: oc0-ceph-0 - - hostname: oc0-ceph-1 - name: oc0-ceph-1 - defaults: - networks: - - network: ctlplane - vif: true - - network: storage_cloud_0 - subnet: storage_cloud_0_subnet - - network: storage_mgmt_cloud_0 - subnet: storage_mgmt_cloud_0_subnet - network_config: - template: templates/single_nic_vlans/single_nic_vlans_storage.j2 ----- - -.. Run the following command: -+ ----- -openstack overcloud node provision \ - -o overcloud-baremetal-deployed-0.yaml --stack overcloud-0 \ - --network-config -y --concurrency 2 /home/stack/metalsmith-0.yam ----- - -.. Verify that the storage network is running on the node: -+ ----- -(undercloud) [CentOS-9 - stack@undercloud ~]$ ssh heat-admin@192.168.24.14 ip -o -4 a -Warning: Permanently added '192.168.24.14' (ED25519) to the list of known hosts. -1: lo inet 127.0.0.1/8 scope host lo\ valid_lft forever preferred_lft forever -5: br-storage inet 192.168.24.14/24 brd 192.168.24.255 scope global br-storage\ valid_lft forever preferred_lft forever -6: vlan1 inet 192.168.24.14/24 brd 192.168.24.255 scope global vlan1\ valid_lft forever preferred_lft forever -7: vlan11 inet 172.16.11.172/24 brd 172.16.11.255 scope global vlan11\ valid_lft forever preferred_lft forever -8: vlan12 inet 172.16.12.46/24 brd 172.16.12.255 scope global vlan12\ valid_lft forever preferred_lft forever ----- - -.Procedure - -. To migrate mon(s) and mgr(s) on the two existing {Ceph} nodes, create a {Ceph} spec based on the default roles with the mon/mgr on the controller nodes. -+ ----- -openstack overcloud ceph spec -o ceph_spec.yaml -y \ - --stack overcloud-0 overcloud-baremetal-deployed-0.yaml ----- - -. Deploy the {CephCluster} cluster: -+ ----- - openstack overcloud ceph deploy overcloud-baremetal-deployed-0.yaml \ - --stack overcloud-0 -o deployed_ceph.yaml \ - --network-data ~/oc0-network-data.yaml \ - --ceph-spec ~/ceph_spec.yaml ----- -+ -[NOTE] -The `ceph_spec.yaml`, which is the OSP-generated description of the {CephCluster} cluster, -will be used, later in the process, as the basic template required by cephadm to update the status/info of the daemons. - -. 
Check the status of the {CephCluster} cluster: -+ ----- -[ceph: root@oc0-controller-0 /]# ceph -s - cluster: - id: f6ec3ebe-26f7-56c8-985d-eb974e8e08e3 - health: HEALTH_OK - - services: - mon: 3 daemons, quorum oc0-controller-0,oc0-controller-1,oc0-controller-2 (age 19m) - mgr: oc0-controller-0.xzgtvo(active, since 32m), standbys: oc0-controller-1.mtxohd, oc0-controller-2.ahrgsk - osd: 8 osds: 8 up (since 12m), 8 in (since 18m); 1 remapped pgs - - data: - pools: 1 pools, 1 pgs - objects: 0 objects, 0 B - usage: 43 MiB used, 400 GiB / 400 GiB avail - pgs: 1 active+clean ----- -+ ----- -[ceph: root@oc0-controller-0 /]# ceph orch host ls -HOST ADDR LABELS STATUS -oc0-ceph-0 192.168.24.14 osd -oc0-ceph-1 192.168.24.7 osd -oc0-controller-0 192.168.24.15 _admin mgr mon -oc0-controller-1 192.168.24.23 _admin mgr mon -oc0-controller-2 192.168.24.13 _admin mgr mon ----- - -. Log in to the `controller-0` node, then -//kgilliga: Need more description of what is happening in this step. -+ ----- -cephadm shell -v /home/ceph-admin/specs:/specs ----- - -. Log in to the `ceph-0` node, then -//kgilliga: Need more description of what is happening in this step. -+ ----- -sudo “watch podman ps” # watch the new mon/mgr being deployed here ----- - -. Optional: If mgr is active in the source node, then: -+ ----- -ceph mgr fail ----- - -. From the cephadm shell, remove the labels on `oc0-controller-1`: -+ ----- - for label in mon mgr _admin; do - ceph orch host rm label oc0-controller-1 $label; - done ----- - -. Add the missing labels to `oc0-ceph-0`: -+ ----- -[ceph: root@oc0-controller-0 /]# -> for label in mon mgr _admin; do ceph orch host label add oc0-ceph-0 $label; done -Added label mon to host oc0-ceph-0 -Added label mgr to host oc0-ceph-0 -Added label _admin to host oc0-ceph-0 ----- - -. Drain and force-remove the `oc0-controller-1` node: -+ ----- -[ceph: root@oc0-controller-0 /]# ceph orch host drain oc0-controller-1 -Scheduled to remove the following daemons from host 'oc0-controller-1' -type id --------------------- --------------- -mon oc0-controller-1 -mgr oc0-controller-1.mtxohd -crash oc0-controller-1 ----- -+ ----- -[ceph: root@oc0-controller-0 /]# ceph orch host rm oc0-controller-1 --force -Removed host 'oc0-controller-1' - -[ceph: root@oc0-controller-0 /]# ceph orch host ls -HOST ADDR LABELS STATUS -oc0-ceph-0 192.168.24.14 osd -oc0-ceph-1 192.168.24.7 osd -oc0-controller-0 192.168.24.15 mgr mon _admin -oc0-controller-2 192.168.24.13 _admin mgr mon ----- - -. If you have only 3 mon nodes, and the drain of the node doesn't work as -expected (the containers are still there), then log in to controller-1 and -force-purge the containers in the node: -+ ----- -[root@oc0-controller-1 ~]# sudo podman ps -CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES -ifeval::["{build}" != "downstream"] -5c1ad36472bc quay.io/ceph/daemon@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4 -n mon.oc0-contro... 35 minutes ago Up 35 minutes ago ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mon-oc0-controller-1 -3b14cc7bf4dd quay.io/ceph/daemon@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4 -n mgr.oc0-contro... 35 minutes ago Up 35 minutes ago ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mgr-oc0-controller-1-mtxohd -endif::[] -ifeval::["{build}" == "downstream"] -5c1ad36472bc registry.redhat.io/ceph/rhceph@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4 -n mon.oc0-contro... 
35 minutes ago Up 35 minutes ago ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mon-oc0-controller-1 -3b14cc7bf4dd registry.redhat.io/ceph/rhceph@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4 -n mgr.oc0-contro... 35 minutes ago Up 35 minutes ago ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mgr-oc0-controller-1-mtxohd -endif::[] - -[root@oc0-controller-1 ~]# cephadm rm-cluster --fsid f6ec3ebe-26f7-56c8-985d-eb974e8e08e3 --force - -[root@oc0-controller-1 ~]# sudo podman ps -CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES ----- -+ -[NOTE] -Cephadm rm-cluster on a node that is not part of the cluster anymore has the -effect of removing all the containers and doing some cleanup on the filesystem. - -. Before shutting the oc0-controller-1 down, move the IP address (on the same -network) to the oc0-ceph-0 node: -+ ----- -mon_host = [v2:172.16.11.54:3300/0,v1:172.16.11.54:6789/0] [v2:172.16.11.121:3300/0,v1:172.16.11.121:6789/0] [v2:172.16.11.205:3300/0,v1:172.16.11.205:6789/0] - -[root@oc0-controller-1 ~]# ip -o -4 a -1: lo inet 127.0.0.1/8 scope host lo\ valid_lft forever preferred_lft forever -5: br-ex inet 192.168.24.23/24 brd 192.168.24.255 scope global br-ex\ valid_lft forever preferred_lft forever -6: vlan100 inet 192.168.100.96/24 brd 192.168.100.255 scope global vlan100\ valid_lft forever preferred_lft forever -7: vlan12 inet 172.16.12.154/24 brd 172.16.12.255 scope global vlan12\ valid_lft forever preferred_lft forever -8: vlan11 inet 172.16.11.121/24 brd 172.16.11.255 scope global vlan11\ valid_lft forever preferred_lft forever -9: vlan13 inet 172.16.13.178/24 brd 172.16.13.255 scope global vlan13\ valid_lft forever preferred_lft forever -10: vlan70 inet 172.17.0.23/20 brd 172.17.15.255 scope global vlan70\ valid_lft forever preferred_lft forever -11: vlan1 inet 192.168.24.23/24 brd 192.168.24.255 scope global vlan1\ valid_lft forever preferred_lft forever -12: vlan14 inet 172.16.14.223/24 brd 172.16.14.255 scope global vlan14\ valid_lft forever preferred_lft forever ----- - -. On the oc0-ceph-0, add the IP address of the mon that has been deleted from `controller-0`, and verify that the IP address has been assigned and can be reached: -//kgilliga: Revisit this step. Do we need the [heat-admin @oc0-ceph-0 ~]$ ip -o -4 a] code block? Is that code block an example of the output? 
-+ ----- -$ sudo ip a add 172.16.11.121 dev vlan11 -$ ip -o -4 a ----- -+ ----- -[heat-admin@oc0-ceph-0 ~]$ ip -o -4 a -1: lo inet 127.0.0.1/8 scope host lo\ valid_lft forever preferred_lft forever -5: br-storage inet 192.168.24.14/24 brd 192.168.24.255 scope global br-storage\ valid_lft forever preferred_lft forever -6: vlan1 inet 192.168.24.14/24 brd 192.168.24.255 scope global vlan1\ valid_lft forever preferred_lft forever -7: vlan11 inet 172.16.11.172/24 brd 172.16.11.255 scope global vlan11\ valid_lft forever preferred_lft forever -8: vlan12 inet 172.16.12.46/24 brd 172.16.12.255 scope global vlan12\ valid_lft forever preferred_lft forever -[heat-admin@oc0-ceph-0 ~]$ sudo ip a add 172.16.11.121 dev vlan11 -[heat-admin@oc0-ceph-0 ~]$ ip -o -4 a -1: lo inet 127.0.0.1/8 scope host lo\ valid_lft forever preferred_lft forever -5: br-storage inet 192.168.24.14/24 brd 192.168.24.255 scope global br-storage\ valid_lft forever preferred_lft forever -6: vlan1 inet 192.168.24.14/24 brd 192.168.24.255 scope global vlan1\ valid_lft forever preferred_lft forever -7: vlan11 inet 172.16.11.172/24 brd 172.16.11.255 scope global vlan11\ valid_lft forever preferred_lft forever -7: vlan11 inet 172.16.11.121/32 scope global vlan11\ valid_lft forever preferred_lft forever -8: vlan12 inet 172.16.12.46/24 brd 172.16.12.255 scope global vlan12\ valid_lft forever preferred_lft forever ----- - -. Optional: Power off oc0-controller-1. -//kgilliga: What is the reason for powering off the controller (or not)? - -. Add the new mon on oc0-ceph-0 using the old IP address: -+ ----- -[ceph: root@oc0-controller-0 /]# ceph orch daemon add mon oc0-ceph-0:172.16.11.121 -Deployed mon.oc0-ceph-0 on host 'oc0-ceph-0' ----- - -. Check the new container in the oc0-ceph-0 node: -+ ----- -ifeval::["{build}" != "downstream"] -b581dc8bbb78 quay.io/ceph/daemon@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4 -n mon.oc0-ceph-0... 24 seconds ago Up 24 seconds ago ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mon-oc0-ceph-0 -endif::[] -ifeval::["{build}" == "downstream"] -b581dc8bbb78 registry.redhat.io/ceph/rhceph@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4 -n mon.oc0-ceph-0... 24 seconds ago Up 24 seconds ago ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mon-oc0-ceph-0 -endif::[] ----- - -. On the cephadm shell, backup the existing ceph_spec.yaml, edit the spec -removing any oc0-controller-1 entry, and replacing it with oc0-ceph-0: -+ ----- -cp ceph_spec.yaml ceph_spec.yaml.bkp # backup the ceph_spec.yaml file - -[ceph: root@oc0-controller-0 specs]# diff -u ceph_spec.yaml.bkp ceph_spec.yaml - ---- ceph_spec.yaml.bkp 2022-07-29 15:41:34.516329643 +0000 -+++ ceph_spec.yaml 2022-07-29 15:28:26.455329643 +0000 -@@ -7,14 +7,6 @@ - - mgr - service_type: host - --- --addr: 192.168.24.12 --hostname: oc0-controller-1 --labels: --- _admin --- mon --- mgr --service_type: host - ---- - addr: 192.168.24.19 - hostname: oc0-controller-2 - labels: -@@ -38,7 +30,7 @@ - placement: - hosts: - - oc0-controller-0 -- - oc0-controller-1 -+ - oc0-ceph-0 - - oc0-controller-2 - service_id: mon - service_name: mon -@@ -47,8 +39,8 @@ - placement: - hosts: - - oc0-controller-0 -- - oc0-controller-1 - - oc0-controller-2 -+ - oc0-ceph-0 - service_id: mgr - service_name: mgr - service_type: mgr ----- - -. 
Apply the resulting spec: -+ ----- -ceph orch apply -i ceph_spec.yaml - - The result of 12 is having a new mgr deployed on the oc0-ceph-0 node, and the spec reconciled within cephadm - -[ceph: root@oc0-controller-0 specs]# ceph orch ls -NAME PORTS RUNNING REFRESHED AGE PLACEMENT -crash 4/4 5m ago 61m * -mgr 3/3 5m ago 69s oc0-controller-0;oc0-ceph-0;oc0-controller-2 -mon 3/3 5m ago 70s oc0-controller-0;oc0-ceph-0;oc0-controller-2 -osd.default_drive_group 8 2m ago 69s oc0-ceph-0;oc0-ceph-1 - -[ceph: root@oc0-controller-0 specs]# ceph -s - cluster: - id: f6ec3ebe-26f7-56c8-985d-eb974e8e08e3 - health: HEALTH_WARN - 1 stray host(s) with 1 daemon(s) not managed by cephadm - - services: - mon: 3 daemons, quorum oc0-controller-0,oc0-controller-2,oc0-ceph-0 (age 5m) - mgr: oc0-controller-0.xzgtvo(active, since 62m), standbys: oc0-controller-2.ahrgsk, oc0-ceph-0.hccsbb - osd: 8 osds: 8 up (since 42m), 8 in (since 49m); 1 remapped pgs - - data: - pools: 1 pools, 1 pgs - objects: 0 objects, 0 B - usage: 43 MiB used, 400 GiB / 400 GiB avail - pgs: 1 active+clean ----- - -. Fix the warning by refreshing the mgr: -+ ----- -ceph mgr fail oc0-controller-0.xzgtvo ----- -+ -At this point the {CephCluster} cluster is clean: -+ ----- -[ceph: root@oc0-controller-0 specs]# ceph -s - cluster: - id: f6ec3ebe-26f7-56c8-985d-eb974e8e08e3 - health: HEALTH_OK - - services: - mon: 3 daemons, quorum oc0-controller-0,oc0-controller-2,oc0-ceph-0 (age 7m) - mgr: oc0-controller-2.ahrgsk(active, since 25s), standbys: oc0-controller-0.xzgtvo, oc0-ceph-0.hccsbb - osd: 8 osds: 8 up (since 44m), 8 in (since 50m); 1 remapped pgs - - data: - pools: 1 pools, 1 pgs - objects: 0 objects, 0 B - usage: 43 MiB used, 400 GiB / 400 GiB avail - pgs: 1 active+clean ----- -+ -The `oc0-controller-1` is removed and powered off without leaving traces on the {CephCluster} cluster. - -. Repeat this procedure for additional Controller nodes in your environment until you have migrated all the Ceph Mon and Ceph Manager daemons to the target nodes. - - - diff --git a/docs_user/modules/proc_migrating-mon-from-controller-nodes.adoc b/docs_user/modules/proc_migrating-mon-from-controller-nodes.adoc new file mode 100644 index 000000000..27e1274c2 --- /dev/null +++ b/docs_user/modules/proc_migrating-mon-from-controller-nodes.adoc @@ -0,0 +1,411 @@ +[id="migrating-mon-from-controller-nodes_{context}"] + += Migrating Ceph Monitor and Ceph Manager daemons to {Ceph} nodes + +The following section describes how to move Ceph Monitor daemons from the +OpenStack controller nodes to a set of target nodes. Target nodes might be +pre-existing {Ceph} nodes, or OpenStack Compute nodes if Ceph is deployed by +{OpenStackPreviousInstaller} with an HCI topology. + +.Prerequisites + +Configure the target nodes (CephStorage or ComputeHCI) to have both `storage` +and `storage_mgmt` networks to ensure that you can use both {Ceph} public and +cluster networks from the same node. This step requires you to interact with +{OpenStackPreviousInstaller}. From {rhos_prev_long} {rhos_prev_ver} and later +you do not have to run a stack update. However, there are commands that you +must perform to run `os-net-config` on the bare metal node and configure +additional networks. + +.. 
+.. If target nodes are `CephStorage`, ensure that the network is defined in the
+`metalsmith.yaml` for the CephStorageNodes:
++
+[source,yaml]
+----
+  - name: CephStorage
+    count: 2
+    instances:
+      - hostname: oc0-ceph-0
+        name: oc0-ceph-0
+      - hostname: oc0-ceph-1
+        name: oc0-ceph-1
+    defaults:
+      networks:
+        - network: ctlplane
+          vif: true
+        - network: storage_cloud_0
+          subnet: storage_cloud_0_subnet
+        - network: storage_mgmt_cloud_0
+          subnet: storage_mgmt_cloud_0_subnet
+      network_config:
+        template: templates/single_nic_vlans/single_nic_vlans_storage.j2
+----
+
+.. Run the following command:
++
+----
+openstack overcloud node provision \
+  -o overcloud-baremetal-deployed-0.yaml --stack overcloud-0 \
+  --network-config -y --concurrency 2 /home/stack/metalsmith-0.yaml
+----
+
+.. Verify that the storage network is configured on the target nodes:
++
+----
+(undercloud) [stack@undercloud ~]$ ssh heat-admin@192.168.24.14 ip -o -4 a
+1: lo inet 127.0.0.1/8 scope host lo\ valid_lft forever preferred_lft forever
+5: br-storage inet 192.168.24.14/24 brd 192.168.24.255 scope global br-storage\ valid_lft forever preferred_lft forever
+6: vlan1 inet 192.168.24.14/24 brd 192.168.24.255 scope global vlan1\ valid_lft forever preferred_lft forever
+7: vlan11 inet 172.16.11.172/24 brd 172.16.11.255 scope global vlan11\ valid_lft forever preferred_lft forever
+8: vlan12 inet 172.16.12.46/24 brd 172.16.12.255 scope global vlan12\ valid_lft forever preferred_lft forever
+----
+
+.Procedure
+
+This procedure assumes that some of the steps are run on the source node that
+you want to decommission, while other steps are run on the target node that is
+supposed to host the redeployed daemon.
+
+. Before starting the migration, SSH into the target node and enable the
+firewall rules that are required to reach a Mon service:
++
+----
+for port in 3300 6789; do
+    ssh heat-admin@<target_node> sudo iptables -I INPUT \
+    -p tcp -m tcp --dport $port -m conntrack --ctstate NEW \
+    -j ACCEPT;
+done
+----
+
+- Replace `<target_node>` with the hostname of the node that is supposed to
+host the new mon.
+
+. Check that the rules are properly applied and persist them:
++
+----
+sudo iptables-save
+sudo systemctl restart iptables
+----
+
+. To migrate the existing Mons to the target {Ceph} nodes, create the following
+{Ceph} spec from the first mon (or the first controller) and modify the
+placement so that it is based on the `mon` label:
++
+[source,yaml]
+----
+service_type: mon
+service_id: mon
+placement:
+  label: mon
+----
+
+. Save the spec in `/tmp/mon.yaml`.
+. Apply the spec with cephadm using the orchestrator:
++
+----
+sudo cephadm shell -m /tmp/mon.yaml
+ceph orch apply -i /mnt/mon.yaml
+----
++
+[NOTE]
+The effect of applying the `mon.yaml` spec is to normalize the existing
+placement strategy to use `labels` instead of `hosts`. By doing this, any node
+with the `mon` label is able to host a Ceph mon daemon.
+This step only needs to be executed once; do not repeat it when you iterate
+over this procedure to migrate multiple Ceph Mons.
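+
+To confirm that the reconciled `mon` spec is now scheduled by label, you can
+export it again and inspect the `placement` section. This is an optional check;
+the output below is only an indicative sketch and might differ slightly
+depending on the Ceph release:
+
+----
+$ sudo cephadm shell -- ceph orch ls --export mon
+service_type: mon
+service_name: mon
+placement:
+  label: mon
+----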
+. Before moving to the next step, check the status of the {CephCluster} cluster
+and the orchestrator daemons list: make sure that the three mons are in quorum
+and listed by the `ceph orch` command:
++
+----
+# ceph -s
+  cluster:
+    id: f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
+    health: HEALTH_OK
+
+  services:
+    mon: 3 daemons, quorum oc0-controller-0,oc0-controller-1,oc0-controller-2 (age 19m)
+    mgr: oc0-controller-0.xzgtvo(active, since 32m), standbys: oc0-controller-1.mtxohd, oc0-controller-2.ahrgsk
+    osd: 8 osds: 8 up (since 12m), 8 in (since 18m); 1 remapped pgs
+
+  data:
+    pools:   1 pools, 1 pgs
+    objects: 0 objects, 0 B
+    usage:   43 MiB used, 400 GiB / 400 GiB avail
+    pgs:     1 active+clean
+----
++
+----
+[ceph: root@oc0-controller-0 /]# ceph orch host ls
+HOST              ADDR           LABELS          STATUS
+oc0-ceph-0        192.168.24.14  osd
+oc0-ceph-1        192.168.24.7   osd
+oc0-controller-0  192.168.24.15  _admin mgr mon
+oc0-controller-1  192.168.24.23  _admin mgr mon
+oc0-controller-2  192.168.24.13  _admin mgr mon
+----
+
+. On the source node, back up the `/etc/ceph/` directory. In case of issues,
+this preserves the ability to execute cephadm and to get a shell to the Ceph
+cluster from the source node:
++
+----
+mkdir -p $HOME/ceph_client_backup
+sudo cp -R /etc/ceph $HOME/ceph_client_backup
+----
+
+. Before draining the source node and relocating the IP address of the storage
+network to the target node, fail the mgr if it is active on the source node:
++
+----
+ceph mgr fail
+----
+
+. Drain the source node and start the mon migration. From the cephadm shell,
+remove the labels from the source node:
++
+----
+for label in mon mgr _admin; do
+    ceph orch host label rm <source_node> $label;
+done
+----
+
+. Remove the running mon daemon from the source node:
++
+----
+cephadm shell -- ceph orch daemon rm mon.<source_node> --force
+----
+
+. Run the drain command:
++
+----
+cephadm shell -- ceph orch host drain <source_node>
+----
+
+. Remove the source node host from the {CephCluster} cluster:
++
+----
+cephadm shell -- ceph orch host rm <source_node> --force
+----
+
+- Replace `<source_node>` with the hostname of the source node.
+
+The source node is no longer part of the cluster, and shouldn't appear in the
+Ceph host list when you run `cephadm shell -- ceph orch host ls`.
+However, `sudo podman ps` on the source node might show that both the mon and
+the mgr are still running.
+
+----
+[root@oc0-controller-1 ~]# sudo podman ps
+CONTAINER ID  IMAGE  COMMAND  CREATED  STATUS  PORTS  NAMES
+ifeval::["{build}" != "downstream"]
+5c1ad36472bc  quay.io/ceph/daemon@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mon.oc0-contro...  35 minutes ago  Up 35 minutes ago  ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mon-oc0-controller-1
+3b14cc7bf4dd  quay.io/ceph/daemon@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mgr.oc0-contro...  35 minutes ago  Up 35 minutes ago  ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mgr-oc0-controller-1-mtxohd
+endif::[]
+ifeval::["{build}" == "downstream"]
+5c1ad36472bc  registry.redhat.io/ceph/rhceph@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mon.oc0-contro...  35 minutes ago  Up 35 minutes ago  ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mon-oc0-controller-1
+3b14cc7bf4dd  registry.redhat.io/ceph/rhceph@sha256:320c364dcc8fc8120e2a42f54eb39ecdba12401a2546763b7bef15b02ce93bc4  -n mgr.oc0-contro...  35 minutes ago  Up 35 minutes ago  ceph-f6ec3ebe-26f7-56c8-985d-eb974e8e08e3-mgr-oc0-controller-1-mtxohd
+endif::[]
+----
+
+To clean up the source node before moving to the next phase, remove the
+existing containers and the cephadm-related data from the node, as shown in the
+sketch below.
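+
+The exact cleanup steps depend on what is left on the node; a minimal sketch,
+based on the `cephadm rm-cluster` command used elsewhere in this guide and
+assuming the example FSID shown above, is the following. Use it with care,
+because it wipes all Ceph data from the node it is run on:
+
+----
+# Run on the source node only after it has been removed from the cluster.
+# Replace the FSID with the value reported by `ceph -s`.
+sudo cephadm rm-cluster --fsid f6ec3ebe-26f7-56c8-985d-eb974e8e08e3 --force
+# Confirm that no Ceph containers are left on the node.
+sudo podman ps
+----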
+// fpantano: there's an automated procedure run through cephadm but it's too
+// risky. If the user doesn't perform it properly the cluster can be affected.
+// We can put a downstream comment to contact the RH support to clean the source
+// node up in case of leftovers, and open a bug for cephadm.
+
+//. ssh into one of the existing Ceph mons (usually controller-1 or controller-2)
+. Prepare the target node to host the new mon and add the `mon` label to the
+target node:
++
+----
+for label in mon mgr _admin; do
+    ceph orch host label add <target_node> $label;
+done
+----
+
+- Replace `<target_node>` with the hostname of the host listed in the
+{CephCluster} cluster through the `ceph orch host ls` command.
+
+At this point the cluster is running with only two mons, but a third mon is
+scheduled and will be deployed on the target node.
+However, the third mon might be deployed on a different IP address that is
+available on the node, and you need to redeploy it when the IP migration is
+concluded.
+Even though the mon is temporarily deployed on the wrong IP address, keeping
+the quorum at three ensures that you do not risk losing the cluster because
+two mons end up in a split-brain situation.
+
+. Confirm that the cluster has three mons and that they are in quorum:
++
+----
+cephadm shell -- ceph -s
+cephadm shell -- ceph orch ps | grep -i mon
+----
+
+It is now possible to migrate the original mon IP address to the target node
+and redeploy the existing mon on it.
+The following IP address migration procedure assumes that the target nodes have
+been originally deployed by {OpenStackPreviousInstaller} and that the network
+configuration is managed by `os-net-config`.
+
+// NOTE (fpantano): we need to document the same ip address migration procedure
+// w/ an EDPM node that has already been adopted.
+
+. Get the mon IP address from the existing `/etc/ceph/ceph.conf` (check the
+`mon_host` line), for example:
++
+----
+mon_host = [v2:172.17.3.60:3300/0,v1:172.17.3.60:6789/0] [v2:172.17.3.29:3300/0,v1:172.17.3.29:6789/0] [v2:172.17.3.53:3300/0,v1:172.17.3.53:6789/0]
+----
+
+. Confirm that the mon IP address is present in the source node `os-net-config`
+configuration, which is located in `/etc/os-net-config`:
++
+----
+[tripleo-admin@controller-0 ~]$ grep "172.17.3.60" /etc/os-net-config/config.yaml
+    - ip_netmask: 172.17.3.60/24
+----
+
+. Edit the config file `/etc/os-net-config/config.yaml` and remove the
+`ip_netmask` line retrieved in the previous step.
+
+. Save the file and refresh the node network configuration with the following
+command:
++
+----
+sudo os-net-config -c /etc/os-net-config/config.yaml
+----
+
+. Verify, using the `ip` command, that the IP address is not present on the
+source node anymore.
+
+. SSH into the target node, for example `cephstorage-0`, and add the IP address
+that will be bound to the new mon.
+
+. On the target node, edit the config file `/etc/os-net-config/config.yaml` and
+add the `- ip_netmask: 172.17.3.60/24` line that you removed from the source
+node.
+
+. Save the file and refresh the node network configuration with the following
+command:
++
+----
+sudo os-net-config -c /etc/os-net-config/config.yaml
+----
+
+. Verify, using the `ip` command, that the IP address is present on the target
+node.
+
+Get the Ceph spec and set the mon daemons to `unmanaged`:
+
+. Get the Ceph mon spec:
++
+----
+ceph orch ls --export mon > mon.yaml
+----
+
+. Edit the retrieved spec and add the `unmanaged: true` keyword:
++
+[source,yaml]
+----
+service_type: mon
+service_id: mon
+placement:
+  label: mon
+unmanaged: true
+----
+
+. Save the spec in `/tmp/mon.yaml`.
+. Apply the spec with cephadm using the orchestrator:
++
+----
+sudo cephadm shell -m /tmp/mon.yaml
+ceph orch apply -i /mnt/mon.yaml
+----
+
+The mon daemons are marked as `unmanaged`, and it is now possible to redeploy
+the existing daemon and bind it to the migrated IP address.
+
+. Delete the existing mon on the target node:
++
+----
+$ ceph orch daemon rm mon.<target_node> --force
+----
+
+. Redeploy the new mon on the target node by using the old IP address:
++
+----
+$ ceph orch daemon add mon <target_node>:<ip_address>
+----
+
+- Replace `<target_node>` with the hostname of the target node that is enrolled
+in the Ceph cluster.
+- Replace `<ip_address>` with the IP address that you migrated to the target
+node.
+
+Get the Ceph spec and set the mon daemons to `unmanaged: false`:
+
+. Get the Ceph mon spec:
++
+----
+ceph orch ls --export mon > mon.yaml
+----
+
+. Edit the retrieved spec and set the `unmanaged` keyword to `false`:
++
+[source,yaml]
+----
+service_type: mon
+service_id: mon
+placement:
+  label: mon
+unmanaged: false
+----
+
+. Save the spec in `/tmp/mon.yaml`.
+. Apply the spec with cephadm using the orchestrator:
++
+----
+sudo cephadm shell -m /tmp/mon.yaml
+ceph orch apply -i /mnt/mon.yaml
+----
+
+The new mon runs on the target node with the original IP address.
+As the last step of the mon migration, refresh the cephadm information and
+reconfigure the existing daemons so that they exchange the map with the updated
+mon references.
+
+. Identify the running `mgr`:
++
+----
+sudo cephadm shell -- ceph -s
+----
+
+. Refresh the mgr information by force-failing it:
++
+----
+ceph mgr fail
+----
+
+. Refresh the `OSD` information:
++
+----
+ceph orch reconfig osd.default_drive_group
+----
+
+. Verify that at this point the {CephCluster} cluster is healthy:
++
+----
+[ceph: root@oc0-controller-0 specs]# ceph -s
+  cluster:
+    id: f6ec3ebe-26f7-56c8-985d-eb974e8e08e3
+    health: HEALTH_OK
+...
+...
+----
+
+. Repeat the procedure described in this section for any additional Controller
+node that hosts a mon until you have migrated all the Ceph Mon daemons to the
+target nodes.
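+
+After the last iteration, a quick way to confirm that no mon or mgr daemons are
+left on the Controller nodes is to rerun the commands already used in this
+procedure from one of the `_admin` nodes. This is an optional sanity check, and
+the exact output depends on your environment:
+
+----
+sudo cephadm shell -- ceph orch host ls
+sudo cephadm shell -- ceph orch ps | grep -iE "mon|mgr"
+sudo cephadm shell -- ceph -s
+----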