
Rework Ceph RBD migration documentation #461

Merged: 3 commits from the ceph_doc branch merged into openstack-k8s-operators:main on Jun 7, 2024

Conversation

fmount (Contributor) commented May 15, 2024

This patch represents a rework of the current RBD documentation to move it from being a POC to a procedure that we can follow and test in CI.
In particular:

  • we split the RBD procedure into separate Ceph Mgr and Ceph Mon migrations
  • we rework both the MGR and Mon docs to more closely match the procedures that the user should follow

fmount requested review from klgill and jistr on May 15, 2024 11:39
fmount (Contributor, Author) commented May 15, 2024

@klgill this is still WIP, but let's start looking at the procedure and fixing the style (or build a todo list for a follow-up).
Let's check the preview first to see how it looks.

fmount force-pushed the ceph_doc branch 4 times, most recently from b6132d1 to f3792d6 on May 15, 2024 12:06
fmount (Contributor, Author) commented May 15, 2024

Adding /hold to check and review.
We need an intro for the Ceph procedure, and the cardinality section should be a separate one (it's not tied to a particular daemon's migration).


Merge Failed.

This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
Warning:
Error merging github.com/openstack-k8s-operators/data-plane-adoption for 461,b8c1534e65c2764af3b803cf21d3c99eecb6ec0a

Inline review thread on the following lines:

include::assemblies/assembly_migrating-the-object-storage-service.adoc[leveloffset=+1]

include::assemblies/assembly_migrating-ceph-cluster.adoc[leveloffset=+1]
fmount (Contributor, Author) commented May 17, 2024

@klgill so the idea here is to have an entrypoint for the main ceph doc, and from there we can include all the sections related to the components:

main osp doc
 -> include ceph
     -> include cardinality
     -> include monitoring
     -> include mds
     -> include rgw
     -> include mgr
     -> include mon

Each of them should be an assembly where we can add an intro and include a module with the associated procedure, if there is one (cardinality, for example, is critical but has no procedure: it's all about considerations when you plan to migrate your daemons).
If we follow this kind of pattern, the result is clean and easy to maintain.
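
For illustration, a minimal sketch of what that entrypoint assembly could look like in AsciiDoc; the file names below are hypothetical (only assembly_migrating-ceph-cluster.adoc appears in this PR), and the includes mirror the tree above:

  // assemblies/assembly_migrating-ceph-cluster.adoc (hypothetical layout)
  = Migrating the Ceph cluster

  // short intro for the overall migration goes here, followed by the
  // ordered list of per-component assemblies
  include::assembly_ceph-cardinality.adoc[leveloffset=+1]
  include::assembly_migrating-ceph-monitoring.adoc[leveloffset=+1]
  include::assembly_migrating-ceph-mds.adoc[leveloffset=+1]
  include::assembly_migrating-ceph-rgw.adoc[leveloffset=+1]
  include::assembly_migrating-ceph-mgr.adoc[leveloffset=+1]
  include::assembly_migrating-ceph-mon.adoc[leveloffset=+1]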

fmount (Contributor, Author) commented:

We can plan and discuss further updates in a follow-up, but the main idea is to have something that is easy to follow from the customer's point of view.

fmount force-pushed the ceph_doc branch 2 times, most recently from 8af21e3 to 97458fe on May 20, 2024 10:51
fmount (Contributor, Author) commented May 22, 2024

@klgill
klgill (Contributor) commented May 28, 2024

@fmount FYI I'm going to try to finish my review by EOD Tuesday (US EDT).

fmount (Contributor, Author) commented May 28, 2024

> @fmount FYI I'm going to try to finish my review by EOD Tuesday (US EDT).

Hi @klgill, no problem at all, and thank you for the help with this! Take all the time you need and I'll follow your suggestions while you review the Ceph doc!

klgill (Contributor) commented May 30, 2024

@fmount I added some minor changes. After those changes are committed, this iteration can be merged.
Note that in the preview, the numbering in the "Migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes" procedure is out of sequence. However, I think we need to discuss breaking up that procedure in another PR because it is very long.
Thanks for your work on this chapter!

fmount (Contributor, Author) commented May 30, 2024

> @fmount I added some minor changes. After those changes are committed, this iteration can be merged. Note that in the preview, the numbering in the "Migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes" procedure is out of sequence. However, I think we need to discuss breaking up that procedure in another PR because it is very long. Thanks for your work on this chapter!

Thank you for the multiple iterations over this text. I agree we should have dedicated patches where we review specific components; RBD is big enough and not easy to review.
Hopefully we now have a better structure and procedure compared to the previous version!

fmount requested a review from klgill on May 31, 2024 20:50
klgill (Contributor) commented Jun 3, 2024

@fmount This LGTM except for the numbering in the "Migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes" procedure. Currently, that procedure is more than 30 steps total. It should be split into probably 3 procedures, something like the following:

  • Migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes
    • Decommissioning the source node
    • Configuring the target node to host the redeployed daemon
    • Applying the ceph mon spec to new nodes ??

Do you want to discuss this further and split this chapter in a separate PR?

fmount force-pushed the ceph_doc branch 2 times, most recently from 218e230 to 8755719 on June 3, 2024 16:24
fmount (Contributor, Author) commented Jun 3, 2024

> @fmount This LGTM except for the numbering in the "Migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes" procedure. Currently, that procedure is more than 30 steps total. It should be split into probably 3 procedures, something like the following:
>
> • Migrating Ceph Monitor daemons to Red Hat Ceph Storage nodes
>   • Decommissioning the source node
>   • Configuring the target node to host the redeployed daemon
>   • Applying the ceph mon spec to new nodes ??
>
> Do you want to discuss this further and split this chapter in a separate PR?

Hi @klgill, yes, I fixed the monitoring stack numbering, so this patch should be good to go now.
I think we can follow up on the other parts with separate patches and discuss how to split them for better maintenance!
The above plan seems interesting; we have something similar in the ansible playbook:

  1. drain the src node
  2. move the ip address to the target node
  3. redeploy the mon

We might want to use that kind of structure; I'll follow up on that.
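
For illustration only, those three steps map roughly onto cephadm/ceph orch operations like the following sketch; the hostnames, IP address, and interface name are hypothetical placeholders, and the actual playbook may differ:

  # 1. Drain the source node so cephadm removes the daemons scheduled on it
  ceph orch host drain <source-node>

  # 2. Move the Mon IP address to the target node (in practice this is
  #    handled by the node's network configuration rather than by hand)
  ssh <target-node> sudo ip addr add 172.16.11.121/24 dev <storage-iface>

  # 3. Redeploy the mon on the target node, bound to the migrated IP
  ceph orch daemon add mon <target-node>:172.16.11.121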

fmount (Contributor, Author) commented Jun 5, 2024

@klgill @jistr we should be ready to go on this one now, just rebased.

klgill (Contributor) commented Jun 6, 2024

I can merge this if it's ready. @jistr Do you have additional comments?

fmount added 2 commits June 7, 2024 10:01
This patch represents a rework of the current RBD documentation to move
it from a POC to a procedure that we can test in CI.
In particular:
- the procedure is split between Ceph Mgr and Ceph Mons migration
- Ceph MGR and Mon docs are more similar to procedures that
  the user should follow
- the order is fixed as rbd should be last

Signed-off-by: Francesco Pantano <[email protected]>

Ceph assemblies are now better reorganized to follow a simple rule/struct.
A main ceph-cluster migration assembly is included in main, and it contains
a quick intro and the (ordered) list of procedures (including the cardinality
section that is critical here and will be improved in a follow up patch).
This way the ceph doc is very easy to access and maintain. There are also
fixes to wrong references (e.g. horizon != Ceph dashboard).

Signed-off-by: Francesco Pantano <[email protected]>
klgill merged commit bb5aef5 into openstack-k8s-operators:main on Jun 7, 2024
3 checks passed