Can the CSI RBD plugin catch up in terms of performance to native CLI calls? #449
First test: a comparison of numbers generated using PR #443 against the script below, which generates the exact nature and order of calls as the plugin, is provided in this comment.

Script to mimic what the plugin does as of commit hash

Note: The script may be a bit rough, but it works; execute a create like so

Note: The following numbers were generated on the same setup as the numbers in this comment. Further, the script was executed from within the container csi-rbdplugin in the pod csi-rbdplugin-provisioner-0, to keep things as close to the plugin as possible.

Script times:

Further, the script was modified to not run the create/delete in parallel and instead do a serial create/delete (just for a baseline); the following are the results from the same.

Observations:
RBD deletes are highly dependent on the image size, since a delete requires issuing RADOS deletes for every possible backing object (regardless of whether or not they actually exist). Therefore, a 1 GiB image deletion could be an order of magnitude faster than a 10 GiB one. Longer term, that is why I am proposing moving long-running operations to a new MGR call so that the CSI can "fire and forget". In terms of creation, I think that if you re-implemented your script using the rados/rbd Python bindings, you would see a huge drop in time. This just brings us back to the previous talks about eventually replacing all CLI calls with golang API binding calls. For larger clusters, just the bootstrap time required for each CLI call (connect to MON, exchange keys, pull maps, etc.) can be the vast majority of the runtime.
With a golang program ceph-golib.go.txt that uses go-ceph on a vagrant-based kube+Rook(ceph) setup on a laptop, the following measures were taken.

NOTE: timing is coarse as it is based on

Test details: 3 runs per test, each run being an iteration of 25 creates or deletes (IOW, 25 RBD images created and eventually deleted, including their associated RADOS OMap updates).

go-ceph based times for 3 runs:

using the script as provided earlier invoking the Ceph CLIs
On the same setup as the above comment, here are some more times based on parallel invocation of creates and deletes from the golang ceph-golib.go version of the program.

NOTE: All tests are run for 2 iterations of 25 objects (PVCs/creates/deletes), and the average per run is mentioned below.

PVC creates in parallel when using CSI drivers (that invoke the various CLIs):

golib program in parallel:

Script in parallel:

NOTE: Further tests would be based on testing in a real cluster, to understand the contributing factor of PV create/delete/attach times, and to really understand if improving this aspect would yield the best value for the effort.
I have modified the ceph-csi code to use

Tested in kube; I was able to create and delete 100 RBD PVCs in less than 2 minutes.

Note: This is parallel PVC operation.
@Madhu-1 is it an improvement over running it without the
using

I don't see any performance improvement; we need to do the same testing in other setups to check whether we are gaining any performance improvement from using the ceph-go library. cirros-create-with-pvc-without-io (copy).txt
Update: Need to do some further benchmarking on a non-laptop setup, and also using gRPC, to provide data on how this can help CSI operations when operating on parallel requests.
@Madhu-1 @ShyamsundarR if we are seeing a performance improvement on parallel requests, and also if we think the ceph go client is stable enough for at least the happy path of create and delete Volume, let's bring it into the repo. I am fine with the same.
Completed the test and here are the results: Legend:
1) Test results with existing code and added Prometheus gRPC metrics:

Actions in parallel: n = 25
NodePublishVolume: 1.27/10 + 0.76/7 + 0.73/8 = 0.11

Actions in parallel: n = 1

Actions in parallel: n = 10

NOTE: to test improvements

2) Test results using this code, that uses ceph-go bindings for just the happy path in
could not test with the above-mentioned code; facing an issue in PVC create
@ShyamsundarR I have created a new image as per our discussion on reusing the connection. I don't see any major overall performance improvement on 100 PVC creations; it took around 57 seconds to bind all PVCs.
The initial
Can you please share the code and the test? Also, did you measure gRPC metrics for the same, pre- and post-change tests?
@Madhu-1, @ShyamsundarR - any updates?
@mykaul It was decided to pick up this work post the 1.2.0 release, for the following reasons:
@nixpanic was interested in taking this forward, he may have further comments to provide.
@nixpanic assigning this task to you as per the discussion. This is a very high priority item for
@nixpanic can you please update this issue with the PR or WIP patch?
@nixpanic as the PR (ceph/go-ceph#111) is merged now, maybe we could make some more progress here :). Thanks!
Older versions of Ceph's librbd do not support fetching the watchers on an image. That has been introduced with Mimic (not Nautilus, as mentioned in earlier comments). The

So, the question is: does Ceph-CSI want to keep support for older Ceph versions (Luminous), or can we move on to a newer version as the minimal dependency? There is a way to implement a fallback in ceph-csi; for the functionality missing in the libraries, it can still call the
This is the initial step for improving performance during provisioning of CSI volumes backed by RBD. While creating a volume, an existing connection to the Ceph cluster is used from the ConnPool. This should speed up the creation of a batch of volumes significantly. Updates: ceph#449 Signed-off-by: Niels de Vos <[email protected]>
I think we should move on.
@nixpanic as discussed, we want to support Mimic+ versions in ceph-csi; the ceph-csi support matrix is here: https://github.com/ceph/ceph-csi#ceph-csi-features-and-available-versions
There are many things that need to be done for this issue. I have created https://github.com/ceph/ceph-csi/projects/3 so that tracking the dependencies is a little easier. Don't hesitate to add more issues/PRs/cards to the project.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.
As a part of the discussion in PR #443 it was noted by @dillaman that native CLI calls outperform the plugin by a large factor.
As part of the discussion I noted that the plugin does more work than just an image create, and that there would be Kubernetes factors to consider.
As a result this issue is opened to help with the analysis of native calls versus calls made by the plugin to understand and maybe improve the performance of the plugin.
Various experiments may need to be conducted to arrive at an answer, and this issue can hopefully help track the progress.