
Convert journal package omap to use go ceph #1071

Merged
merged 10 commits into ceph:master from jjm-journal-omap-go-ceph on Jun 22, 2020

Conversation

phlogistonjohn
Contributor

@phlogistonjohn phlogistonjohn commented May 15, 2020

Describe what this PR does

Change the way the journal package interacts with omaps by switching from CLI-based functions to the API bindings of go-ceph. The first few patches do basically one-to-one switch-overs of the calls used in voljournal.go. Subsequent patches change the internal interfaces to use the natural batching that the API calls give us, allowing multiple keys to be set, fetched, or removed in a single API call.
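
For context, here is a minimal, self-contained sketch of the kind of batched omap calls go-ceph exposes; the pool, object, and key names below are illustrative assumptions, not the values used by ceph-csi:

```go
// Sketch only: batched omap access with go-ceph instead of per-key CLI calls.
package main

import (
	"fmt"

	"github.com/ceph/go-ceph/rados"
)

func main() {
	conn, err := rados.NewConn()
	if err != nil {
		panic(err)
	}
	if err := conn.ReadDefaultConfigFile(); err != nil {
		panic(err)
	}
	if err := conn.Connect(); err != nil {
		panic(err)
	}
	defer conn.Shutdown()

	// Pool and object names here are made up for illustration.
	ioctx, err := conn.OpenIOContext("replicapool")
	if err != nil {
		panic(err)
	}
	defer ioctx.Destroy()
	oid := "csi.volumes.default"

	// Set several keys in a single API call.
	pairs := map[string][]byte{
		"csi.volume.pvc-1234": []byte("uuid-1"),
		"csi.volume.pvc-5678": []byte("uuid-2"),
	}
	if err := ioctx.SetOmap(oid, pairs); err != nil {
		panic(err)
	}

	// Fetch all keys with the common "csi." prefix in one round trip.
	err = ioctx.GetOmapValues(oid, "", "csi.", 64, func(key string, value []byte) {
		fmt.Printf("%s = %s\n", key, value)
	})
	if err != nil {
		panic(err)
	}

	// Remove a batch of keys in one call as well.
	keys := []string{"csi.volume.pvc-1234", "csi.volume.pvc-5678"}
	if err := ioctx.RmOmapKeys(oid, keys); err != nil {
		panic(err)
	}
}
```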

Is there anything that requires special attention

Do you have any questions? n/a

Is the change backward compatible?
In theory it should behave exactly as before.

Are there concerns around backward compatibility?
Now that CI is passing, there shouldn't be, but any missed changes can be handled as fixes.

Related issues

Fixes: #434

Future concerns

This series does not convert everything in the journal package to go-ceph; there's more to be done here.
I did not clean up any functions in util that are now unused. I plan to do that in a follow-up PR.

@nixpanic nixpanic added the component/util Utility functions shared between CephFS and RBD label May 18, 2020
@phlogistonjohn phlogistonjohn force-pushed the jjm-journal-omap-go-ceph branch 4 times, most recently from 180f674 to e92d6df on May 21, 2020 20:03
@phlogistonjohn phlogistonjohn force-pushed the jjm-journal-omap-go-ceph branch 5 times, most recently from 60e1941 to 78cedb4 on June 3, 2020 14:21
@phlogistonjohn phlogistonjohn force-pushed the jjm-journal-omap-go-ceph branch 2 times, most recently from 7c4b402 to 0ba9363 on June 7, 2020 16:34
@phlogistonjohn
Contributor Author

Can someone who is more familiar with these e2e tests help me out? I see that the test fails and times out waiting for a condition, but the output does not make clear which condition. If it refers to the lines right above, I don't quite see how this PR would interact with that (node labeling).

I've (lightly) tested this by hand now for rbd and it's working locally for me. I'd love a hand getting this PR to pass the tests.

@Madhu-1
Collaborator

Madhu-1 commented Jun 8, 2020

I0607 17:16:01.852839 1 controller.go:1047] Stop provisioning, removing PVC f5a20c29-2f5c-4841-94e2-1acf99920e7f from claims in progress
I0607 17:16:02.552694 1 reflector.go:188] Listing and watching *v1.Node from k8s.io/client-go/informers/factory.go:135
E0607 17:16:02.557540 1 reflector.go:320] k8s.io/client-go/informers/factory.go:135: Failed to watch *v1.Node: unknown (get nodes)
I0607 17:16:03.557763 1 reflector.go:188] Listing and watching *v1.Node from k8s.io/client-go/informers/factory.go:135
E0607 17:16:03.561644 1 reflector.go:320] k8s.io/client-go/informers/factory.go:135: Failed to watch *v1.Node: unknown (get nodes)
I0607 17:16:03.628958 1 leaderelection.go:283] successfully renewed lease cephcsi-e2e-10007/rbd-csi-ceph-com
I0607 17:16:04.561829 1 reflector.go:188] Listing and watching *v1.Node from k8s.io/client-go/informers/factory.go:135
E0607 17:16:04.565708 1 reflector.go:320] k8s.io/client-go/informers/factory.go:135: Failed to watch *v1.Node: unknown (get nodes)
I0607 17:16:05.565856 1 reflector.go:188] Listing and watching *v1.Node from k8s.io/client-go/informers/factory.go:135
E0607 17:16:05.569214 1 reflector.go:320] k8s.io/client-go/informers/factory.go:135: Failed to watch *v1.Node: unknown (get nodes)
I0607 17:16:06.569778 1 reflector.go:188] Listing and watching *v1.Node from k8s.io/client-go/informers/factory.go:135
E0607 17:16:06.573637 1 reflector.go:320] k8s.io/client-go/informers/factory.go:135: Failed to watch *v1.Node: unknown (get nodes)
I0607 17:16:07.573814 1 reflector.go:188] Listing and watching *v1.Node from k8s.io/client-go/informers/factory.go:135
E0607 17:16:07.577694 1 reflector.go:320] k8s.io/client-go/informers/factory.go:135: Failed to watch *v1.Node: unknown (get nodes)
I0607 17:16:08.577866 1 reflector.go:188] Listing and watching *v1.Node from k8s.io/client-go/informers/factory.go:135
E0607 17:16:08.582257 1 reflector.go:320] k8s.io/client-go/informers/factory.go:135: Failed to watch *v1.Node: unknown (get nodes)

@ShyamsundarR are we missing any RBAC for the external-provisioner?

E0607 17:23:12.004248 1 omap.go:81] ID: 76 Req-ID: 0001-0024-76a1fce1-1be5-4a06-b6ef-4d380d09c422-0000000000000004-84a72b67-a8e3-11ea-befe-0242ac110013 failed removing omap keys (pool="newrbdpool", namespace="", name="csi.volumes.default"): rados: ret=2, No such file or directory

@phlogistonjohn this is the issue on the cephcsi side, in the DeleteVolume request: since CSI requests must be idempotent, we need to treat the case where a resource is not present in the backend as a success.

https://travis-ci.org/github/ceph/ceph-csi/jobs/695718476#L13231

@humblec
Collaborator

humblec commented Jun 8, 2020

Looks like the rbd-external-provisioner also needs #1142

@humblec
Collaborator

humblec commented Jun 8, 2020

Looks like the rbd-external-provisioner also needs #1142

Hmm... it's already present in the template...

@ShyamsundarR
Contributor

I0607 17:16:08.577866 1 reflector.go:188] Listing and watching *v1.Node from k8s.io/client-go/informers/factory.go:135
E0607 17:16:08.582257 1 reflector.go:320] k8s.io/client-go/informers/factory.go:135: Failed to watch *v1.Node: unknown (get nodes)

@ShyamsundarR are we missing any RBAC for the external-provisioner?

In the job logs for the non-helm based test, the above errors are not present.

Checked the helm scripts to ensure that topology was set to true, as that would add the required roles from the cluster role template.

Currently unable to figure out what else is missing that would cause the above errors to appear.

@ShyamsundarR
Contributor

@phlogistonjohn this is the issue on the cephcsi side, in the DeleteVolume request: since CSI requests must be idempotent, we need to treat the case where a resource is not present in the backend as a success.

https://travis-ci.org/github/ceph/ceph-csi/jobs/695718476#L13231

Run: with helm
Last successful step: ensuring created PV has its CSI journal in the CSI journal specific pool
Failure log (start):
Jun 7 17:23:11.183: INFO: Deleting PersistentVolumeClaim rbd-pvc on namespace rbd-7887
Jun 7 17:23:11.243: INFO: waiting for PVC rbd-pvc in state &PersistentVolumeClaimStatus{Phase:Bound,AccessModes:[ReadWriteOnce],Capacity:ResourceList{storage: {{1073741824 0} {} 1Gi BinarySI},},Conditions:[]PersistentVolumeClaimCondition{},} to be deleted (0 seconds elapsed)

Run: without helm
Last successful step: ensuring created PV has its CSI journal in the CSI journal specific pool
Failure log (start):
Jun 7 17:21:25.525: INFO: Deleting PersistentVolumeClaim rbd-pvc on namespace rbd-7887
Jun 7 17:21:25.532: INFO: waiting for PVC rbd-pvc in state &PersistentVolumeClaimStatus{Phase:Bound,AccessModes:[ReadWriteOnce],Capacity:ResourceList{storage: {{1073741824 0} {} 1Gi BinarySI},},Conditions:[]PersistentVolumeClaimCondition{},} to be deleted (0 seconds elapsed)

NOTE: Debugging the non-helm setup, to avoid the "node get" noise.

  • First delete volume call to external provisioner (csi-provisioner container) (all repeat calls fail with the same error)
I0607 17:21:25.549479       1 controller.go:1422] delete "pvc-bbb67b74-9616-4a24-9f04-820b3f083760": started
I0607 17:21:25.552800       1 connection.go:182] GRPC call: /csi.v1.Controller/DeleteVolume
I0607 17:21:25.552817       1 connection.go:183] GRPC request: {"secrets":"***stripped***","volume_id":"0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011"}
I0607 17:21:26.064368       1 connection.go:185] GRPC response: {}
I0607 17:21:26.064951       1 connection.go:186] GRPC error: rpc error: code = Internal desc = rados: ret=2, No such file or directory
E0607 17:21:26.066023       1 controller.go:1445] delete "pvc-bbb67b74-9616-4a24-9f04-820b3f083760": volume deletion failed: rpc error: code = Internal desc = rados: ret=2, No such file or directory
  • First delete volume call to rbdplugin (csi-rbdplugin container) (all repeat calls fail with the same error)
I0607 17:21:25.554028       1 utils.go:159] ID: 69 Req-ID: 0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011 GRPC call: /csi.v1.Controller/DeleteVolume
I0607 17:21:25.554047       1 utils.go:160] ID: 69 Req-ID: 0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011 GRPC request: {"secrets":"***stripped***","volume_id":"0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011"}
I0607 17:21:25.555700       1 omap.go:58] ID: 69 Req-ID: 0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011 got omap values: (pool="newrbdpool", namespace="", name="csi.volume.45be93df-a8e3-11ea-beb2-0242ac110011"): map[]
I0607 17:21:25.612436       1 controllerserver.go:453] ID: 69 Req-ID: 0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011 deleting image csi-vol-45be93df-a8e3-11ea-beb2-0242ac110011
I0607 17:21:25.612459       1 rbd_util.go:245] ID: 69 Req-ID: 0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011 rbd: status newrbdpool/csi-vol-45be93df-a8e3-11ea-beb2-0242ac110011 using mon rook-ceph-mon-a.rook-ceph.svc.cluster.local:6789
W0607 17:21:25.658571       1 rbd_util.go:267] ID: 69 Req-ID: 0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011 rbd: no watchers on newrbdpool/csi-vol-45be93df-a8e3-11ea-beb2-0242ac110011
I0607 17:21:25.658606       1 rbd_util.go:314] ID: 69 Req-ID: 0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011 rbd: rm newrbdpool/csi-vol-45be93df-a8e3-11ea-beb2-0242ac110011 using mon rook-ceph-mon-a.rook-ceph.svc.cluster.local:6789
E0607 17:21:26.063828       1 omap.go:81] ID: 69 Req-ID: 0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011 failed removing omap keys (pool="newrbdpool", namespace="", name="csi.volumes.default"): rados: ret=2, No such file or directory
E0607 17:21:26.063932       1 voljournal.go:417] ID: 69 Req-ID: 0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011 failed removing oMap key csi.volume. (rados: ret=2, No such file or directory)
E0607 17:21:26.063968       1 controllerserver.go:461] ID: 69 Req-ID: 0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011 failed to remove reservation for volume () with backing image (csi-vol-45be93df-a8e3-11ea-beb2-0242ac110011) (rados: ret=2, No such file or directory)
E0607 17:21:26.064032       1 utils.go:163] ID: 69 Req-ID: 0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011 GRPC error: rpc error: code = Internal desc = rados: ret=2, No such file or directory

First error is here: E0607 17:21:26.063828 1 omap.go:81] ID: 69 Req-ID: 0001-0024-f5a1f6b9-2059-496b-8deb-909001a8ebcc-0000000000000004-45be93df-a8e3-11ea-beb2-0242ac110011 failed removing omap keys (pool="newrbdpool", namespace="", name="csi.volumes.default"): rados: ret=2, No such file or directory

@phlogistonjohn I have to recheck the code, but newrbdpool holds the omap for the image, and the CSI omap should be in the default pool (replicapool) that was created. The code seems to be attempting to delete the omap in the wrong pool?

@phlogistonjohn
Contributor Author

Thanks for the feedback, everyone. It turns out the rados command line did not return an error when removing keys from an omap that does not exist. I've added a change to mimic that behavior. This should be ready for review now.
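
For illustration, a minimal sketch (with hypothetical helper and variable names, not the actual patch) of how tolerating a missing omap object during key removal can look with go-ceph:

```go
// Sketch only: treat a missing omap object (oid) as a no-op when removing
// keys, mirroring the old CLI behavior so DeleteVolume stays idempotent.
package journal

import "github.com/ceph/go-ceph/rados"

func removeOmapKeys(ioctx *rados.IOContext, oid string, keys []string) error {
	err := ioctx.RmOmapKeys(oid, keys)
	if err == rados.ErrNotFound {
		// The object holding the omap is already gone; the CLI-based
		// implementation did not report an error here, so neither do we.
		return nil
	}
	return err
}
```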

@phlogistonjohn phlogistonjohn marked this pull request as ready for review June 9, 2020 12:33
@phlogistonjohn phlogistonjohn changed the title from "[WIP][DNM] Convert journal package omap to use go ceph" to "Convert journal package omap to use go ceph" on Jun 9, 2020
@phlogistonjohn
Contributor Author

@nixpanic @humblec @ShyamsundarR @Madhu-1 PTAL. I have rebased again and all the checks are happy now (previously, the CI had passed but mergify was stuck).

switch err {
case nil:
case rados.ErrNotFound:
	klog.Errorf(
Collaborator


Can we log which key was not found? It would be useful for debugging.

Contributor Author


We can log the keys requested, but logging keys-not-found doesn't make sense for this case because the error is only returned if the oid is not found. The only reason it returns ErrKeyNotFound is for backwards compatibility with the previous implementation, so I'm not sure it would really be helpful.

internal/journal/omap.go (outdated, resolved)
internal/journal/omap.go (resolved)
internal/journal/voljournal.go (outdated, resolved)
internal/journal/voljournal.go (outdated, resolved)
internal/util/errors.go (resolved)
@@ -173,6 +177,7 @@ func NewCSISnapshotJournal(suffix string) *Config {
cephSnapSourceKey: "csi.source",
namespace: "",
encryptKMSKey: "csi.volume.encryptKMS",
commonPrefix: "csi.",
Collaborator


Will this cause any breakage once a user upgrades cephcsi? I just want to make sure everything works in an upgrade scenario.
The upgrade path is not covered in E2E :(

Contributor Author


It should be fine. It's just a constant that "rolls up" the existing practice: all the key constants already start with "csi." (and I checked; they all do, AFAICT).
If this were to change, we'd have to add additional constants or something, but I don't see a problem here.

internal/journal/omap.go (resolved)
@dillaman

As a further (future) optimization, you could eliminate the journal lock and use a RADOS class call to execute the cmpomap.cmp_set_vals method, which would allow you to atomically reserve the UUID and then, combined with a second omap set-vals call, store all the values with a single round trip to the OSD.

Collaborator

@Madhu-1 Madhu-1 left a comment


small nit, good to go

internal/journal/omap.go (outdated, resolved)
internal/journal/omap.go (outdated, resolved)
@mergify mergify bot dismissed nixpanic’s stale review June 17, 2020 17:21

Pull request has been modified.

@phlogistonjohn
Contributor Author

Travis failed in setup (not my code). If someone doesn't poke it, I'll push a dummy change later (nothing to rebase on at the moment).

@phlogistonjohn phlogistonjohn force-pushed the jjm-journal-omap-go-ceph branch 3 times, most recently from 4633f09 to e41662f on June 18, 2020 14:01
@Madhu-1
Collaborator

Madhu-1 commented Jun 19, 2020

/test ci/centos/containerized-tests

@Madhu-1
Collaborator

Madhu-1 commented Jun 19, 2020

@Mergifyio rebase

These types have private fields but we need to construct them outside of
the util package. Add New* methods for both.

Signed-off-by: John Mulligan <[email protected]>
These new omap manipulation functions (get/set/remove) are roughly
equivalent to the previous command-line based approach but rely
on direct API calls to Ceph.

Signed-off-by: John Mulligan <[email protected]>
Convert the business logic of the journal to use the new go-ceph based
omap manipulation functions.

Signed-off-by: John Mulligan <[email protected]>
Taking this approach means that any function that must get more than one
key's value from the same oid can be more efficient by calling out to
Ceph only once.

To be cautious and avoid missing anything, we always ask Ceph to return
more keys than we actually expect to be set on the oid. If we limited
ourselves to fetching only the number of keys we expect, an unexpected
key on the object could crowd out one of the keys we actually want;
over-requesting avoids that. (A rough sketch of this pattern follows the
commit messages below.)

Signed-off-by: John Mulligan <[email protected]>
For any function that removes more than one key on a single oid, removing
them as a batch will be more efficient.

Signed-off-by: John Mulligan <[email protected]>
For any function that sets more than one key on a single oid, setting
them as a batch will be more efficient.

Signed-off-by: John Mulligan <[email protected]>
The function previously used to remove omap keys apparently did not
return errors when removing keys from a missing omap (oid).
Mimic that behavior when using the API.

Signed-off-by: John Mulligan <[email protected]>
A number of exported functions in errors.go were missing doc comments.
Add them.

Signed-off-by: John Mulligan <[email protected]>
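
For illustration, a rough sketch of the over-request pattern described in the batched-get commit message above; the helper name, prefix handling, and the extra-keys margin are assumptions, not the actual ceph-csi code:

```go
// Sketch only: fetch the values for several keys from one oid in a single
// call, asking Ceph for more keys than we expect so that unexpected keys
// on the object cannot crowd out the ones we want.
package journal

import "github.com/ceph/go-ceph/rados"

func getOmapValues(ioctx *rados.IOContext, oid, prefix string, want []string) (map[string]string, error) {
	wanted := make(map[string]bool, len(want))
	for _, k := range want {
		wanted[k] = true
	}
	results := make(map[string]string, len(want))

	// Over-request: the +8 margin is an arbitrary illustrative value.
	numKeys := int64(len(want) + 8)
	err := ioctx.GetOmapValues(oid, "", prefix, numKeys, func(key string, value []byte) {
		if wanted[key] {
			results[key] = string(value)
		}
	})
	if err != nil {
		return nil, err
	}
	return results, nil
}
```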
@mergify
Contributor

mergify bot commented Jun 19, 2020

Command rebase: success

Branch has been successfully rebased

@phlogistonjohn
Contributor Author

Travis status appears borked (again): all tasks report success, but Travis CI failed to report back to the PR and it still shows as pending to me.

@Madhu-1
Collaborator

Madhu-1 commented Jun 22, 2020

Travis status appears borked (again): all tasks report success, but Travis CI failed to report back to the PR and it still shows as pending to me.

Yeah, restarted it again

@mergify mergify bot merged commit 75088aa into ceph:master Jun 22, 2020
@phlogistonjohn phlogistonjohn deleted the jjm-journal-omap-go-ceph branch June 22, 2020 17:25