🐛Cleanup ports #1063
Conversation
✔️ Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!
🔨 Explore the source changes: 7a56ed0
🔍 Inspect the deploy log: https://app.netlify.com/sites/kubernetes-sigs-cluster-api-openstack/deploys/61a621627d4a330007e7f0e4
😎 Browse the preview: https://deploy-preview-1063--kubernetes-sigs-cluster-api-openstack.netlify.app
(force-pushed from 8fe7ad6 to 8bcb1fb)
The code looks much cleaner 👍
I'd say mission accomplished 😃
/lgtm
At first glance this looks like a prow / gcloud error unrelated to this PR?
Kubelet didn't come up correctly on the control plane node: I haven't yet worked out how to read kubelet's startup logs, so it's anybody's guess why. It's a flake, though.
/test pull-cluster-api-provider-openstack-e2e-test
All I can see is that the kubelet itself is not able to connect to the kube-apiserver to register itself as a node. I also can't find any logs of the static pods :-(
From CAPO controller logs:
I can't see any obvious reason for this in either the CAPO or the devstack logs. If I had to put my money somewhere, I would guess this is a neutron bug. Everything else seems to have been deleted AFAICT.
/test pull-cluster-api-provider-openstack-e2e-test
Same issue: failure to delete security group:
It's the same security group and the same test in both cases, so this is starting to look deterministic. Possibly related: I notice that several changes landed in neutron stable/victoria on 24th November, which is when this change seems to have started failing.
It's a bug introduced by this patch! The failed create consequently doesn't clean up its ports, so the security group is still in use. I've probably made this mistake systematically in a few places. I'll update.
(force-pushed from 8bcb1fb to 7fd6438)
Fixed it 🎉 Still need to merge #1061 first, which is under review with @macaptain.
The other PR was merged.
One last idea: do we want to move the defer for deleting ports above the port creation?
var server *ServerExt

// Ensure we delete the ports we created if we haven't created the server.
defer func() {
Should the defer maybe also be above line 184?
If creating multiple ports and one port fails, we may want to clean up the already-created ones too.
I can also add a unit test for that.
I've squashed this and added a new unit test.
We were previously failing to handle several error return cases, which would result in leaked ports. Cleaning up in a defer is idiomatic, handles the existing missing cases, and will also cover the volume cleanup case which is about to be added.
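For illustration, a minimal sketch of that pattern (not the provider's actual code: the types and helpers createPort, deletePort, getImageID and createServer are hypothetical stand-ins). The cleanup defer is registered before the port-creation loop, as suggested above, so a partial failure also cleans up the ports created so far:

package compute

import "fmt"

// Hypothetical stand-ins for the real gophercloud/CAPO types and helpers.
type Port struct{ ID string }

type ServerExt struct{ ID string }

type service struct{}

func (s *service) createPort(name string) (*Port, error) { return &Port{ID: name}, nil }

func (s *service) deletePort(id string) error { fmt.Println("deleting port", id); return nil }

func (s *service) getImageID(name string) (string, error) { return "image-id", nil }

func (s *service) createServer(imageID string, ports []*Port) (*ServerExt, error) {
	return &ServerExt{ID: "server-id"}, nil
}

// createInstance registers the cleanup defer before the port-creation loop, so any
// error after this point (a failed port creation, a failed image/flavor lookup, or a
// failed server creation) deletes the ports created so far. Once the server exists,
// it owns the ports and the defer becomes a no-op.
func (s *service) createInstance(portNames []string, imageName string) (*ServerExt, error) {
	var server *ServerExt
	var ports []*Port

	// Ensure we delete the ports we created if we haven't created the server.
	defer func() {
		if server != nil {
			return
		}
		for _, port := range ports {
			if err := s.deletePort(port.ID); err != nil {
				fmt.Printf("failed to clean up port %s: %v\n", port.ID, err)
			}
		}
	}()

	for _, name := range portNames {
		port, err := s.createPort(name)
		if err != nil {
			return nil, fmt.Errorf("creating port %q: %w", name, err)
		}
		ports = append(ports, port)
	}

	imageID, err := s.getImageID(imageName)
	if err != nil {
		// Previously an early return here leaked the ports created above.
		return nil, fmt.Errorf("getting image %q: %w", imageName, err)
	}

	server, err = s.createServer(imageID, ports)
	if err != nil {
		return nil, err
	}
	return server, nil
}

Guarding on server != nil rather than on individual error returns means the same defer also covers cleanup steps added later, such as the volume case mentioned above, without per-error handling.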
(force-pushed from 7fd6438 to 7a56ed0)
/hold cancel
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: hidekazuna, mdbooth
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Nice!
/lgtm
This PR is only the final commit in the series! All other commits are from #1061, which needs to merge first
What this PR does / why we need it:
This fixes a leak of ports when failing to fetch image or flavor for any reason, including user error, during instance creation.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1062
Special notes for your reviewer:
TODOs:
/hold until #1061 merges