Improve IBMPowerVSCluster deletion #1825
✅ Deploy Preview for kubernetes-sigs-cluster-api-ibmcloud ready!
@@ -311,7 +311,8 @@ func (r *IBMPowerVSClusterReconciler) reconcileDelete(ctx context.Context, clust

	clusterScope.Info("Deleting DHCP server")
	if err := clusterScope.DeleteDHCPServer(); err != nil {
		allErrs = append(allErrs, errors.Wrapf(err, "failed to delete DHCP server"))
		clusterScope.Error(err, "failed to delete DHCP server")
		return reconcile.Result{RequeueAfter: 30 * time.Second}, nil
@dharaneeshvrd, if the DHCP server deletion constantly fails, won't this keep requeuing the request and not proceed with PowerVS service instance deletion causing cluster deletion to be stuck?
Yes, the same thing currently happens with this block: https://github.com/kubernetes-sigs/cluster-api-provider-ibmcloud/blob/main/cloud/scope/powervs_cluster.go#L2543toL2554
We need to delete the DHCP server before proceeding to the PowerVS instance, right? Or should we try deleting the PowerVS instance after a few retries anyway?
> Yes, currently also same thing will happen with this block https://github.com/kubernetes-sigs/cluster-api-provider-ibmcloud/blob/main/cloud/scope/powervs_cluster.go#L2543toL2554
Yes, it currently does the same. I'm trying to understand how this change will help.
> we can try deleting powervs instance after few retries anyway?
Right, I feel we need to proceed with PowerVS service instance deletion despite the DHCP failure. WDYT?
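The two behaviours being compared in this thread can be sketched in isolation. The following is a minimal, self-contained illustration and not the provider's code: `step`, `result`, `demoSteps`, `deleteWithEarlyRequeue` and `deleteAndAggregate` are hypothetical names, and `result` only loosely mirrors the `RequeueAfter` field seen in the diff. It shows that returning a requeue on the first failure means later steps are never attempted, while aggregating errors attempts every step. Whether a later step would actually succeed on the platform is a separate question that the thread goes on to discuss.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// requeueInterval matches the 30 second delay used in the diff above.
const requeueInterval = 30 * time.Second

// step is a hypothetical stand-in for one deletion step.
type step struct {
	name   string
	remove func() error
}

// result loosely mirrors the RequeueAfter field of a reconcile result.
type result struct {
	RequeueAfter time.Duration
}

// deleteWithEarlyRequeue stops at the first failing step and asks for a retry,
// so later steps are not attempted while an earlier one keeps failing.
func deleteWithEarlyRequeue(steps []step) (result, []string) {
	var attempted []string
	for _, s := range steps {
		attempted = append(attempted, s.name)
		if err := s.remove(); err != nil {
			return result{RequeueAfter: requeueInterval}, attempted
		}
	}
	return result{}, attempted
}

// deleteAndAggregate attempts every step and reports all failures together.
func deleteAndAggregate(steps []step) ([]string, error) {
	var attempted, failed []string
	for _, s := range steps {
		attempted = append(attempted, s.name)
		if err := s.remove(); err != nil {
			failed = append(failed, fmt.Sprintf("%s: %v", s.name, err))
		}
	}
	if len(failed) > 0 {
		return attempted, fmt.Errorf("deletion failed: %v", failed)
	}
	return attempted, nil
}

// demoSteps simulates a DHCP server deletion that always fails.
func demoSteps() []step {
	return []step{
		{name: "dhcp-server", remove: func() error { return errors.New("still in use") }},
		{name: "service-instance", remove: func() error { return nil }},
	}
}

func main() {
	res, attempted := deleteWithEarlyRequeue(demoSteps())
	fmt.Println("early requeue:", res.RequeueAfter, attempted)

	attempted, err := deleteAndAggregate(demoSteps())
	fmt.Println("aggregate:", attempted, err)
}
```

In this sketch the early-requeue variant only ever attempts the DHCP step, which is the stuck-deletion scenario raised above.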
I have not tested the scenario but in case of DHCP failure, will the service instance deletion also get stuck?
Yes, if the DHCP server does not get deleted, I don't think the PowerVS instance will be deleted successfully.
Got it, Thanks. @Amulyam24 was there any particular reason for explicitly deleting DHCP server in CAPI?
In the initial flow, before adding proper state management of resources, we added the DHCP deletion and check to make sure PowerVS workspace deletion succeeds. We can consider removing it now, proceeding with workspace deletion and retrying until it succeeds. We need to test it thoroughly to make sure everything is cleaned up. @dharaneeshvrd can you please try this out and let us know?
Sure will try and see!
Tried many times in different regions; the deletion went through without explicitly deleting the DHCP server.
Modified the code accordingly, please check!
Nice, Thanks for verifying.
/cc @Karthik-K-N
Force-pushed from 644791b to 0730fa7
cloud/scope/powervs_cluster.go
Outdated
@@ -2492,32 +2490,6 @@ func (s *PowerVSClusterScope) deleteTransitGatewayConnections(tg *tgapiv1.Transi
	return requeue, nil
}

// DeleteDHCPServer deletes DHCP server.
func (s *PowerVSClusterScope) DeleteDHCPServer() error {
	if !s.isResourceCreatedByController(infrav1beta2.ResourceTypeDHCPServer) {
Do you think there will be a scenario where the user provides an existing workspace and expects the DHCP server to be created? If so, in that case we need to delete only the DHCP server.
True, we need to handle that. Will do the changes.
Addressed this, ptal!
Force-pushed from 06c5ad5 to c795492
Force-pushed from c795492 to aa1c6fc
cloud/scope/powervs_cluster.go
Outdated
@@ -2498,6 +2498,10 @@ func (s *PowerVSClusterScope) DeleteDHCPServer() error {
		s.Info("Skipping DHP server deletion as resource is not created by controller")
		return nil
	}
	if s.isResourceCreatedByController(infrav1beta2.ResourceTypeServiceInstance) {
		s.Info("Skipping DHCP server deletion as PowerVS service instance is created by controller, will directly delete the PowerVS service instance")
I think the user may not know that deleting the PowerVS service instance also deletes the DHCP server, and may get confused after seeing the word "Skipping". I think it's better to either use V(3) or update the log message.
Updated the log message. IMO let's keep this log, since it's skipping a major action in the delete flow.
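The ownership checks discussed in this thread can be summarised as a small decision function. This is a sketch under stated assumptions rather than the code in `DeleteDHCPServer`: `dhcpDeleteAction` and the `deleteAction` values are hypothetical names, and the two booleans stand in for the `isResourceCreatedByController` checks shown in the diff, evaluated in the same order.

```go
package main

import "fmt"

// deleteAction names what the delete flow does for the DHCP server.
// The type and its values are hypothetical and used only for illustration.
type deleteAction string

const (
	skipNotOwned      deleteAction = "skip: DHCP server was not created by the controller"
	skipWithWorkspace deleteAction = "skip: service instance deletion removes the DHCP server"
	deleteExplicitly  deleteAction = "delete the DHCP server explicitly"
)

// dhcpDeleteAction mirrors the order of the checks in the diff:
// first ownership of the DHCP server, then ownership of the service instance.
func dhcpDeleteAction(dhcpCreatedByController, serviceInstanceCreatedByController bool) deleteAction {
	if !dhcpCreatedByController {
		return skipNotOwned
	}
	if serviceInstanceCreatedByController {
		return skipWithWorkspace
	}
	// Existing workspace supplied by the user, DHCP server created by the controller.
	return deleteExplicitly
}

func main() {
	for _, dhcp := range []bool{false, true} {
		for _, si := range []bool{false, true} {
			fmt.Printf("dhcp=%t serviceInstance=%t -> %s\n", dhcp, si, dhcpDeleteAction(dhcp, si))
		}
	}
}
```

The only case that still needs an explicit DHCP deletion is the one raised in the review: a user-provided workspace with a controller-created DHCP server.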
cloud/scope/powervs_cluster.go
Outdated
	ID: subnet.ID,
})

if err != nil {
	if strings.Contains(err.Error(), string(VPCSubnetNotFound)) {
	if resp.StatusCode == ResourceNotFoundCode {
So is it safe to assume that resp won't be nil?
It seems there is a possibility for it to be nil, added a nil check.
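A nil-safe version of the status check might look like the sketch below. It is illustrative only: `response` is a hypothetical stand-in for the SDK response type used in the real code, `isNotFound` and `errDelete` are made-up names, and setting `resourceNotFoundCode` to HTTP 404 is an assumption based on the constant's doc comment, not a value taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// response is a hypothetical stand-in for the SDK response type;
// only the StatusCode field matters for this sketch.
type response struct {
	StatusCode int
}

// resourceNotFoundCode is assumed to be HTTP 404 for illustration.
const resourceNotFoundCode = http.StatusNotFound

// errDelete is a sample error used in the demo below.
var errDelete = errors.New("delete failed")

// isNotFound reports whether a failed call indicates a missing resource.
// Checking resp for nil first avoids a nil pointer dereference when the
// call fails before any response is received.
func isNotFound(resp *response, err error) bool {
	return err != nil && resp != nil && resp.StatusCode == resourceNotFoundCode
}

func main() {
	fmt.Println(isNotFound(nil, errDelete))
	fmt.Println(isNotFound(&response{StatusCode: http.StatusNotFound}, errDelete))
	fmt.Println(isNotFound(&response{StatusCode: http.StatusInternalServerError}, errDelete))
}
```

Keeping the nil check inside one helper means each call site does not have to repeat it.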
Force-pushed from 2b7a3fd to dbe9618
/lgtm
minor nit, otherwise LGTM. Thanks for verifying the deletion with DHCP server!
cloud/scope/types.go
Outdated

// TransitGatewayNotFound is the error returned when a transit gateway is not found.
TransitGatewayNotFound = ResourceNotFound("gateway was not found")
// ResourceNotFoundCode indicates the http status code used to denote that the resource not exists.
How about simplifying it as follows?
- // ResourceNotFoundCode indicates the http status code used to denote that the resource not exists.
+ // ResourceNotFoundCode indicates the http status code when a resource does not exist.
Skip DHCP server deletion when PowerVS service instance is created by controller
Since PowerVS service instance deletion will take care of deleting the DHCP server as well
Force-pushed from dbe9618 to 726be6b
@Karthik-K-N @Amulyam24 addressed the comments, please take a look!
/lgtm
/lgtm
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: Amulyam24, dharaneeshvrd, Karthik-K-N, mkumatag
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged): Fixes #1805
Special notes for your reviewer:
/area provider/ibmcloud
Release note: