Cancel rolling upgrades before deleting VMSS & extensions #18973
@@ -0,0 +1,29 @@
package client

import (
	"context"
	"fmt"
	"log"
	"net/http"
	"strings"
)

func (c *Client) CancelRollingUpgradesBeforeDeletion(ctx context.Context, resourceGroupName string, vmScaleSetName string) error {
	future, err := c.VMScaleSetRollingUpgradesClient.Cancel(ctx, resourceGroupName, vmScaleSetName)
There is an API call to retrieve the latest rolling upgrade. I think we should call it here and return nil if there is no rolling upgrade, since then there is no need to cancel and trigger the 409 error described below. If there is an existing rolling upgrade, we proceed to cancel it. All other errors should be caught and surfaced.
Suggested change

	// If rolling upgrades haven't been run (when VMSS are just provisioned with rolling upgrades but no extensions, auto-scaling are run )
	// we can not cancel rolling upgrades
	// API call :: GET https://management.azure.com/subscriptions/{subId}/resourceGroups/{rgName}/providers/Microsoft.Compute/virtualMachineScaleSets/{vmSSName}/rollingUpgrades/latest?api-version=2021-07-01
	// Azure API throws 409 conflict error saying "The entity was not found in this Azure location."
stephybun marked this conversation as resolved.

	// If the above error message matches, we identify and move forward to delete the VMSS
	// in all other cases, it just cancels the rolling upgrades and move ahead to delete the VMSS
	if err != nil && !(future.Response().StatusCode == http.StatusConflict && strings.Contains(err.Error(), "There is no ongoing Rolling Upgrade to cancel.")) {
		return fmt.Errorf("error cancelling rolling upgrades of Virtual Machine Scale Set %q (Resource Group %q): %+v", vmScaleSetName, resourceGroupName, err)
	}
Comment on lines +20 to +22
Thanks for the clarification. This actually resulted in a crash for me when trying to reproduce the case described in the comments above. In addition, matching on error messages tends to be brittle, since the message is something that can change very easily in the API. A more robust way to decide whether to call the cancel function for rolling upgrades would be to first query whether there are any ongoing rolling upgrades. I've left comments in-line with what that would look like.
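One plausible reading of the crash, sketched here under assumptions: if the cancel call fails before any response is recorded, the response pointer can be nil, and reading `StatusCode` on it panics. A nil-safe check that compares only the status code avoids both the nil dereference and the dependence on message text. `responseWasStatus` and `respWith` are hypothetical names for illustration and are not part of this PR.

```go
package main

import (
	"fmt"
	"net/http"
)

// responseWasStatus reports whether resp carries the given HTTP status code.
// A nil response never matches, so callers cannot hit a nil dereference.
// Hypothetical helper for illustration; it is not part of the PR.
func responseWasStatus(resp *http.Response, code int) bool {
	return resp != nil && resp.StatusCode == code
}

// respWith builds a bare response with only the status code set (demo helper).
func respWith(code int) *http.Response {
	return &http.Response{StatusCode: code}
}

func main() {
	// A nil response models a request that failed before any reply arrived.
	fmt.Println(responseWasStatus(nil, http.StatusConflict))
	fmt.Println(responseWasStatus(respWith(http.StatusConflict), http.StatusConflict))
	fmt.Println(responseWasStatus(respWith(http.StatusConflict), http.StatusNotFound))
}
```

With a helper like this, the condition in the diff could test the status code alone instead of combining it with `strings.Contains` on the error text.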
Comment on lines +14 to +22
This means none of this is necessary, since we aren't hitting this behaviour anymore.
Suggested change

	if err := future.WaitForCompletionRef(ctx, c.VMScaleSetExtensionsClient.Client); err != nil && !(future.Response().StatusCode == http.StatusConflict && strings.Contains(err.Error(), "There is no ongoing Rolling Upgrade to cancel.")) {
		return fmt.Errorf("waiting for cancelling rolling upgrades of Virtual Machine Scale Set %q (Resource Group %q): %+v", vmScaleSetName, resourceGroupName, err)
	}
	log.Printf("[DEBUG] cancelled Virtual Machine Scale Set Rolling Upgrades %q (Resource Group %q).", vmScaleSetName, resourceGroupName)
	return nil
Comment on lines +24 to +28
And we can simplify the condition here, since this case should be prevented by the check that retrieves any rolling upgrades above.
Suggested change
}
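The flow the review comments describe (look up the latest rolling upgrade, return early when there is none, cancel only when one is running, surface every other error) could be sketched roughly as follows. This is a minimal illustration, not the reviewer's suggested change: `rollingUpgradeAPI`, `errNoRollingUpgrade`, `errTransient`, `fakeAPI`, and `demo` are all hypothetical stand-ins, and the real Azure client calls and response types are deliberately left out.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Hypothetical sentinel errors used only by this sketch.
var (
	errNoRollingUpgrade = errors.New("no rolling upgrade found")
	errTransient        = errors.New("transient failure")
)

// rollingUpgradeAPI is a hypothetical stand-in for the Azure client.
type rollingUpgradeAPI interface {
	// LatestInProgress reports whether the latest rolling upgrade is still
	// running; it returns errNoRollingUpgrade when none has ever run.
	LatestInProgress(ctx context.Context, resourceGroup, name string) (bool, error)
	// Cancel cancels the running rolling upgrade.
	Cancel(ctx context.Context, resourceGroup, name string) error
}

// cancelRollingUpgradeIfRunning cancels only when there is something to cancel.
func cancelRollingUpgradeIfRunning(ctx context.Context, api rollingUpgradeAPI, resourceGroup, name string) error {
	inProgress, err := api.LatestInProgress(ctx, resourceGroup, name)
	if err != nil {
		if errors.Is(err, errNoRollingUpgrade) {
			return nil // nothing to cancel
		}
		return fmt.Errorf("retrieving latest rolling upgrade of Virtual Machine Scale Set %q (Resource Group %q): %w", name, resourceGroup, err)
	}
	if !inProgress {
		return nil // latest upgrade already finished
	}
	if err := api.Cancel(ctx, resourceGroup, name); err != nil {
		return fmt.Errorf("cancelling rolling upgrades of Virtual Machine Scale Set %q (Resource Group %q): %w", name, resourceGroup, err)
	}
	return nil
}

// fakeAPI is an in-memory implementation used only to exercise the flow.
type fakeAPI struct {
	inProgress  bool
	latestErr   error
	cancelCalls int
}

func (f *fakeAPI) LatestInProgress(context.Context, string, string) (bool, error) {
	return f.inProgress, f.latestErr
}

func (f *fakeAPI) Cancel(context.Context, string, string) error {
	f.cancelCalls++
	return nil
}

// demo runs the flow against a fake with placeholder names.
func demo(api *fakeAPI) error {
	return cancelRollingUpgradeIfRunning(context.Background(), api, "example-rg", "example-vmss")
}

func main() {
	idle := &fakeAPI{latestErr: errNoRollingUpgrade}
	busy := &fakeAPI{inProgress: true}
	fmt.Println(demo(idle), idle.cancelCalls)
	fmt.Println(demo(busy), busy.cancelCalls)
}
```

Because the lookup decides whether a cancel is attempted at all, the "nothing to cancel" case never reaches the cancel call, which is why the conditions on the cancel and wait steps can shrink to a plain `err != nil`.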
@@ -1078,6 +1078,13 @@ func resourceLinuxVirtualMachineScaleSetDelete(d *pluginsdk.ResourceData, meta i
		return fmt.Errorf("retrieving Linux Virtual Machine Scale Set %q (Resource Group %q): %+v", id.Name, id.ResourceGroup, err)
	}

	// When rolling upgrades are setup, vmscalesets can't be deleted unless the upgrade is cancelled.
	// Since destroy function intention is to VMSS itself. Rolling upgrades are trivial here, hence we cancel before we trigger destroy call
	err = meta.(*clients.Client).Compute.CancelRollingUpgradesBeforeDeletion(ctx, id.ResourceGroup, id.Name)
	if err != nil {
		return fmt.Errorf("error while cancelling rolling upgrade during destroy phase in Linux Virtual Machine Scale Set %q (Resource Group %q) : %+v", id.Name, id.ResourceGroup, err)
	}
Comment on lines +1083 to +1086
Let's simplify this to
Suggested change

	// Sometimes VMSS's aren't fully deleted when the `Delete` call returns - as such we'll try to scale the cluster
	// to 0 nodes first, then delete the cluster - which should ensure there's no Network Interfaces kicking around
	// and work around this Azure API bug:
@@ -361,6 +361,13 @@ func resourceVirtualMachineScaleSetExtensionDelete(d *pluginsdk.ResourceData, me
		return err
	}

	// When rolling upgrades are setup, vmscalesets can't be deleted unless the upgrade is cancelled.
	// Since destroy function intention is to VMSS itself. Rolling upgrades are trivial here, hence we cancel before we trigger destroy call
	err = meta.(*clients.Client).Compute.CancelRollingUpgradesBeforeDeletion(ctx, id.ResourceGroup, id.VirtualMachineScaleSetName)
	if err != nil {
		return fmt.Errorf("error while cancelling rolling upgrade during destroy phase in Linux Virtual Machine Scale Set %q (Resource Group %q) : %+v", id.VirtualMachineScaleSetName, id.ResourceGroup, err)
	}
Comment on lines +366 to +369
Same here
Suggested change

	future, err := client.Delete(ctx, id.ResourceGroup, id.VirtualMachineScaleSetName, id.ExtensionName)
	if err != nil {
		return fmt.Errorf("deleting Extension %q (Virtual Machine Scale Set %q / Resource Group %q): %+v", id.ExtensionName, id.VirtualMachineScaleSetName, id.ResourceGroup, err)
@@ -1108,6 +1108,13 @@ func resourceWindowsVirtualMachineScaleSetDelete(d *pluginsdk.ResourceData, meta
		return fmt.Errorf("retrieving Windows Virtual Machine Scale Set %q (Resource Group %q): %+v", id.Name, id.ResourceGroup, err)
	}

	// When rolling upgrades are setup, vmscalesets can't be deleted unless the upgrade is cancelled.
	// Since destroy function intention is to VMSS itself. Rolling upgrades are trivial here, hence we cancel before we trigger destroy call
	err = meta.(*clients.Client).Compute.CancelRollingUpgradesBeforeDeletion(ctx, id.ResourceGroup, id.Name)
	if err != nil {
		return fmt.Errorf("error while cancelling rolling upgrade during destroy phase in Linux Virtual Machine Scale Set %q (Resource Group %q) : %+v", id.Name, id.ResourceGroup, err)
	}
Comment on lines +1113 to +1116
And here
Suggested change

	// Sometimes VMSS's aren't fully deleted when the `Delete` call returns - as such we'll try to scale the cluster
	// to 0 nodes first, then delete the cluster - which should ensure there's no Network Interfaces kicking around
	// and work around this Azure API bug:
Let's just pass in the VMSS resource ID here since that contains all the info we need
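As a rough illustration of passing a single ID value instead of separate resource group and name strings, the sketch below parses an ID of the form shown in the GET URL quoted in the diff and hands the parsed value to a callee. `scaleSetID`, `parseScaleSetID`, and `cancelMessage` are hypothetical names invented for this example; the provider's own ID type and parser are not shown on this page and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// scaleSetID is a hypothetical parsed form of a VMSS resource ID.
type scaleSetID struct {
	SubscriptionID string
	ResourceGroup  string
	Name           string
}

// parseScaleSetID splits an ID of the form
// /subscriptions/{subId}/resourceGroups/{rgName}/providers/Microsoft.Compute/virtualMachineScaleSets/{vmSSName}
// Segment keys are compared case-insensitively (an assumption of this sketch).
func parseScaleSetID(raw string) (scaleSetID, error) {
	parts := strings.Split(strings.Trim(raw, "/"), "/")
	if len(parts) != 8 ||
		!strings.EqualFold(parts[0], "subscriptions") ||
		!strings.EqualFold(parts[2], "resourceGroups") ||
		!strings.EqualFold(parts[4], "providers") ||
		!strings.EqualFold(parts[5], "Microsoft.Compute") ||
		!strings.EqualFold(parts[6], "virtualMachineScaleSets") {
		return scaleSetID{}, fmt.Errorf("unexpected Virtual Machine Scale Set ID %q", raw)
	}
	// Reject IDs whose value segments are empty.
	for _, i := range []int{1, 3, 7} {
		if parts[i] == "" {
			return scaleSetID{}, fmt.Errorf("empty segment in Virtual Machine Scale Set ID %q", raw)
		}
	}
	return scaleSetID{SubscriptionID: parts[1], ResourceGroup: parts[3], Name: parts[7]}, nil
}

// cancelMessage shows a callee that receives the whole ID rather than
// separate resource group and name arguments.
func cancelMessage(id scaleSetID) string {
	return fmt.Sprintf("cancelling rolling upgrades of Virtual Machine Scale Set %q (Resource Group %q)", id.Name, id.ResourceGroup)
}

func main() {
	id, err := parseScaleSetID("/subscriptions/sub-1/resourceGroups/rg-1/providers/Microsoft.Compute/virtualMachineScaleSets/vmss-1")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(cancelMessage(id))
}
```

Taking one ID argument keeps the three call sites identical and removes the chance of swapping the resource group and name strings.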