-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make VM extension reconcile async and move VMSS extension into scaleset service #2177
Conversation
/assign @CecileRobertMichon |
d51ffa9
to
b489ad5
Compare
b489ad5
to
179e48f
Compare
f1281fe
to
9891e80
Compare
Settings: nil, | ||
ProtectedSettings: s.ProtectedSettings, | ||
}, | ||
// TODO: should we include location since it's used in VMExtensions too? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why one has location and not the other, is location required for VM extensions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code in VMSS created the parameters w/o a location like this while the VM extensions included it. I kept the original behavior since it's a refactor but I wasn't sure if it was a bug.
extensions[i] = compute.VirtualMachineScaleSetExtension{
Name: &extensionSpec.Name,
VirtualMachineScaleSetExtensionProperties: &compute.VirtualMachineScaleSetExtensionProperties{
Publisher: to.StringPtr(extensionSpec.Publisher),
Type: to.StringPtr(extensionSpec.Name),
TypeHandlerVersion: to.StringPtr(extensionSpec.Version),
Settings: nil,
ProtectedSettings: extensionSpec.ProtectedSettings,
},
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure either. You could try 1) removing Location from VM extensions 2) adding Location to VMSS extensions, and see if either breaks?
I agree it doesn't seem right that they aren't consistent, but it's possible that this is due to the Azure API between VM and VMSS not being consistent (if it's not, let's make it consistent).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I can give that a try. Are VM/VMSS extensions covered in the e2e tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are VM/VMSS extensions covered in the e2e tests?
yes they are, since we create an extension for every single VM and every single VMSS to check kubeadm completion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from your other comment sounds like there is a discrepancy in the API and we can just get rid of this comment?
|
||
// VMSSExtensionSpec defines the specification for a VM or VMScaleSet extension. | ||
type VMSSExtensionSpec struct { | ||
Name string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of duplicating fields here and in VMExtensionSpec (since they are the same) consider nesting the Azure extension spec in both
type VMSSExtensionSpec struct {
azure.ExtensionSpec
}
azure/types.go
Outdated
// ExtensionSpec defines the specification for a VM or VMScaleSet extension. | ||
type ExtensionSpec struct { | ||
// BootstrapingExtensionSpec defines the specification for a CAPZ Bootstrapping VM or VMSS extension. | ||
type BootstrapingExtensionSpec struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the rename? the fields here could be reused later on for any other type of extension, nothing in here is bootstrapping specific AFAIK so I think the generic naming is appropriate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed it cause that struct is populated only by the GetBootstrappingVMExtension()
function here and I thought it would be more clear where it's coming from. Additionally, I wanted to make it clear that that type isn't coming from the services but from types.go and defaults.go. But if we want to use it more generically then we can change it back.
cluster-api-provider-azure/azure/defaults.go
Line 361 in 71a01a4
func GetBootstrappingVMExtension(osType string, cloud string, vmName string) *ExtensionSpec { |
azure/types.go
Outdated
@@ -96,7 +96,7 @@ type PrivateDNSLinkSpec struct { | |||
LinkName string | |||
} | |||
|
|||
// ExtensionSpec defines the specification for a VM or VMScaleSet extension. | |||
// ExtensionSpec defines the specification for a CAPZ Bootstrapping VM or VMSS extension. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: revert the comment about CAPZ Bootstrapping (VMSS change is good to keep)
azure/services/vmextensions/spec.go
Outdated
|
||
// VMExtensionSpec defines the specification for a VM or VMScaleSet extension. | ||
type VMExtensionSpec struct { | ||
Spec azure.ExtensionSpec |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could just be:
Spec azure.ExtensionSpec | |
azure.ExtensionSpec |
so you can refer to fields directly, for example VMExtensionSpec.Publisher
instead of VMExtensionSpec.Spec.Publisher
@CecileRobertMichon It looks like the reason we don't have a location in |
6dac4e3
to
05653ab
Compare
// check the extension status and set the associated conditions. | ||
if retErr := s.Scope.SetBootstrapConditions(ctx, to.String(existing.ProvisioningState), extensionSpec.Name); retErr != nil { | ||
if retErr := s.Scope.SetBootstrapConditions(ctx, to.String(vmextension.ProvisioningState), extensionSpec.ResourceName()); retErr != nil { | ||
// TODO: what precedence should this error have? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SetBootstrapConditions
might need a bit of refactoring. Instead of taking the error and using that to set the condition like we're doing for other resources, it's taking the provisioning state of the extension and using that to set the condition, then returning an error if the extension is done creating. Can we change it to always run (even if resultErr is not nil) and use the error to determine what the condition should be set to instead? Similar to what we're doing in UpdatePutStatus
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looking good, there are a few outstanding TODOs that need to be resolved before this can merge and looks like you need to rebase due to a conflict.
e309bbb
to
5d34df0
Compare
5d34df0
to
0f35de5
Compare
s.Scope.DeleteLongRunningOperationState(s.Scope.ScaleSetSpec().Name, serviceName) | ||
// This also means that the VMSS extensions were successfully installed | ||
s.Scope.UpdatePutStatus(infrav1.BootstrapSucceededCondition, serviceName, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CecileRobertMichon I went ahead and added the UpdatePutStatus() call in the scaleset service. However, I'm not sure how we want to set the error message for VMSSExtensions. Do we want to just set it with whatever error comes from s.createVMSS(ctx)
, s.patchVMSSIfNeeded(ctx, fetchedVMSS)
, and s.getVirtualMachineScaleSetIfDone(ctx, future)
? Or should we try to set specific errors caused by the VMSSExtension (and not necessarily the scaleset as a whole)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question... looks like right now we're not setting conditions at all for the scale set itself because it hasn't yet been refactored (it was implemented as async before the async proposal landed so it's not fully following the same patterns). I would say let's leave this for now and add a note to handle both VMSS and VMSS extension conditions when an error occurs.
@@ -43,7 +43,7 @@ const ( | |||
// WaitingForBootstrapDataReason used when machine is waiting for bootstrap data to be ready before proceeding. | |||
WaitingForBootstrapDataReason = "WaitingForBootstrapData" | |||
// BootstrapSucceededCondition reports the result of the execution of the boostrap data on the machine. | |||
BootstrapSucceededCondition = "BoostrapSucceeded" | |||
BootstrapSucceededCondition clusterv1.ConditionType = "BootstrapSucceeded" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
s.Scope.DeleteLongRunningOperationState(s.Scope.ScaleSetSpec().Name, serviceName) | ||
// This also means that the VMSS extensions were successfully installed | ||
s.Scope.UpdatePutStatus(infrav1.BootstrapSucceededCondition, serviceName, nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good question... looks like right now we're not setting conditions at all for the scale set itself because it hasn't yet been refactored (it was implemented as async before the async proposal landed so it's not fully following the same patterns). I would say let's leave this for now and add a note to handle both VMSS and VMSS extension conditions when an error occurs.
Settings: nil, | ||
ProtectedSettings: s.ProtectedSettings, | ||
}, | ||
// TODO: should we include location since it's used in VMExtensions too? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove the TODO since you've determined location is not included in the API?
build is failing with
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/assign @mboersma
Thanks @Jont828! |
faaed62
to
bf7e23e
Compare
bf7e23e
to
d7ad4a8
Compare
@Jont828: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@CecileRobertMichon Squashed and pushed, should be good to merge! |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: CecileRobertMichon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind feature
What this PR does / why we need it: Implementation of an async service for VM extensions as a follow up for #1610 and #1541.
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #2141
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
TODOs:
Release note: