added support for system node pools in managedClusters #1475
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Hi @LochanRn. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@@ -42,6 +42,13 @@ spec:
      description: AzureManagedMachinePoolSpec defines the desired state of
        AzureManagedMachinePool.
      properties:
        mode:
          description: 'Mode - represents mode of an agent pool. Possible values
            include: ''System'', ''User'''
nit: the quotes came out a bit weird here. maybe just drop the single quotes around the values themselves? or can we escape them better?
ack
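For reference, the fix the thread converges on is simply dropping the quotes in the Go doc comment, so the generated CRD description comes out without the doubled ''System'' escaping. A sketch (matching the field shown later in this diff):

```go
// Mode - represents mode of an agent pool. Possible values include: System, User.
// +kubebuilder:validation:Enum=System;User
Mode string `json:"mode"`
```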
	r.Spec.OSDiskSizeGB,
	"field is immutable, unsetting is not allowed"))
} else if *r.Spec.OSDiskSizeGB != *old.Spec.OSDiskSizeGB {
	// changing the field is not allowed
I think os disk size is mutable as long as you use managed disks, but I could be mistaken.
will check and get back on this.
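For context, a hedged sketch of the immutability check being discussed; this is a hypothetical helper extracted from the webhook snippet above, and the field path and error wiring are assumptions. If OS disk size turns out to be mutable with managed disks, this check would simply be relaxed or dropped.

```go
import "k8s.io/apimachinery/pkg/util/validation/field"

// validateOSDiskSizeGBImmutable rejects both unsetting and changing the value,
// mirroring the "field is immutable" errors in the diff above.
func validateOSDiskSizeGBImmutable(old, updated *int32, fldPath *field.Path) *field.Error {
	switch {
	case old == nil:
		// the field was never set, so there is nothing to protect
		return nil
	case updated == nil:
		return field.Invalid(fldPath, updated, "field is immutable, unsetting is not allowed")
	case *updated != *old:
		// changing the field is not allowed
		return field.Invalid(fldPath, *updated, "field is immutable")
	default:
		return nil
	}
}
```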
opt1 := client.InNamespace(r.Namespace)
opt2 := client.MatchingLabels(map[string]string{
	clusterv1.ClusterLabelName: clusterName,
	LabelAgentPoolMode:         SystemNodePool,
This is clever, but I'm curious how it works for a cluster in transition? Might need to add the label inside the reconciler for AMMPs created before this change went in, to ensure they get it applied.
I am not very clear on what the doubt is here.
The label is embedded into the AMMP by the mutating webhook, so I guess that should suffice and not require us to add labels in the reconciler.
AMMPs created before this commit and not updated afterwards will not hit the webhook, but they will still be reconciled. For the webhook to work correctly, they all need to have the label, so it likely should be added inside the reconciler.
I guess there were no system pools before this commit, so that’s not really possible 😅 it’s fine as is, you are right
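A rough sketch of the lookup being discussed (assuming a controller-runtime client and the AzureManagedMachinePoolList type; the function name and placement are illustrative): the delete webhook lists AMMPs by the agent-pool-mode label to decide whether the pool being removed is the last system pool.

```go
package v1alpha4 // illustrative placement; actual package and API versions may differ

import (
	"context"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1alpha4"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// remainingSystemPools is a hypothetical helper: it counts the AMMPs in the
// cluster that carry the System agent-pool-mode label, which is what the
// delete webhook needs in order to refuse removing the last system pool.
func remainingSystemPools(ctx context.Context, c client.Client, namespace, clusterName string) (int, error) {
	opts := []client.ListOption{
		client.InNamespace(namespace),
		client.MatchingLabels(map[string]string{
			clusterv1.ClusterLabelName: clusterName,
			LabelAgentPoolMode:         SystemNodePool,
		}),
	}
	var ammps AzureManagedMachinePoolList
	if err := c.List(ctx, &ammps, opts...); err != nil {
		return 0, err
	}
	return len(ammps.Items), nil
}
```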
/ok-to-test

isSystemNodePool := ammp.Spec.Mode == infrav1exp.SystemNodePool

if groupMatches && kindMatches && isSystemNodePool {
hmm...I'm not 100% sure I follow the thinking here. when this was the default pool, we reconciled it as part of the managedcluster itself (actually the managed service still tries to do this for all pools, maybe a TODO there). we'd actually double reconcile it I think, once as part of MC and once as part of AP.
if we modified the last system pool however, we don't want to reconcile the cluster, we want to reconcile just that pool. if we accidentally race and try to delete the last system pool before creating a new one, just let the retry happen on 400 (return err from reconcile)
we can probably also make the managedclusters client avoid patching agentpools after create, to cleanly separate that
MachinePoolToAzureManagedControlPlaneMapFunc was added by you in #1397.
This is the description and reasoning from that PR:
If the MachinePool owning the defaultPool of an AzureManagedControlPlane is nil when we try to reconcile, we need to re-reconcile. We can either return an error or watch machinePools and filter them down to the one owning the defaultPool (more accurate, but requires watching and filtering all machinePool events). I implemented the latter.
So this is used in AMCP and not in MC.
I found your reasoning sound, hence I thought of working along the same lines for system node pools, i.e. if the owner MachinePool of a system node pool is nil we try to re-reconcile, as it is more accurate.
But I guess I made a mistake at line 265 in azuremanagedcontrolplane_reconciler: I should return nil there.
Yeah, I see. This makes sense, but then we should definitely make sure AMCP reconciler doesn't send all the AMMPs to AKS on updates, only update the control plane config. We don't want AMCP and AMMP reconcilers to compete to reconcile node pools except that we let AMCP create the very first one as part of the cluster.
Yeah, makes sense. I think we have a check for this at line 216 of azuremanagedcontrolplane.go. In AMCP we reconcile AMMPs only during creation.
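A hedged sketch of the mapping under discussion, modeled loosely on MachinePoolToAzureManagedControlPlaneMapFunc from #1397. The import aliases, API versions, and the way the AMCP name is resolved are all assumptions, not the PR's actual code.

```go
package controllers // illustrative

import (
	"context"

	"github.com/go-logr/logr"
	"k8s.io/apimachinery/pkg/types"
	infrav1exp "sigs.k8s.io/cluster-api-provider-azure/exp/api/v1alpha4"
	expv1 "sigs.k8s.io/cluster-api/exp/api/v1alpha4"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// machinePoolToSystemPoolAMCPMapper is a hypothetical sketch of the mapper:
// it turns MachinePool events into AMCP reconcile requests, but only when the
// referenced infra pool is a system node pool, so user-pool events do not
// wake the AMCP reconciler.
func machinePoolToSystemPoolAMCPMapper(c client.Client, log logr.Logger) func(o client.Object) []reconcile.Request {
	return func(o client.Object) []reconcile.Request {
		mp, ok := o.(*expv1.MachinePool)
		if !ok {
			return nil
		}

		// Only MachinePools that reference an AzureManagedMachinePool matter here.
		ref := mp.Spec.Template.Spec.InfrastructureRef
		if ref.Kind != "AzureManagedMachinePool" {
			return nil
		}

		ammp := &infrav1exp.AzureManagedMachinePool{}
		if err := c.Get(context.Background(), client.ObjectKey{Namespace: mp.Namespace, Name: ref.Name}, ammp); err != nil {
			log.V(4).Info("failed to fetch AzureManagedMachinePool", "error", err)
			return nil
		}

		// Drop user pools; only system pool events trigger an AMCP reconcile.
		if ammp.Spec.Mode != infrav1exp.SystemNodePool {
			return nil
		}

		// Simplification: assume the AMCP shares the cluster's name; real code
		// would resolve the owner Cluster's controlPlaneRef instead.
		return []reconcile.Request{{
			NamespacedName: types.NamespacedName{Namespace: mp.Namespace, Name: mp.Spec.ClusterName},
		}}
	}
}
```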
// Add to cluster spec
managedClusterSpec.AgentPools = []managedclusters.PoolSpec{defaultPoolSpec}
managedClusterSpec.AgentPools = ammps
I know this is what we agreed to in #1376, WDYT about picking a random system pool instead and letting the AMMP reconciler parallelize creation of all the individual pools? I believe this would be faster since we need control plane + 1 pool, and then reconcile all user + system pools in parallel via AKS Agentpool API?
Correct, that was the thought process.
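A minimal sketch of that suggestion (hypothetical variable names; not the code in this PR): send a single system pool with the initial managed cluster create and leave the rest to the per-pool reconciler.

```go
// AKS only needs one system pool to create the cluster; the remaining system
// and user pools can then be created in parallel via the agent pools API.
if len(systemPools) == 0 {
	return errors.New("at least one system node pool is required to create an AKS cluster")
}
managedClusterSpec.AgentPools = []managedclusters.PoolSpec{systemPools[0]}
```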
Overall approach looks good, a few small comments. I still need to try this out locally. It looks like the generated manifests are out of date and the e2e tests need a few updates.
	ammps = append(ammps, ammp)
}
if len(ammps) == 0 {
	return errors.New("owner ref for system machine pools not ready")
should return nil instead of error.
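A sketch of the suggested change (logger and surrounding variables assumed): returning nil lets the MachinePool watch requeue the reconcile once the owner references appear, instead of cycling through error backoff.

```go
// Owner refs not resolved yet: skip quietly and let the watch re-trigger
// reconciliation, rather than returning an error.
if len(ammps) == 0 {
	log.Info("owner references for system machine pools are not ready yet, skipping")
	return nil
}
managedClusterSpec.AgentPools = ammps
```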
}

if err := webhookClient.Get(ctx, key, ownerCluster); err != nil {
	if !azure.ResourceNotFound(err) {
this should be azure.ResourceNotFound(err)
/test pull-cluster-api-provider-azure-e2e-windows
/test pull-cluster-api-provider-azure-e2e
1 similar comment
/test pull-cluster-api-provider-azure-e2e
You can follow the instructions at https://www.kubernetes.dev/docs/guide/github-workflow/#squash-commits to rebase and squash
@@ -115,7 +115,7 @@ variables:
  intervals:
    default/wait-controllers: ["3m", "10s"]
    default/wait-cluster: ["20m", "10s"]
    default/wait-control-plane: ["20m", "10s"]
    default/wait-control-plane: ["35m", "10s"]
I don't think we want to change this value for every test. Was it to address #1481? If so, we should fix it separately and/or add a specific AKS timeout interval.
I have noticed several times that it takes up to 30m for an AKS cluster to come up fully, so I kept a 5m buffer. 20m is not enough.
Were there any other changes needed to make e2e pass? I noticed deletion and the full test sometimes take more than 20 min, but the cluster initialization should really take less time. It's fine if we want to debug it elsewhere, but let's be cognizant of that.
I'd probably prefer adding a separate interval for AKS if we need to increase it so high, like Cecile mentioned.
Nvm, answered my own question in the other issue/pr
let’s revert the change here, I’ll fix the timeout
No, I did not make any other changes. Yes, the cluster gets initialised quickly, but it takes time for the first managed machine pool (the first system machine pool) to reach a ready state, roughly 25-30 minutes. I have not timed the deletion exactly, but from boot until the first system pool comes up it is approximately 30 minutes.
I debugged this extensively in #1488, there's a lag in Azure List VMSS which causes AMMP reconciliation to take way longer than expected. Both pools and the cluster should generally be done in ~10 minutes. I have a workaround and I'm also chasing it down internally with the owning team.
There's also a small issue with watches where if we reconcile AMMP with a stale AMCP in the cache, we won't requeue the 2nd machinepool for a long time (because our mapper only maps to system pools).
Thanks
@@ -19,7 +19,7 @@ package agentpools
import (
	"context"

	"github.com/Azure/azure-sdk-for-go/services/containerservice/mgmt/2020-02-01/containerservice"
	"github.com/Azure/azure-sdk-for-go/services/containerservice/mgmt/2021-03-01/containerservice"
Any chance I can persuade you to upgrade to 2021-05-01? 😃 The diff should be basically nothing. If there's any non-trivial diff, you can leave it at 2021-03-01.
ack 👍
The one minor difference while moving from 2021-03-01 to 2021-05-01 is that the ListClusterAdminCredentials API has an extra param, serverFqdn, in 2021-05-01. I have kept it as an empty string.
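For reference, a hedged sketch of what that call looks like against the 2021-05-01 SDK (the wrapper function, resource group, and error handling are illustrative): the new serverFqdn argument is passed as an empty string to preserve the previous behaviour.

```go
package managedclusters // illustrative

import (
	"context"
	"errors"

	"github.com/Azure/azure-sdk-for-go/services/containerservice/mgmt/2021-05-01/containerservice"
)

// getAdminKubeconfig is a hypothetical wrapper showing the extra serverFqdn
// parameter added in 2021-05-01; "" keeps the 2021-03-01 behaviour.
func getAdminKubeconfig(ctx context.Context, mcClient containerservice.ManagedClustersClient, resourceGroup, name string) ([]byte, error) {
	creds, err := mcClient.ListClusterAdminCredentials(ctx, resourceGroup, name, "")
	if err != nil {
		return nil, err
	}
	if creds.Kubeconfigs == nil || len(*creds.Kubeconfigs) == 0 {
		return nil, errors.New("no admin kubeconfig returned for managed cluster")
	}
	return *(*creds.Kubeconfigs)[0].Value, nil
}
```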
creating an AzureManagedControlPlane requires defining the default
machine pool, since AKS requires at least one system pool at creation
time.
creating atleast one AzureManagedMachinePool with Spec.Mode System,
creating atleast one AzureManagedMachinePool with Spec.Mode System,
creating at least one AzureManagedMachinePool with Spec.Mode System,
ack 👍
@@ -23,6 +23,10 @@ import (

// AzureManagedMachinePoolSpec defines the desired state of AzureManagedMachinePool.
type AzureManagedMachinePoolSpec struct {
	// Mode - represents mode of an agent pool. Possible values include: System, User.
	// +kubebuilder:validation:Enum=System;User
	Mode string `json:"mode"`
Should we default this, or require it? I'm fine as-is but thought I'd raise it.
I am okay with anything
Have we decided on this? Should I add a default instead of making it required?
Changing from required => optional is non-breaking, let’s leave it required for now 👍
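For completeness, a sketch of the two options discussed (kubebuilder markers only; the PR keeps the field required, and the alternative is shown commented out as a hypothetical):

```go
// Option kept by the PR: required field, validated against the enum.
// +kubebuilder:validation:Enum=System;User
Mode string `json:"mode"`

// Hypothetical alternative (not adopted): optional with a default, which the
// thread notes could still be introduced later without breaking anything.
// +kubebuilder:validation:Enum=System;User
// +kubebuilder:default=User
// +optional
// Mode string `json:"mode,omitempty"`
```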
@@ -21,8 +21,21 @@ import (
	capierrors "sigs.k8s.io/cluster-api/errors"
)

const (
	// LabelAgentPoolMode represents mode of an agent pool. Possible values include: System, User.
	LabelAgentPoolMode = "azurecluster.infrastructure.cluster.x-k8s.io/agentpoolmode"
"azurecluster" ? maybe "azuremanagedmachinepool"? or at least azuremanagedcontrolplane?
Will change it to azuremanagedmachinepool.
	LabelAgentPoolMode = "azurecluster.infrastructure.cluster.x-k8s.io/agentpoolmode"

	// SystemNodePool represents mode system for azuremachinepool.
	SystemNodePool = "System"
nit: you can probably just use the SDK values AgentPoolMode{System,User}, or else I'd prefer naming slightly closer to that (SystemNodePool is a bit too generic imo and could easily be a var name used elsewhere)
Actually, I think keeping the label definition here is better. Maybe Label{System,User}NodePool? Not super picky on this one, tbh.
ack 👍
Will this be fine?

type NodePool string

const (
	// AgentPoolModeSystem represents mode system for azuremachinepool.
	AgentPoolModeSystem NodePool = "System"
)
I’d probably do:

type NodePoolMode string

const (
	NodePoolModeSystem …
	NodePoolModeUser …
)

or something to align the names a bit more closely
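Putting the naming suggestions together, a sketch of what the constants could end up looking like (illustrative only; the label key follows the rename to azuremanagedmachinepool agreed above):

```go
// NodePoolMode is the mode of an agent pool as exposed on AzureManagedMachinePool.
type NodePoolMode string

const (
	// NodePoolModeSystem represents an AKS system node pool.
	NodePoolModeSystem NodePoolMode = "System"
	// NodePoolModeUser represents an AKS user node pool.
	NodePoolModeUser NodePoolMode = "User"

	// LabelAgentPoolMode records the agent pool mode on the AMMP object.
	LabelAgentPoolMode = "azuremanagedmachinepool.infrastructure.cluster.x-k8s.io/agentpoolmode"
)
```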
	Cluster:          cluster,
	ControlPlane:     azureControlPlane,
	MachinePool:      ownerPool,
	InfraMachinePool: defaultPool,
I think we can remove InfraMachinePool from ManagedControlPlaneScope now?
you've done it here, but I think it can be removed from the struct definitions as well
I did remove it initially, but then had to add it back as the AMMP controller also uses ManagedControlPlaneScope; the InfraMachinePool is required there.
@LochanRn: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@LochanRn: The following tests failed.
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Unable to rebase correctly, closing this PR. Will open a fresh PR.
What type of PR is this?
/kind bug
What this PR does / why we need it:
System node pools will be of mode System and not User.
Added mode in the AzureManagedMachinePool API to set the mode of Azure agent pools.
Removed defaultPoolRef from the AzureManagedControlPlane API (provides more flexibility to manage system node pools).
Validation for deletion of the last system node pool is handled in the webhook.
Bumped the containerservice SDK from 2020-02-01 to 2021-03-01.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1376 #1416
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
TODOs:
Release note: