-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix AzureMachineTemplate roleAssignmentName validation #2672
Fix AzureMachineTemplate roleAssignmentName validation #2672
Conversation
|
Welcome @majimenez-stratio! |
Hi @majimenez-stratio. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
@@ -56,11 +56,31 @@ func (r *AzureMachineTemplate) ValidateCreate(ctx context.Context, obj runtime.O | |||
t := obj.(*AzureMachineTemplate) | |||
spec := t.Spec.Template.Spec | |||
|
|||
allErrs := ValidateAzureMachineSpec(spec) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we replacing the existing pattern? Is it because we don't want to validate system-assigned identity similarly for AzureMachine
and AzureMachineTemplate
resources?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not that familiar with the code, but it seems so since in AzureMachineTemplate
, an empty value for roleAssignmentName
is enforced. However, the validator for AzureMachine
is doing exactly the opposite since it tries to parse it as an UUID in:
if _, err := uuid.Parse(newIdentity); err != nil { |
An empty string is not valid, and as a result, the validation for
AzureMachineTemplate
can't pass if system-assigned identity is enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the validation for AzureMachineTemplate can't pass if system-assigned identity is enabled.
Can you please expand on why that's the case? The roleAssignmentName
should be set to "" for templates as it's a GUID value that needs to be unique (so it can't be shared between all machines of a template). CAPZ will generate it for you on each AzureMachine so you shouldn't need to set it.
To enable system-assigned identity, simply set identity
to SystemAssigned
on the template, no need to set a roleAssignmentName.
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureMachineTemplate
metadata:
name: ${CLUSTER_NAME}-md-0
namespace: default
spec:
template:
spec:
identity: SystemAssigned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll try to walk through the case step by step by analyzing the current code (without this PR).
- As you point out, for
AzureMachineTemplate
, theroleAssignmentName
should be left empty as it should be generated for eachAzureMachine
. In fact, it's validated as such here:
if spec.RoleAssignmentName != "" { | |
allErrs = append(allErrs, | |
field.Invalid(field.NewPath("AzureMachineTemplate", "spec", "template", "spec", "roleAssignmentName"), t, AzureMachineTemplateRoleAssignmentNameMsg), | |
) | |
} |
- In addition to that, a call is made to validate the template spec itself, using the validation code for
AzureMachine
:
allErrs := ValidateAzureMachineSpec(spec) |
- That call, in turn, calls
ValidateSystemAssignedIdentity
that tries to parse theroleAssignmentName
inside the template spec, and if left empty, the validation will fail here:
if _, err := uuid.Parse(newIdentity); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To add more information on the issue, given this example of AzureMachineTemplate
with SystemAssigned
identity and not specifying roleAssignmentName
:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AzureMachineTemplate
metadata:
name: testma-control-plane
namespace: default
spec:
template:
spec:
identity: SystemAssigned
dataDisks:
- diskSizeGB: 256
lun: 0
nameSuffix: etcddisk
osDisk:
diskSizeGB: 128
osType: Linux
vmSize: Standard_D2s_v5
This is the error I got when applying it to the management cluster:
Error from server (Invalid): error when creating "testma.yaml": admission webhook "validation.azuremachinetemplate.infrastructure.cluster.x-k8s.io" denied the request: AzureMachineTemplate.infrastructure.cluster.x-k8s.io "testma-control-plane" is invalid: roleAssignmentName: Invalid value: "": Role assignment name must be a valid GUID. It is optional and will be auto-generated when not specified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood, thank you for providing more details. I'm looking at #2111 which last changed this logic to see if this is a regression or if it never worked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay so it looks like this is not a recent regression, the commit above has the same issue. I was able to reproduce the validation error on my end.
I'm not a huge fan of stopping to use ValidateAzureMachineSpec
for validation altogether. The main issue of forking is that we risk forgetting to add new validations in both places as we now have duplicated code.
There are a few alternatives we could go with to fix the issue:
- don't make
ValidateSystemAssignedIdentity
fail when roleAssignmentName is""
. Pro: this is a quick and easy fix. Con: AzureMachines are required to have aroleAssignmentName
when identity isSystemAssigned
so this would no longer catch those issues. It's mitigated by the fact that our defaulting webhook always defaultsroleAssignmentName
when it's empty and identity type isSystemAssigned
. - extract
ValidateSystemAssignedIdentity
fromValidateAzureMachineSpec
add it separately to AzureMachines'ValidateCreate
, keep the logic inazuremachinetemplate_webhook.go
the same. This is equivalent to the change you are making now but reversed (opt-out the func that doesn't apply instead of explicit opt-in all the others)
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand it correctly I vote for #2. Let's optimize for code maintenance instead of webhook order-of-precedence tricks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I opted for this approach due to me not being familiar with the code, as it looked like the safest solution. However, I see why you don’t like it and I agree with either of your suggestions. I personally would take no 2 since the code will be easier to understand since it’s not dependent on the defaulter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed a new commit implementing approach no 2. Any suggestions?
/hold |
1b6d14c
to
e022e4c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/hold cancel
lgtm Could we add a unit test that validates the scenario this is fixing in https://github.com/kubernetes-sigs/cluster-api-provider-azure/blob/e022e4c2a50c97a7121e4054e8c2cb96a122b825/api/v1beta1/azuremachinetemplate_webhook_test.go? Thanks! |
New test added. Let me know if there's something else to do. |
Looks great, thank you @majimenez-stratio! Can you please just squash the commits? Thanks for finding and fixing this. I'm going to queue it up for cherry-pick. /cherry-pick release-1.4 |
@CecileRobertMichon: once the present PR merges, I will cherry-pick it on top of release-1.4 in a new PR and assign it to you. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
5990a23
to
311bf50
Compare
Done! |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: CecileRobertMichon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@CecileRobertMichon: new pull request created: #2690 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@CecileRobertMichon: new pull request created: #2691 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
What type of PR is this?
/kind bug
What this PR does / why we need it:
Validation of roleAssignmentName is broken for AzureMachineTemplate when using SystemAssigned identity. Validation fails if empty and it shouldn't.
Special notes for your reviewer:
N/A
TODOs:
Release note: