-
Notifications
You must be signed in to change notification settings - Fork 431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add oidcIssuerProfile to AzureManagedControlPlane #3973
Conversation
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #3973 +/- ##
==========================================
+ Coverage 56.26% 56.80% +0.54%
==========================================
Files 190 188 -2
Lines 19447 19242 -205
==========================================
- Hits 10941 10931 -10
+ Misses 7877 7677 -200
- Partials 629 634 +5
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice to have that! I hope it can be included in the incoming release.
/milestone v1.11 @maciaszczykm yes that's the plan |
type OIDCIssuerProfileStatus struct { | ||
// Enabled is whether the OIDC issuer is enabled. | ||
// +optional | ||
Enabled *bool `json:"enabled,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is interesting
It differs a bit from existing patterns we have for setting "read-only" fields (eg. kubelet identity). This is arguably better and more aligned with k8s pattern (spec is desired state and status is read-only actual state) but inconsistent with other fields we already have. I'm not sure about duplicating the enabled field in both spec and status, that strikes me as redundant.
What was the reasoning in including the status field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this was done in the spirit of #3855. ASO's ManagedCluster type is also structured the same way. One concrete case that having enabled
in the status helps with is the webhook validation to return an error when a user tries to disable the OIDC issuer, which is not allowed by AKS.
The field itself is passed through verbatim from CAPZ to the AKS API, where nil
means "don't change" and non-nil true
and false
set it explicitly. That means when enabled
is nil in the AzureManagedControlPlane spec, that by itself is not enough to know whether or not the OIDC issuer is actually enabled at that moment. Since AKS (always, AFAICT) populates enabled
in its API when the user sends nil
, propagating that to the status in AzureManagedControlPlane lets the webhook definitively know the actual state of enabled
when validating an update.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the advantage of letting the user set the spec
field to "nil"? Would the same enforcement be covered by a simpler immutable check (once it's set to true, cannot be unset or set to false)?
overall I'm +1 on making the IssuerURL
part of status but I don't love that we're designing a new pattern for enabled/disabled and immutability on the fly that is inconsistent with other API fields which currently have the exact same behavior from an AKS perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allowing nil
means users that have already set this on their clusters via other means won't have this change out from under them when CAPZ upgrades. I believe @mtougeron expressed this kind of concern a while ago around the autoscaler settings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a valid reason for doing it this way. However, if we're going to support that scenario ("users that have already set this on their clusters via other means") we should do this consistently across all fields we support (that's what ASO does) https://github.com/Azure/azure-service-operator/blob/main/v2/api/containerservice/v1api20230201/managed_cluster_types_gen.go#L10167
Would be worth doing a quick audit to see how many fields are not following this pattern currently but could be, and then having an issue to convert them to use the same pattern. It might be the case that we end up wrapping around the ASO types before we get to it but would still be good to understand where the gaps are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CecileRobertMichon Are you overall more +1 to leaving this field as a plain non-pointer bool in the spec and letting it default to false in all cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And if spec.oidcIssuerProfile
is null in the spec, should that always imply spec.oidcIssuerProfile.enabled
is false?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nojnhuh, no the danger with using a plain non-pointer is that the default could change and then we wouldn't know if a user is intentionally disabling it or not setting the field.
To summarize there are two options/patterns possible:
A) The spec field is an optional pointer bool and defaults the value to a sane default when nil. We don't support users setting things outside of the spec, the CAPZ spec is the only source of truth for the object. This is what we do for most fields if not all currently.
B) The spec field is an optional pointer bool, we don't default the value and let "nil" be a valid value to pass to the AKS API. We use a status field to track the "actual" value in Azure. This allows users to set this value outside of the CAPZ spec and not get overwritten, as long as AKS doesn't default/require a value. This is what this PR is doing currently, and what ASO does.
IMO both are valid. I tend to prefer consistency overall because that is less likely to lead users to make mistakes. This is not a hill I'm willing to die on (and I really want to see this PR get merged so I don't want to be standing in the way of that) but my personal preference would be to keep A) for now for consistency with the agreement that B) is what we're trying to evolve towards with a v2 of the API that wraps around ASO types (#3629).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my personal preference would be to keep A) for now for consistency with the agreement that B) is what we're trying to evolve towards with a v2 of the API that wraps around ASO types
Agree this seems like the best path for now. I'll update this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CecileRobertMichon This should work the way you described now. Can we still get this in for the v1.11 release?
Updated this so it should be more consistent with other fields. /hold for squash |
test/e2e/azure_test.go
Outdated
@@ -839,15 +839,6 @@ var _ = Describe("Workload cluster creation", func() { | |||
} | |||
}) | |||
}) | |||
|
|||
By("modifying OIDC issuer configuration", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't keep the "actual" state in the status, then we have to wait to add this test back until we can use SDK v2 clients in the tests.
/assign @CecileRobertMichon |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
LGTM label has been added. Git tree hash: e85602b848fd98ff2bf0047d57d54757826077e0
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: CecileRobertMichon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
||
func (m *AzureManagedControlPlane) setDefaultOIDCIssuerProfile() { | ||
if m.Spec.OIDCIssuerProfile == nil { | ||
m.Spec.OIDCIssuerProfile = &OIDCIssuerProfile{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this thread covers this: #3973 (review)
But, just to clarify, the reason we're doing it this way instead of the way that autoscaler does it above (if user doesn't declare a profile config we omit it from the spec entirely) is because this more explicit "is disabled" implementation removes edge cases from upgrade scenarios?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jackfrancis this comment summarizes it #3973 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think because autoscaler is a little different from other fields in that we want to send nil
over the wire to AKS when not defined in the CAPZ spec. OIDC issuer is more like other fields where we always send a non-nil value.
squashed! |
/lgtm |
#3970 |
@nojnhuh: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
What type of PR is this?
/kind feature
/area managedclusters
What this PR does / why we need it:
This PR exposes the AKS
oidcIssuerProfile
config from AzureManagedControlPlane atspec.oidcIssuerProfile
. It also addsstatus.oidcIssuerProfile
which includes the current actual state ofenabled
and theissuerURL
.Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #2498
Special notes for your reviewer:
TODOs:
Release note: