Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce validations through CEL for Gateway and GatewayClass. #2226

Merged
merged 3 commits into from
Jul 27, 2023

Conversation

gauravkghildiyal
Copy link
Member

What type of PR is this?

/kind feature

What this PR does / why we need it:
Add support for CEL based validation for Gateway and GatewayClass. This allows us to have more complex validations than those supported by the OpenAPIv3 spec without the need of an additional validation webhook. Upcoming PRs will gradually extend this to HTTPRoutes (and other remaining resources) as well.

References:

Notes for the reviewers:

  • The tests included in this PR perform error string comparisons. I understand that this may not be the most ideal choice but it helps establish some confidence that the validations are in fact working correctly.
  • Tests aren't currently being run parallely for easier debugging when reading the test output. The tests itself barely take anytime so we may not have the need to run them parallely at the moment.

Which issue(s) this PR fixes:

Fixes #

Does this PR introduce a user-facing change?:

Introduce validations through CEL for Gateway and GatewayClass.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 24, 2023
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jul 24, 2023
@@ -78,6 +78,9 @@ spec:
minLength: 1
pattern: ^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*\/[A-Za-z0-9\/\-._~%!$&'()*+,;=:]+$
type: string
x-kubernetes-validations:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this stuff only on by default in k8s 1.25+? Is that the new min version, or is this silently ignored on old versions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gateway API already promises to support the trailing 5 k8s versions, so I believe we'd need to release a second set of CRDs for k8s <=1.24 as long as we support those versions. Based on the k8s release cycle, I think that's going to be at least another ~6 months.

It is annoying to maintain yet another set of CRDs, but on the other hand, I think a significant amount of usage is moving towards 1.25+, and there's a chance we may be able to entirely remove the webhook on those versions in favor of CEL, which would be a significant win.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we could get away with saying v1.25 is the minimum supported K8s version for GA.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before we commit to double CRDs, can someone confirm its not gracefully handled on old versions? I assume not, but worth checking

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for bringing this up!

Yes so I did check things with 1.23 and I didn't find any problem. To be specific, I was able to create these CRDs which have this new x-kubernetes-validations field. It got ignored.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shaneutt at a minimum I'd like to make CEL the default installation method and then add a note like "if you're running Kubernetes 1.24 or older, installing the webhook is recommended as well".

@gauravkghildiyal that's great that we might be able to continue to use the same sets of CRDs for this!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chatted with @jpbetz, it looks like CRDs with x-kubernetes-validations will go through starting in 1.23 thanks to kubernetes/kubernetes#108030. The actual validation will be limited to 1.25+ (unless clusters have had CEL manually enabled earlier).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that means that once 1.28 is out, we will be both mandating an minimum of 1.23 for the CRDs that include x-kubernetes-validations and still in compliance with our "5 versions" support policy.

Including another set of CRDs for older versions seems like it would be more work that it's worth then.

@shaneutt
Copy link
Member

(just a note that this has some implications on #1514)

@gauravkghildiyal gauravkghildiyal force-pushed the cel-validation branch 2 times, most recently from 714d560 to a1bd3fc Compare July 26, 2023 03:22
Copy link
Contributor

@youngnick youngnick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with some small changes to remove "should" in validation messages (to me, failing validation implies "must" :) )

apis/v1beta1/gateway_types.go Outdated Show resolved Hide resolved
apis/v1beta1/gateway_types.go Outdated Show resolved Hide resolved
Copy link
Member

@robscott robscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gauravkghildiyal!

/approve

Comment on lines 17 to 18
# TODO(gauravkghildiyal): Make this a "verify" script (so that it runs along
# with the verification presubmit) once we are confident about these tests.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gauravkghildiyal Can you make an issue to track this so we don't lose track of it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I created #2239 to track

@@ -0,0 +1,470 @@
/*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tiny nit: I'm not really sure where this should live, but I'd rather avoid another top-level directory. I agree that we should probably keep it away from the other validation tests in /apis since we probably want to try to make /apis as targeted/focused as possible.

Some other options:

  1. /hack/cel-validation: Most of our presubmits/things resembling tests live in /hack right now, which doesn't feel perfect, but it would at least live alongside the rest of the presubmit logic.
  2. /pkg/test/cel: I think it's likely that we'll have additional go tests over time, this feels like it would be a fairly natural place for them to live.

At this point I think I'm leaning towards 2 here, but open to alternatives.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is a great point.

I see that the PR has been updated already with moving everything into /hack, but tbh I think that it's better if we put the Go code in /pkg/test/cel like @robscott suggested.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh definitely this should be changed. I was hoping to get feedback regarding this so thanks a lot for the suggestion.

I've choosen option /hack/cel-validation. (Because we have unit tests within pkg and it wasn't very neat since we run our unit tests using go test ./pkg/... which would then also run our e2e tests. This further suggests that perhaps that was not the right choice)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that's also a good point - this is good enough for now, and can be changed later. Thanks.

Copy link
Member Author

@gauravkghildiyal gauravkghildiyal Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops I noticed our comments collided there :)
I kinda totally get why hack/cel-validation may not seem ideal, but yeah we could definitely change it later

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 26, 2023
Copy link
Member

@shaneutt shaneutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's great that we're adding this, appreciate the hard work on this.

I'm pretty much ready to say let's go, as things look generally good:

/approve

Just to be clear, we've manually tested this as well right? I know we've added some automated tests, but we've performed some manual deployments of CRDs and exercised things a bit on an implementation using those CRDs, e.t.c.?

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gauravkghildiyal, robscott, shaneutt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gauravkghildiyal
Copy link
Member Author

Just to be clear, we've manually tested this as well right? I know we've added some automated tests, but we've performed some manual deployments of CRDs and exercised things a bit on an implementation using those CRDs, e.t.c.?

That's a very valid concern and I completely share the sentiment. Yes I have manually tested it to an extent. But yes this could be critical and all the testing we could get here is welcomed. (I'll continue testing on the side whenever I can and quickly revert back if something seems off)

@youngnick
Copy link
Contributor

I'm happy to LGTM this, but I'll also hold so that we can decide if we want to merge now or after v0.8.0.

/lgtm
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 27, 2023
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 27, 2023
Copy link
Member

@robscott robscott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gauravkghildiyal! A couple more things and then I think we're good to merge this.

wantErrors: []string{"certificateRefs must be set and not empty when TLSModeType is Terminate"},
},
{
desc: "certificatedRefs set with tls protocol and TLS terminate mode",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
desc: "certificatedRefs set with tls protocol and TLS terminate mode",
desc: "certificateRefs set with tls protocol and TLS terminate mode",

Comment on lines +402 to +405
// TODO(gauravkghildiyal): Figure out a sensible way to check
// validity of IP addresses. Admission webhook uses golang IP
// parsing which may not be directly translateable to regex matching
// in CEL. At the moment, this value will not result in an error.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should resolve this prior to merging + add more valid and invalid test cases around this. As far as I can tell, our only option here is regex, this one appears to work well, but open to alternatives: https://regex101.com/r/Kzijkp/1.

@robscott robscott mentioned this pull request Jul 27, 2023
@robscott
Copy link
Member

Chatted with @shaneutt and I think we're good to merge this. @gauravkghildiyal's pretty busy right now so I'm going to file a follow up PR shortly that addresses the last bits of feedback I have.

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 27, 2023
@k8s-ci-robot k8s-ci-robot merged commit 3ae9fa7 into kubernetes-sigs:main Jul 27, 2023
@gauravkghildiyal gauravkghildiyal deleted the cel-validation branch July 30, 2023 02:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants