
Support validating admission webhooks #1217

Closed
raffaelespazzoli opened this issue Mar 17, 2019 · 29 comments
Labels: kind/feature · kubebuilder-integration · lifecycle/frozen

Comments

raffaelespazzoli (Contributor) commented Mar 17, 2019

Feature Request

Some CRs need validation that goes beyond the syntactic validation provided by the OpenAPI schema fields.
Please add a facility to the operator-sdk to manage this situation. From a developer perspective, a stub validation method should be generated and the developer fills it in. If the CR does not pass validation, an error (or false) is returned. Generating this method should be optional for the operator-sdk generate controller command.

Is your feature request related to a problem? Please describe.
As I said, some CRs require semantic validation; as of now, every operator developer has to come up with their own solution.

Describe the solution you'd like
I believe there are multiple ways to implement this, but perhaps the most Kubernetes-native would be via a validating admission webhook. The generated operator could provide an HTTP endpoint to listen for these webhook calls. Regardless, the user experience should be the one defined above.

@joelanford added the kind/feature label Mar 20, 2019
@hasbro17 (Contributor)

This has been discussed in prior issues (#1038, #269), but we can consolidate the feature request into this issue.

Validation should be done via controller-runtime's webhook server and webhook handler APIs.
Examples:
https://github.com/kubernetes-sigs/controller-runtime/blob/5e3a6da770e2f16f8ce9b0905eaa74974fe7a4c0/examples/builtins/main.go#L79-L88
https://github.com/kubernetes-sigs/controller-runtime/blob/5e3a6da770e2f16f8ce9b0905eaa74974fe7a4c0/examples/builtins/validatingwebhook.go

On the SDK side, we just need to scaffold the handlers, document the workflow, and provide some examples to make the user experience easier.

@hasbro17 changed the title from "add support for semantic validation of the CR" to "Support validating admission webhooks" Mar 25, 2019
secat (Contributor) commented Apr 16, 2019

As a reference implementation the kubebuilder SDK has scaffolding for generating mutating/validating webhooks based on the controller-runtime's webhook server and webhook handler APIs.

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label Jul 15, 2019
@hasbro17 (Contributor)

/remove-lifecycle stale

@openshift-ci-robot removed the lifecycle/stale label Jul 16, 2019
@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label Oct 14, 2019
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Nov 13, 2019
JAORMX (Contributor) commented Dec 14, 2019

Is this scheduled to be done at some point? It sure would be great! 😄

@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@hasbro17 (Contributor)

/lifecycle frozen

@openshift-ci-robot added the lifecycle/frozen label and removed the lifecycle/rotten label Jan 14, 2020
@hasbro17 (Contributor)

@JAORMX Yes this is still on the roadmap, hopefully soon.
We've held off on adding this to the SDK separately since we're envisioning this will come as part of the overall kubebuilder integration.
https://github.com/operator-framework/operator-sdk/blob/master/doc/proposals/openshift-4.4/kubebuilder-integration.md

Jamstah (Contributor) commented Jan 20, 2020

New link for above: https://github.com/operator-framework/operator-sdk/blob/master/doc/proposals/kubebuilder-integration.md

wreed4 commented Feb 9, 2020

The process seems to have changed drastically between controller-runtime v0.2.0 and v0.4.0. I can't seem to figure out how to implement this with v0.4.0 even with the above links. Are there any examples, or docs anyone can link to for the new patterns?

Specifically, I can't seem to have the cert generated like it was before.

@rcernich

I don't understand how integrating with kubebuilder will help in this regard. controller-runtime v0.1 provided support for managing all aspects of the webhook. Those are gone in v0.2+, leaving it to the operator author to implement. kubebuilder doesn't seem to help, as it simply generates resources that need to be installed (i.e. kubectl apply). This could be generated from within the operator deployment easily enough, but...

The real problem is certificate management. There is no way to create/manage the certificate/key used by the webhook endpoint and the webhook configuration. It would be nice to use the service-ca functionality to create/inject the cert/key into both the operator deployment and the webhook's caBundle field, but because secret creation is triggered from an annotation on the service, and there's no way to define a service in a csv file, this won't work. (Also, caBundle injection for webhooks doesn't appear to be available from service-ca.)

Maybe this is more an issue for OLM, but I think the csv spec should be expanded to allow for the creation of additional resources that might be needed by an operator, in this case, a service, a secret, and a webhook configuration. Until then, I think anybody writing an operator that provides a webhook will have to stick with operator-sdk <= v0.10.1. (Please correct me if this is wrong, I'd love to upgrade.)
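To make the moving parts concrete, here is a sketch of the wiring involved (all names hypothetical): the webhook configuration points at a Service in the operator's namespace, and its caBundle field must carry the CA that signed the endpoint's serving certificate — the piece that currently has to be managed by hand.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: myapp-validating-webhook        # hypothetical
webhooks:
  - name: vmyapp.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: None
    clientConfig:
      service:
        name: myapp-webhook-service     # the Service the operator author must create
        namespace: myapp-operator
        path: /validate
      caBundle: ""                      # base64 CA cert; this is what needs managing
    rules:
      - apiGroups: ["example.com"]
        apiVersions: ["v1alpha1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["myapps"]
```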

@robbie-demuth

First off, I want to stress that all of the desired functionality isn't available out of the box, but integrating with Kubebuilder definitely makes this possible. Our organization is leveraging webhooks using v0.15.2. What we basically did was scaffold out a new project using Kubebuilder to the point where it bootstrapped the <TYPE>_webhook.go file. We then copied this file into our Operator SDK project and registered the webhooks in cmd/manager/main.go. As for the Kubernetes manifests, we're using controller-gen and Kustomize to generate the MutatingWebhookConfiguration and ValidatingWebhookConfiguration objects. As for certificate management, we're using cert-manager. I'd share the code if I could, but, unfortunately, it's a private project. I'm happy to provide more details here for those who have specific questions, though.

@rcernich

@robbie-demuth, thanks for the advice. What I was trying to say was that functionality is lost when moving past v0.1 of controller-runtime, and there is no comparable solution available, which becomes a real problem if you're relying on OLM for installation and you don't have access to something like cert-manager. Kubebuilder and scaffolding don't really help with this problem. (It's easy enough to create a webhook.yaml and add the handlers into the controller. It's even very easy to migrate the old handlers into the newer api.)

FWIW, my plan is to register the webhook from within the operator during startup and use a separate reconciler to keep caBundle up to date with the cert (cert-manager is not an option, but service-ca in OpenShift will at least create the cert/key secret, which can be mounted on the operator pod). To make this easy from an OLM perspective, being able to define the admission controller service and the webhook in the CSV file would help with the install bits (secret won't be available until after the service is created), and if OpenShift's service-ca service were able to inject the cert into the webhook's caBundle field, none of this would be necessary (i.e. all the controller would have to do is provide the functionality).

Thanks again for the advice.

JAORMX (Contributor) commented Feb 22, 2020

[Quotes @rcernich's previous comment in full.]

Actually, as of 4.4, OpenShift's service-ca does support setting certificates for webhooks 😄
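For reference, the service-ca wiring looks roughly like this on OpenShift 4.4+ (resource names hypothetical): one annotation on the Service generates the serving cert/key secret, and one on the webhook configuration injects the CA bundle.

```yaml
# service-ca generates a serving cert/key into the named secret,
# which the operator pod can mount for its webhook server.
apiVersion: v1
kind: Service
metadata:
  name: myapp-webhook-service
  namespace: myapp-operator
  annotations:
    service.beta.openshift.io/serving-cert-secret-name: myapp-webhook-tls
---
# service-ca fills in clientConfig.caBundle on the webhook (OpenShift 4.4+).
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: myapp-validating-webhook
  annotations:
    service.beta.openshift.io/inject-cabundle: "true"
```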

tengqm (Contributor) commented May 17, 2020

estroz (Member) commented May 20, 2020

@tengqm thanks for updating this issue. operator-sdk create webhook is now available, but only for Go projects created with operator-sdk init (or those that have migrated to that layout). Since init is not the default and we do not have migration documentation written yet, I will not close this yet.

@rcernich

Is the project layout documented, e.g. so I can simply add the resource into my project without using the SDK CLI?

estroz (Member) commented May 20, 2020

@rcernich the full project layout is documented throughout the kubebuilder book. The section on scaffolding webhooks and where they live is here. Setup for new project layouts is similar to that for current ones, although you'll have to invoke controller-gen directly.

@rcernich

Yeah, I guess I was just looking for something a little simpler: "put your webhook.yaml in this folder." We already have a controller that we've been carrying along since 0.10. FWIW, because we started so early, we haven't really been using the SDK CLI to manage resources.

sjpotter commented Jun 8, 2020

I feel there's a fundamental disconnect between operators and Kubernetes' current webhook API.

Many operators are written to be namespaced (i.e. they only operate on resources in their own namespace), and this behavior is seemingly encouraged by OLM. However, webhooks today are all cluster-wide, not namespace-restricted: I can't easily or reliably set up a webhook that only hits my namespace's admission controller.

Yes, there are things I can do.

  • I can set up a namespace selector. I don't consider this reliable for something as integral as admission control.

  • I can set up the admission controller to return "allow" for every resource sent to it that is not in its namespace (ugly, puts a much larger load on the apiserver, and increases the places where failures can happen).

I think it should be possible to namespace webhook configurations, so that the webhook only fires for matching resources/verbs within the webhook configuration's namespace.

thoughts?

I'm working on a "webhook" proxy operator (i.e. with a namespaced webhook CRD) to do this, in the hope that it will motivate changes to Kubernetes (though the proxy operator would still be needed for backwards compatibility even then).

do people have thoughts on this? disagree with my assessment? think there are better ways to do it?
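For completeness, the namespaceSelector workaround from the first bullet looks like this (label name hypothetical); it scopes an otherwise cluster-wide webhook to namespaces carrying a chosen label:

```yaml
webhooks:
  - name: vmyapp.example.com
    # Only fires for namespaces labeled myapp-webhooks=enabled;
    # reliability then depends on the labels staying correct.
    namespaceSelector:
      matchLabels:
        myapp-webhooks: enabled
```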

rcernich commented Jun 8, 2020

Ideally, I think this would be managed by the SDK and/or OLM itself. OLM manages the installation based on operator groups and watched namespaces, so it seems like OLM would be the one to configure the namespace labels that should be associated with a specific operator installation.

sjpotter commented Jun 8, 2020

That's still very specific to OpenShift; it doesn't help operators on vanilla Kubernetes or in other environments.

estroz (Member) commented Jun 9, 2020

@sjpotter while OLM is enabled in OpenShift clusters from the get-go, it is easily installed onto a vanilla k8s cluster. There's even a minikube addon (minikube addons enable olm). Additionally, you can install OLM with

$ operator-sdk olm install

Regardless, what you're describing is a general k8s grievance unrelated to this particular issue. It's worth raising upstream if you haven't already. Can you post a link to your proxy operator, for my own curiosity?

sjpotter commented Jun 9, 2020

I literally just started writing it this week, so there isn't much there yet. Expect it to be a month or so before I have a working prototype with CRDs and all. Basically, the sprints are:

  1. proof of concept of a static proxy.
  2. add CRDs to make more dynamic

This has been motivated by our work at Redis Labs on our operator and how we see people using it in the field (i.e. they want to minimize admin involvement and don't want to grant cluster roles, since they want everything namespaced).

estroz (Member) commented Jun 9, 2020

@sjpotter awesome, feel free to post it when you're happy with your prototype (and it's OSS)!

sjpotter commented Jun 9, 2020

The plan is that it will be; I'm just waiting on an OK to share the short design doc.

@camilamacedo86 (Contributor)

As of the 0.19 SDK release, new projects created with the tool are integrated with Kubebuilder and support webhooks. Ideally we should have one subject per issue; otherwise it becomes hard to follow up. So I am closing this one as addressed by the 0.19 release. Please feel free to raise new issues with specific questions or feature requests related to this feature.
