Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BackendPolicy: allow global/namespace level defaults #583

Closed
howardjohn opened this issue Mar 11, 2021 · 11 comments
Closed

BackendPolicy: allow global/namespace level defaults #583

howardjohn opened this issue Mar 11, 2021 · 11 comments
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@howardjohn
Copy link
Contributor

Currently, BackendPolicy refers to an explicit set of backends. It has been a common demand for us to configure things like retries and timeouts (which, presumably, will be part of BackendPolicy in the future) at a higher level. For example, if a user wants to enable retries on all requests globally, they don't want to do this in 1000s of BackendPolicies and instead want a single point of control.

The request here would be some way to apply BackendPolicy to a large scope. This could be global or namespace level.

Some possible options. I don't think any of these are fully refined:

  • This is done in GatewayClass. This would mean we either need to copy the BackendPolicy spec into some other resource, or reference a BackendPolicy which has some dummy backendRef, or reference a BackendPolicy which has an empty backendRef (not allowed today)
  • This is done natively in BackendPolicy. Straw man: BackendPolicy without backendRef applies to full namespace. For global ones we could have some mapping from BackendPolicy to Gateway/GatewayClass such that it applies to anything configured by that gateway/class
@robscott
Copy link
Member

Thanks for starting this discussion @howardjohn!

As far as creative uses of BackendPolicy, I don't think there's much right now that would prevent this from happening (there's no min length on BackendRefs).

I think the options you've mentioned above could be roughly split into these 3 options:

  1. Use GatewayClass parameters to configure default parameters
    This comes with the advantage of not needing any API changes. It also would mean that all configuration for a class is accomplished in one place. Unfortunately, this might not be great for namespace-scoped implementations of this API that may not be able to configure cluster-scoped resources.

  2. BackendPolicy without backendRef applies to full namespace.
    I believe this is actually possible without any API changes since we have intentionally left off any minLength validation on backendRefs. I think this could have unintended consequences. As I recall, the rational for allowing backendRefs to be empty is that admins may want to leave some standard/common backend policies in place, and users would then be able to simply adjust backendRefs on existing policies as desired. I'm not sure how common this would actually be, but I think a pattern of "referring to nothing means your referring to everything" could be confusing.

  3. BackendPolicy could refer to a Gateway or GatewayClass.
    I like this idea. @mark-church has been putting in a lot of thought around how we attach policy to resources, and I think this could be a nice way to solve both goals. If attaching policy to a Gateway can also be used to provide defaults that can be overridden at lower levels, we could have a pretty compelling solution to both Gateway policy + defaulting.

So far we've said that only a single BackendPolicy can apply to a request. This would likely need to change if we added the ability to provide defaults at additional levels unless the policy resources we were attaching were something other than BackendPolicy.

To really have a solid answer here, it would be helpful to start filling out BackendPolicy with more content to see how broadly it applied. My hunch is that we'd want to attach different kinds of policy at different levels. If we end up with sufficiently generic policy, I think 3 becomes very compelling.

@bowei
Copy link
Contributor

bowei commented Mar 12, 2021

A thing to note is that we could leave some of this to be implementation specific for the moment while collecting experience across implementations.

Nothing prevents implementations from giving unattached BackendPolicy (and friends) resources special meaning if defined in specific namespaces.

@robscott
Copy link
Member

Yeah I agree that this is a good time for experimentation. All of the options above are completely possible today without an API change. It would be good to see what works or doesn't work in practice before we lock down the spec too tightly.

@howardjohn
Copy link
Contributor Author

Oh neat, I didn't realize backendRef allowed 0 length list. That seems reasonable for now. I do worry a little bit if we have implementations experimenting a lot now we may end up with lots of divergence that is never reconciled - maybe this can be addressed by good conformance tests post-alpha/beta

@mark-church
Copy link
Contributor

mark-church commented Apr 21, 2021

@howardjohn I've been thinking a lot about how to apply Namespace-level default or enforcement with the ServicePolicy proposal in #611. At first I thought that Namespace-level policy was incompatible with ServicePolicy because ServicePolicy depends on a strict hierarchy between resources to have predictable and enforceable override behavior. However, I realize that we can simply add Namespace into this hierarchy and voila!

As a part of Gateway API (and in the definition of ServicePolicy), we would specify a strict hierarchy of resources:

  • Namespace
  • GatewayClass
  • Gateway
  • Route
  • Service

image

Here's what's going on in this diagram:

  • The Namespace is now the top of the policy hierarchy. It can also reference a ServicePolicy resource. Because Namespaces are outside the scope of the Gateway API resources, it is referring to a ServicePolicy through standardized annotations.
  • Because service-policy-override=false it means that maxConnections=100 is an enforced policy rather than a default policy. Because the namespace is at the top of the hierarchy, the trafficPolicy.connectionPool.tcp.maxConnections field cannot be overridden by any other resources.
  • The Gateway, Route, and Service resources have empty policy lists. This just means that they are not applying any policy. policyOverride=true is the implicit default so these resources allow any policy values to be configured by southbound resources that are not already enforced northbound.
  • The GatewayClass resource references a ServicePolicy to set retries.attempts and retries.timeouts. Since policyOverride=true this is just a default which can be overridden by any of the southbound resources.

Now it may seem a little strange that Namespace is at the top of the hierarchy when GatewayClass is a non-Namespaced resource, but I think this model still works out. Check out this diagram:

image

  • Each GatewayClass has defaults configured for it. Since the GatewayClasses are non-Namespaced, these defaults will apply no matter where the Gateways are deployed, as long as the Namespace doesn't override the GatewayClass default policies.
  • In this case, the Namespace does reference a ServicePolicy (through annotations) which overrides the GatewayClass default values. The Namespace ServicePolicy only applies for the Gateway API resources that are deployed within that Namespace.
  • The Gateways also have their own policies (policy-3 and policy-4) which does not conflict with anything northbound, so these policies will apply to all of the southbound resources (as long as those resources don't override these policy fields).

Let me know what you think about this and whether you feel it addresses the "global defaults and enforced policies" problem.

cc @youngnick @stevesloka @thockin

@howardjohn
Copy link
Contributor Author

I think it makes sense, but I am not sure I understand why Namespace is above GatewayClass? That means the cluster admin cannot enforce any policies, since namespace admins can override?

@mark-church
Copy link
Contributor

We could choose to rearrange it so that the hierarchy has GatewayClass -> Namespace -> Gateway. I think there are good arguments for both orders. Ultimately, my guess is that the same administrator would have ownership over GatewayClasses and Namespaces, so they could choose either vector to enforce policies. It all depends on whether enforcing via a GatewayClass or Namespace has simpler UX for enforcing org-wide policies. I personally think that Namespace is a little more natural (and has more precedence) which is why I started with Namespace at the top of the hierarchy.

The Namespace policy doesn't actually directly affect the GatewayClass. The Namespace policy only takes effect when a given Gateway is deployed into a Namespace. There is definitely some weirdness though when Gateways and Routes are in different Namespaces. You would have a Namespace policy that applies to a Gateway in that Namespace which also applies to Routes which are bound from different Namespaces. We'd have to think through how this would work and whether a local Namespace policy would override the Gateway's Namespace policy.

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 20, 2021
@robscott
Copy link
Member

I think this can be closed once #713 is implemented.

@hbagdi
Copy link
Contributor

hbagdi commented Jul 22, 2021

I think so too.
/close
Please re-open if needed.

@k8s-ci-robot
Copy link
Contributor

@hbagdi: Closing this issue.

In response to this:

I think so too.
/close
Please re-open if needed.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.
Projects
None yet
Development

No branches or pull requests

7 participants