KEP: Pruning for CustomResources #2443
4. Only consider `properties` fields outside of `anyOf`, `allOf`, `oneOf`, `not` during pruning, but enforce that every `properties` key inside `anyOf`, `allOf`, `oneOf`, `not` also appears outside all of those.

From these options:
1. leads to fields being kept from pruning although they only appear in branches of the OpenAPI validation schema which do not contribute to validation of an object.
I think this is the one that I would expect from pruning. Any field mentioned anywhere is kept.
Even worse: forcing onto another branch might make more defaults applicable. In other words, we would need to run validation with defaulting until it converges. This is a hint that the semantics of defaulting in full OpenAPI schemata are not well defined either.

**We propose to allow defaults only outside of `anyOf`, `allOf`, `oneOf`, `not`.**
And I remember being ok with this as a starting point which could later be expanded if we so desired.
Adding a comment.
> Then `a`, `b`, `b.x`, `c`, `d` are specified, `b.y` and `e` are not specified.
>
> **Question:** is it natural semantics to assume `d` to be specified and not to prune it? Note that there is no object with `d` that validates against `"d":{not:{}}`. But due to the `anyOf` there are objects which will validate against the complete schema.
Mehdy Bohlool: There should be a difference between not specifying "d" and specifying it like this. Not specifying it means pruning, and defining it like this means validation fails.
Stefan Schimanski: OpenAPI is too powerful: you can specify it this way and validation still succeeds. That's basically our semantics problem: we have three cases:
- not specified
- specified and validating
- specified and not validating (but the whole schema validates).
David Eads: I would expect `d` to remain even if it is not used to satisfy validation. I expect `d` to remain because its potential presence is acknowledged, and the fact that the subrule can never be satisfied isn't a level of introspection we should perform.
Antoine Pelisse: +1
Allowing "d" to remain seems sufficient for the "security" use case. I don't understand if it is sufficient for the broader "consistency" use case mentioned in the overview.
My take on this is that we could go both routes: prune or not prune in these cases. @deads2k's intuition to keep `d` is much simpler semantics-wise, i.e. you don't have to understand the schema deeply to know what to prune; this semantics is purely driven by syntax.
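The syntax-driven semantics can be sketched in a few lines (a hypothetical illustration, not the apiextensions implementation): keep any field mentioned under `properties` anywhere in the schema, junctor branches included, and prune the rest.

```python
def mentioned_properties(schema):
    """All property names mentioned anywhere in the schema, including
    inside the junctors anyOf/allOf/oneOf/not."""
    names = set(schema.get("properties", {}))
    for junctor in ("anyOf", "allOf", "oneOf"):
        for sub in schema.get(junctor, []):
            names |= mentioned_properties(sub)
    if "not" in schema:
        names |= mentioned_properties(schema["not"])
    return names

def prune(obj, schema):
    """Drop top-level fields of obj that are mentioned nowhere in the schema."""
    keep = mentioned_properties(schema)
    return {k: v for k, v in obj.items() if k in keep}

# In the spirit of example 3: `d` only appears under an unsatisfiable `not`,
# yet it is mentioned, so syntax-driven pruning keeps it.
schema = {
    "properties": {"a": {}, "b": {}},
    "anyOf": [{"properties": {"d": {"not": {}}}}],
}
print(prune({"a": 1, "d": 2, "e": 3}, schema))  # {'a': 1, 'd': 2} -- `e` is pruned
```

Note the decision is made without evaluating which branch validates: only the syntax of the schema matters.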
Motivated by the examples, we have different options to define the pruning semantics:
1. Use the described semantics of `specified`.
2. Only consider `properties` fields in the schema which actually successfully validate a given object (then `d` would be pruned in example 3).
David Eads: I'd prefer to not do this.
CRD authors may add extra validation in an admission webhook. They may move checks between JSON schema and webhooks as is convenient to them.
With that in mind, to implement 2, would you call any validation webhooks during decoding? If so, does it get convoluted to do that from the decoding stage? If not, will it lead to changes in pruning behavior if a CRD author moves where validation for a field is done?
This line of thinking makes me think 2 is too complex.
Very good point about the webhooks.
Motivated by the examples, we have different options to define the pruning semantics:
1. Use the described semantics of `specified`.
2. Only consider `properties` fields in the schema which actually successfully validate a given object (then `d` would be pruned in example 3).
3. Only consider `properties` fields outside of `anyOf`, `allOf`, `oneOf`, `not` during pruning.
Antoine Pelisse: I don't understand what this means. Does this mean pruning fields in the AllOf (for example), forcing it to fail validation?
Stefan Schimanski: No, it means that we only keep fields specified outside of any of anyOf, allOf, oneOf, not. So if you forget to specify a field outside, but only inside an anyOf, it will be pruned. Probably not what we want.
1. Use the described semantics of `specified`.
2. Only consider `properties` fields in the schema which actually successfully validate a given object (then `d` would be pruned in example 3).
3. Only consider `properties` fields outside of `anyOf`, `allOf`, `oneOf`, `not` during pruning.
4. Only consider `properties` fields outside of `anyOf`, `allOf`, `oneOf`, `not` during pruning, but enforce that every `properties` key inside `anyOf`, `allOf`, `oneOf`, `not` also appears outside all of those.
Antoine Pelisse: Isn't that the same as the one before? Since if you drop the fields in oneOf, allOf, anyOf, they are always going to be invalid, so they have to also appear in `properties`.
Stefan Schimanski: We would reject the CRD if the outer properties are not a super-set of the ones under allOf/anyOf/oneOf/not. So option 4 protects at least the user from mistakes.
Note: all these are not about validating a CR or not, but about how to specify what you don't want to prune via CRD OpenAPI validation schemata.
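Option 4's extra check can be sketched as follows (a hypothetical, single-level illustration, not the KEP's implementation): reject the CRD schema whenever a property appears under one of the junctors without also appearing in the outer `properties`.

```python
def junctor_properties(schema):
    """Property names mentioned directly inside anyOf/allOf/oneOf/not
    branches of this schema (one level deep, for illustration)."""
    names = set()
    for junctor in ("anyOf", "allOf", "oneOf"):
        for sub in schema.get(junctor, []):
            names |= set(sub.get("properties", {}))
    if "not" in schema:
        names |= set(schema["not"].get("properties", {}))
    return names

def check_option_4(schema):
    """Enforce that outer `properties` is a super-set of junctor properties."""
    missing = junctor_properties(schema) - set(schema.get("properties", {}))
    if missing:
        raise ValueError("only mentioned inside a junctor: %s" % sorted(missing))

check_option_4({"properties": {"a": {}}, "anyOf": [{"properties": {"a": {}}}]})  # accepted
# check_option_4({"anyOf": [{"properties": {"b": {}}}]})  would raise ValueError
```

The check runs once, at CRD admission time, so a forgotten outer property is surfaced to the author instead of silently causing pruning.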
An empty object `{}` would be defaulted to `{"a":{"x":42}}`.

Obviously, the default values should validate. But OpenAPI validation schemata do not enforce that. Also one default in a branch of `anyOf`, `allOf`, `oneOf` might force another validation run onto another branch of these quantors.
About "default values should validate":
David Eads: this doesn't bother me
Stefan Schimanski: We do a second round of validation in the registry validation step. So we catch them.
About "might force another validation run onto another branch of these quantors":
David Eads: This does bother me. Cyclical?
Stefan Schimanski: it will converge as the number of branches is finite and defaulting is idempotent. But it is a strange behaviour nevertheless.
How are defaults relevant if we pick option 1? Option 1 seems to only require examining property names, and not their values.
We propose below to only allow defaults outside of `anyOf`, `allOf`, ... So we don't have to decide about values and which branches they fulfill, making application of defaults very simple as well.
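Under that restriction, applying defaults becomes a purely structural, single-pass walk over `properties`. A minimal sketch (hypothetical illustration, not the apiextensions-apiserver code):

```python
import copy

def apply_defaults(obj, schema):
    """Apply defaults declared directly under `properties`, recursing into
    nested objects. Defaults inside anyOf/allOf/oneOf/not are ignored
    (they are disallowed under the proposed restriction anyway)."""
    for name, sub in schema.get("properties", {}).items():
        if name not in obj and "default" in sub:
            # deep-copy so we never mutate the schema's default value
            obj[name] = copy.deepcopy(sub["default"])
        if isinstance(obj.get(name), dict):
            apply_defaults(obj[name], sub)
    return obj

# The earlier example: an empty object is defaulted to {"a": {"x": 42}}.
schema = {"properties": {"a": {"default": {}, "properties": {"x": {"default": 42}}}}}
print(apply_defaults({}, schema))  # {'a': {'x': 42}}
```

No validation runs are needed to decide which defaults apply, so the result is independent of branch order.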
> it will converge as the number of branches is finite and defaulting is idempotent. But it is a strange behaviour nevertheless.

I'm pretty sure that this is Turing-complete or close to it if you can engineer branches that drop fields. Which is more or less what this KEP is proposing.
I really think we need to look hard at the schema and come up with some rules that prevent people from saying things that they can't say with Go types. I have some thoughts on how to do this, but this KEP isn't the place to hash it out. We should NOT be able to get Turing-complete behavior by iterated default application.
Even worse: forcing onto another branch might make more defaults applicable. In other words, we would need to run validation with defaulting until it converges. This is a hint that the semantics of defaulting in full OpenAPI schemata are not well defined either.

**We propose to allow defaults only outside of `anyOf`, `allOf`, `oneOf`, `not`.**
David Eads: I'm ok with this restriction to start, but I'm also ok with simply running multiple times. Eventually defaulting will stop defaulting.
Stefan Schimanski: also note that this convergence happens at some point. But there is no guarantee that it always converges against the same value (depending on which branches we take first). We should better avoid it altogether. IMO it's a strong hint of bad semantics.
Antoine Pelisse: I agree with Stefan. I'm not sold on the fact that defaults on properties (outside xxxOf/not) completely solve that problem, though.
Antoine Pelisse: To clarify my point, some structures might validate before defaults are applied, some might not validate after, so the order still matters.
Stefan Schimanski: The skeleton has no `required`. So if it validates after defaulting, it was validating before as well. So this is fine.
Antoine Pelisse: I don't think that is true because of `not`. Validation can be OK because a value wasn't there (before being defaulted): `{"properties": {"a": {"default": 42}}, "anyOf": [{"not": {"properties": {"a": {}}, "additionalProperties": false}}]}`
Stefan Schimanski: You claim to have a counterexample of the inverse of my statement :-) Of course, defaulting can make an object invalid. But dropping constraints (including all negative constraints, i.e. under not or oneOf) makes a schema weaker. So the skeleton is weaker than the full schema.
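The general point that defaulting under `not` can flip validation can be shown with an even smaller, hypothetical schema (not the one quoted above): a field carries a default, while a `not` branch forbids that field from being present. The `validates` helper below hand-rolls just enough of JSON Schema for this single example.

```python
# Hypothetical schema: "a" gets a default, but the `not` branch
# rejects any object in which "a" is present.
schema = {"properties": {"a": {"default": 42}}, "not": {"required": ["a"]}}

def validates(obj, schema):
    """Hand-rolled check for this schema only: the object passes unless
    every key required by the `not` branch is present."""
    required = schema.get("not", {}).get("required", [])
    return not all(k in obj for k in required)

def apply_defaults(obj, schema):
    """Fill in top-level defaults from `properties`."""
    for name, sub in schema.get("properties", {}).items():
        if name not in obj and "default" in sub:
            obj[name] = sub["default"]
    return obj

obj = {}
print(validates(obj, schema))                          # True: {} is valid
print(validates(apply_defaults(obj, schema), schema))  # False: defaulting "a" broke it
```

So an object can be valid before defaulting and invalid afterwards, which is exactly why order matters once negative constraints are allowed.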
I am fine with this, as users can always add defaults later in a webhook if they really need them for xxxOf blocks. (I question the need for `not` blocks, especially if OpenAPI is generated from an IDL.)
Yes, I agree about `not`. IDLs (like e.g. Golang types) are probably always strictly-positive. I also question the need for `oneOf` for the same reason.
Even worse: forcing onto another branch might make more defaults applicable. In other words, we would need to run validation with defaulting until it converges. This is a hint that the semantics of defaulting in full OpenAPI schemata are not well defined either.

**We propose to allow defaults only outside of `anyOf`, `allOf`, `oneOf`, `not`.**
Mehdy Bohlool: Why don't we allow it but fail on confusion/conflicts? That means validating the schema before actually using it for defaulting (and even pruning); while the schema can be a valid OpenAPI schema, we reject it because, for example, it has two defaults for a single value using anyOf.
Stefan Schimanski: The conflicts depend on inputs. Computing the generic overlap for all inputs is computationally hard and needs complete validation schemata. Structural restriction is far less complex and allows unspecified validation schemata.
> }
> ```
>
> Take an object `{"a": "1", "b":"foo", "c":"bar"}`. Assuming we follow pruning option 1, it is pruned to `{"a": "1", "b":"foo", "c":"bar"}` and defaulted to `{"a":"1", "b": "foo", "c":"bar", "d": 1}`.
David Eads: I would not have expected this to be pruned. I'd expect `c` to be kept and `d=1` (first rule in what I described).
I'm also ok with restricting this for a first impl.
Antoine Pelisse: how would this work with `additionalProperties: false`?
Stefan Schimanski: We forbid that `additionalProperties` and `properties` are used in parallel for the same object/map. So `additionalProperties: false` means to prune everything.
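That rule can be sketched as follows (hypothetical, top level only; recall that `additionalProperties` and `properties` may not be combined on the same object):

```python
def prune_additional(obj, schema):
    """If additionalProperties is false, nothing is allowed to stay
    (properties may not be set in parallel); otherwise keep exactly
    the fields listed under `properties`."""
    if schema.get("additionalProperties") is False:
        return {}
    return {k: v for k, v in obj.items() if k in schema.get("properties", {})}

print(prune_additional({"a": 1, "b": 2}, {"additionalProperties": False}))  # {}
print(prune_additional({"a": 1, "b": 2}, {"properties": {"a": {}}}))        # {'a': 1}
```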
@deads2k added this as a comment further down: this is the starting point, with options for relaxing it later if needed.
We need to engineer our schema-acceptance criteria so we never face this question, I think.
>
> Take an object `{"a": "1", "b":"foo", "c":"bar"}`. Assuming we follow pruning option 1, it is pruned to `{"a": "1", "b":"foo", "c":"bar"}` and defaulted to `{"a":"1", "b": "foo", "c":"bar", "d": 1}`.
>
> But what about the object `{"a": "12", "b":"foo", "c":"bar"}`? Both patterns apply. With pruning option 1 we keep `"b"` and `"c"`. But what about the default? Should we default to `1` or `2`?
Mehdy Bohlool: this is another example I think can easily be detected, and we can reject the schema because of confusion/conflicts in it. It should be clear from the schema what we are pruning and what we are defaulting. We can even try to make a list of allowed fields with their defaults from the schema (I believe that's proposal number 1), and if that process fails, we reject the schema.
Stefan Schimanski: I don't think we can easily detect overlaps. E.g. we would have to intersect regular languages (which is possible, but has high complexity).
> }
> ```
>
> The type and properties are usually called schema validation, while regex patterns and formats are about values. The latter would be tested in the validation phase, the former would be checked during decoding for native types in the JSON decoder.
Mehdy Bohlool: I am not sure about `format`. It is usually used as an extra identifier alongside `type` and may not actually validate the value. Though the border here is blurry: when we say the type is string and the format is date, is that considered validation, or is it the "date" type? My understanding was that `type` is the primary type and `format` is a secondary type.
Stefan Schimanski: Added a paragraph below about our discussion around keeping type and format in the skeleton.
The only purpose of pruning that is clear to me is the security one. For that, formatting does not help; type information does not even help. The motivation for adding those during decoding seems to be "similarity to core types", which is not stated as an explicit goal of the design, but seems to be a theme.
Right. It's mostly about getting similar error messages.
I would claim we have a general, high-level goal for CRDs to behave like native resources. This is not special for pruning and defaulting. Merely, this goal is one driver to implement pruning and defaulting (and many other features we are adding to CRDs).
- "@apelisse"
approvers:
- "@deads2k"
- "@lavalamp - substituted by @fedebongio while @lavalamp is out"
I'm back, sorry!
Given the difficulty of fixing specs, and the fact that we have tools that should give correct specs, can we prune assuming the spec is correct and generated by kube-openapi / kubebuilder? I think we should do this, or just not prune at all until we can tell whether the spec is acceptable. I'm not in favor of pruning based on specs that use features we don't support.
Pruning or defaulting? Pruning based on the union of all mentioned fields won't strip any expected fields and will leave the reconciliation of different potential branches to validation, where it is more appropriate. Pruning is to prevent field squatting and to have structural consistency in your stored data.

I think the question about "correct" specs is more about defaulting, where results become slightly harder to reason about. However, crisp rules about only defaulting on non-branching paths make the behavior predictable and easy to decide about.

Separately, a decision to tighten the rules for specs we allow may be useful, and we could have a path forward to doing that based on status/failure to serve unless the "I know I'm going to be unservable in six months" flag is set. I don't think that we need to block on such a tightening to implement the pruning or defaulting though.
Both, if they are writing special code to handle the weird specs--as much as I want defaulting. We shouldn't go fishing while our house is on fire. Honestly I would have blocked adding the validation field in the first place until this was fixed if I had realized how flexible the openapi schema was. I'm also extra interested in schemas being correct due to working on apply.
It might fail to prune items that should be pruned. Now, I think it is acceptable to not prune at all, either just simply rejecting unknown fields or letting the user deal with the fallout (ok for beta, likely not ok for GA). I also don't object to pruning. I think I object to pruning only most of the time, though.
It is certainly laudable that sttts noticed these issues and is working to address them. However, I think it is fundamentally wrong that we accept specs that express behaviors we never intend to support or allow. I think that a stopgap solution of writing code that notices and works around the issue is actually worse than doing nothing, because of the precedent it sets (that we have to respond appropriately to every OpenAPI feature). In my view, all of our effort at rectifying the situation needs to go into explicitly defining the subset of specs we will support.
#64841 might be homing in on a general solution to that thorny problem. CRD is still beta so we have some wiggle room.
From my perspective, code that notices non-compliant specs and doesn't act on them (ideally getting an error somewhere visible) is fine and possibly our only choice given that tightening the validation will be a multi-release effort. But I view code that attempts to repair/skeletonize/reinterpret non-compliant OpenAPI specs and then use them anyway as detrimental. Code like that is incredibly difficult to get correct in my experience, and incredibly frustrating when it does the wrong thing. I especially object if the code is silent and doesn't give users any indication that they're working off of our non-standard interpretation of their OpenAPI spec.
I have tried hard to follow these harsh judgements about the approach, and I would have hoped for an open and early-on discussion about them instead of this form. And not surprisingly, I disagree with them:
/kind kep
There is an element of end-user experience in this one. What do we want them to experience? If we value consistency, the default experience for custom resources should be the same as built-in resources. If we apply the principle of least astonishment, what should the experience be, and how do we build an implementation from there?
We value both consistency and experience. CRDs should behave consistently with built-in resources. If there are experience issues with some of the built-in resource behaviors, we should determine if we can improve those (and ensure consistent treatment for CRDs as well as we do so).
Recording this for posterity....
In summary, we move forward where there is greatest demand and certainty, without blocking future options.
I agree with @erictune. Pruning can be added later. CRDs, especially schema-less ones, are like annotations in many ways. Related past discussions: Either strict validation OR pruning can prevent the problems of circumventing validation, creating timebombs, and confusing controllers. Each has different gotchas for clients.
And, for the record, while I haven't had time to read this KEP, I am in favor of pruning, API consistency, and declarative schema specification. But I don't think pruning can be required for all CRDs.
REMINDER: KEPs are moving to k/enhancements on November 30. Please attempt to merge this KEP before then to signal consensus. Any questions regarding this move should be directed to that thread and not asked on GitHub.
KEPs have moved to k/enhancements. Any questions regarding this move should be directed to that thread and not asked on GitHub.
@justaugustus: Closed this PR.