Add requireAllExcept keyword #1144

Closed

jdesrosiers
Member

Resolves #1112

Adds the requireAllExcept keyword.

@karenetheridge
Member

I don't think we do that anywhere else, so it would be a little out of place.

Fair enough!

@awwright
Member

I find this keyword somewhat difficult to comprehend. Its behavior is defined in terms of another keyword, which I think we should avoid. It becomes non-intuitive when mixing "allOf", because this keyword does not "see into" sub-schemas.

What would be some cases where this keyword is superior to "requiredProperties"?

@Relequestual
Member

I find this keyword somewhat difficult to comprehend. Its behavior is defined in terms of another keyword, which I think we should avoid. It becomes non-intuitive when mixing "allOf", because this keyword does not "see into" sub-schemas.

I don't feel this is a problem. Many keywords have dependencies. That's one of the reasons for the annotations system.

What would be some cases where this keyword is superior to "requiredProperties"?

@awwright
Earlier you said...

The problem would be if keywords do multiple things and there's no way around it.

By my understanding of your proposal #846, schema authors could use both properties and requiredProperties to define properties. requiredProperties would be doing multiple things... defining subschemas to apply, and specifying they are required.

Generally we've moved away from keywords doing multiple things, splitting them up, mainly because it was confusing. requiredProperties as you proposed in #846 feels like a step backwards.

@awwright
Member

@Relequestual By "... and there's no way around it", I mean: It's fine if a keyword performs multiple things as an author convenience. But we have to be able to break it apart.

e.g. {"type": "integer"} is short for { "type": "number", "multipleOf": 1 }.

Likewise, {"requiredProperties": {"foo": bar}} could be short for { "required": ["foo"], "properties": { "foo": bar } }.

This keyword, by contrast, has interactions with other keywords. What happens if I use:

{
"requireAllExcept": ["foo"],
"allOf": [ { "foo": false } ],
"patternProperties": { "f": true },
"additionalProperties": false
}

What's the behavior here? There probably is one—but I have to think about it. And I can't really take an educated guess. I can't even come up with a good example, maybe if I replaced false with true here, or vice-versa, the example would illustrate my point better.

It's true that we have other keywords like this (additionalProperties), but they tend to consume a disproportionate share of the descriptive effort we have to spend.

Generally we've moved away from keywords doing multiple things, splitting them up, mainly because it was confusing.

This feels like exactly the kind of keyword that does multiple things you're talking about.

Member

@Relequestual Relequestual left a comment

In reviewing this further, I think some clarification is required.
I've left a suggested change.

Member

@awwright awwright left a comment

I overall object to this because it "reads" the value of other keywords, which is something we try to avoid, see "Keyword independence".

However if this keyword absolutely must be added, we would need to list this in said "Keyword independence" section. (Or rather, since this keyword does nothing if in a schema all by itself, it may be more consistent to say the "properties" keyword reads this one.)

@gregsdennis
Member

@awwright what about additionalProperties, unevaluated*, minContains, maxContains? These are all keywords that consider other keywords. While we have keyword independence, it's not a hard requirement for keywords.

@awwright
Member

@gregsdennis Those keywords are specifically listed as exceptions... I mean, yes, we never really went into detail as to why they're exceptions... but as an exercise, write an alternative where the only information available to the keyword is its own value.

The best I can do is something like

{
   "contains": {
      "needle": "foo",
      "min": 1,
      "max": 4
   }
}

While this nicely emphasizes how "minContains" doesn't do anything by itself—that it's technically an argument to "contains"—this is obviously clunky to write, which is why I think an "argument keyword" is acceptable as an authoring convenience.

In contrast, the single-keyword alternative to "requireAllExcept" combines the functionality of "required" and "properties" into a single keyword, entirely replacing the need for "required", while "properties" keeps its usual function of listing the optional properties, which is the job the "requireAllExcept" keyword is doing here.

This idea that some keywords can be decomposed into more complicated but equal schemas should be familiar, e.g. type: "integer" and "dependentRequired".

※ A keyword "doesn't do anything by itself" if the single-keyword schema {keyword: foo} is indistinguishable from {}, for all values of foo. (As for "additionalProperties", this does do something by itself, but perhaps it should have had the semantics that "unevaluatedProperties" has had from the beginning... or maybe something else entirely, I'm still researching this.)

@gregsdennis
Member

the single-keyword alternative to "requireAllExcept" combines the functionality of "required" and "properties" into a single keyword, entirely replacing the need for "required"

I don't follow this.

The new keyword is an array of property names that are optional. The presence of the keyword implies that all properties in properties are required except the ones in the array.

It's not combining properties and required; it's the inversion of required.
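To make the "inversion of required" reading concrete, here is a minimal Python sketch. It assumes the semantics described in this comment (every name declared in "properties" of the same schema object is required unless listed in "requireAllExcept"); the helper names `to_required` and `is_valid` are hypothetical, and the sketch checks only property presence, not the subschemas.

```python
def to_required(schema: dict) -> list:
    """Return the "required" list that the schema would be equivalent to.

    Assumption: "requireAllExcept" only looks at "properties" in the same
    schema object, as described in this thread.
    """
    declared = schema.get("properties", {})
    optional = set(schema.get("requireAllExcept", []))
    return [name for name in declared if name not in optional]


def is_valid(schema: dict, instance: dict) -> bool:
    """Presence check only; subschema validation is out of scope here."""
    return all(name in instance for name in to_required(schema))


schema = {
    "properties": {
        "name": {"type": "string"},
        "comment": {"type": "string"},
    },
    "requireAllExcept": ["comment"],
}

print(to_required(schema))                 # ['name']
print(is_valid(schema, {"name": "Ada"}))   # True
print(is_valid(schema, {"comment": "x"}))  # False
```

Under this reading the keyword is an authoring shorthand for a computed "required" list, which is why it needs access to the sibling "properties" value.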

@awwright
Member

awwright commented Apr 26, 2022

the single-keyword alternative to "requireAllExcept" combines the functionality of "required" and "properties" into a single keyword, entirely replacing the need for "required"

I don't follow this.

I'm not sure what you don't follow, so let me elaborate a little bit.

"requireAllExcept" is a "keyword" but it's not a validation keyword per se, it's an argument keyword: It's modifying the behavior of another (validation) keyword. We try to avoid argument keywords unless there's no better way to do it. So, is there a better way to accomplish the goal here? Yes. We can accomplish the same behavior by reworking the keyword that's doing the heavy lifting.

Take for example:

{
   "properties": {
      "name": {"type": "string"},
      "comment": {"type": "string"}
   },
   "requireAllExcept": [
      "comment"
   ]
}

How do we adapt "requireAllExcept" so that it operates by itself? It can pull in the schema for each of the properties:

{
   "properties": {
      "name": {"type": "string"}
   },
   "requireAllExcept": {
      "comment": {"type": "string"}
   }
}

There, now "requireAllExcept" can be used by itself. And we're not redundantly listing the "comment" key name!

But wait, isn't "requireAllExcept" just listing optional properties here—identical to how "properties" works right now? Let's swap the names around.

{
   "requireAll": {
      "name": {"type": "string"}
   },
   "properties": {
      "comment": {"type": "string"}
   }
}

Is there possibly a better name for "requireAll" that indicates it accepts a key=>schema map?

{
   "requiredProperties": {
      "name": {"type": "string"}
   },
   "properties": {
      "comment": {"type": "string"}
   }
}

...so "requiredProperties" is the same thing as "requireAllExcept", except not broken.
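As a rough illustration of the decomposition argued for here, the following Python sketch rewrites a schema that uses the proposed "requiredProperties" into the equivalent "required" plus "properties" form. The function name `desugar_required_properties` is hypothetical, and the sketch assumes a property name never appears in both "properties" and "requiredProperties" (how such a collision should combine is not settled in this thread).

```python
def desugar_required_properties(schema: dict) -> dict:
    """Expand "requiredProperties" into "required" + "properties".

    Assumption: names in "requiredProperties" do not also appear in
    "properties"; a real definition would have to say how they combine.
    """
    out = {k: v for k, v in schema.items() if k != "requiredProperties"}
    required_props = schema.get("requiredProperties", {})
    if not required_props:
        return out

    # Every entry contributes a subschema to "properties" ...
    properties = dict(out.get("properties", {}))
    properties.update(required_props)
    out["properties"] = properties

    # ... and its name to "required".
    out["required"] = sorted(set(out.get("required", [])) | set(required_props))
    return out


sugared = {
    "requiredProperties": {"name": {"type": "string"}},
    "properties": {"comment": {"type": "string"}},
}
print(desugar_required_properties(sugared))
```

Because the expansion needs nothing but the schema object itself, the keyword can be "broken apart" in the sense used earlier in the thread.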

@gregsdennis
Member

With this approach you also need to update additionalProperties and unevaluatedProperties to look into the new requiredProperties.

@awwright
Member

@gregsdennis Yes— this is sort of implied when I say "requiredProperties" would 'behave the same way as using "required" and "properties"'—but for consistency you are correct, it would be a good idea to update the definitions of those keywords.

@awwright
Member

Also, to bring this back to my original point, "requireAllExcept" also needs to update some other paragraphs, specifically the section(s) that discuss keyword independence, so it can be added as one of the exceptions.

@Relequestual
Member

Reading the more recent comments, I think we need to re-evaluate our approach here.

I know we're not all happy with the annotations system as defined, however I don't think that means we intend to throw it out. I feel that we can still use it, better defined, and any "keyword interactions" should rely on the premise of annotation collection.

I'm VERY strongly opposed to adding another way to list properties which affects additionalProperties and unevaluatedProperties. I feel it will make things all the more confusing. I don't feel the trade-off is worth it.

Adding new constraint-based keywords without affecting existing keywords feels preferable to me.

@awwright
Member

I feel that we can still use it, better defined, and any "keyword interactions" should rely on the premise of annotation collection.

Can you detail what this would look like?

I'm VERY strongly opposed to adding another way to list properties which affects additionalProperties and unevaluatedProperties. I feel it will make things all the more confusing. I don't feel the trade-off is worth it.

We're dealing with an authoring convenience, i.e. how to combine common patterns of keywords into a single keyword. There's going to be inherent complexity in that, regardless of the solution.

There's also some amount of subjectivity. We're going to have to balance what new people would expect, with what keeps the cognitive requirements low on very large schemas. (Also, these things might be the same.) I think all authoring conveniences will introduce some "surprise", but we can minimize that; the real purpose of an authoring convenience is it lets you build more complicated schemas on the same amount of brain power (it is probably more difficult to work with an if/then statement than to think "x property means y becomes required" — and so we have "dependentRequired").

We may want to put out a survey that tries to measure these two properties (expected/surprising behavior, and scalable/non-scalable for authors).

... without affecting existing keywords feels preferable to me.

I don't think there's a way to do this.

You can think of "requireAllExcept" as a keyword that first reads the value of "properties", then returns a validation result; or you can think of it as an argument to "properties" (which currently reads no arguments); these two interpretations are logically indistinguishable. However I'm inclined to say we should think of it only as an argument to "properties", because the single-keyword schema { "requireAllExcept": anything } has no behavior.

(An alternative name for these "argument keywords" could be "interacts with"... because even though "minContains" is an argument keyword to "contains", it still makes sense to talk about "minContains" as the source of validation errors. It just won't do so when it's the only keyword in the schema.)

@Relequestual
Member

Can you detail what this would look like?

Yes, but I'll have to get back to you. I'm overflowing with half done work right now 😭

@jdesrosiers jdesrosiers force-pushed the require-all-except branch from 6e4ab89 to 5d35b16 Compare May 27, 2022 20:11
@jdesrosiers
Member Author

@awwright we would need to list this in said "Keyword independence" section.

Agreed. This was an oversight. I updated the PR.

@jdesrosiers
Member Author

@awwright

"minContains" doesn't do anything by itself [] it's technically an argument to "contains"

This is not the way minContains and maxContains are defined. They assert independently. For example, if you have schema { "contains": { "type": "number" }, "maxContains": 1 } and instance [1, 2], then contains will pass and maxContains will fail. Personally, I'd prefer if minContains and maxContains were just arguments to contains, but that's not the way it was defined.
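The example in this comment can be sketched in Python as two separate assertions that each report their own result. This is only an illustration: the subschema { "type": "number" } is hard-coded in the hypothetical helper `matches_number`, the function name `evaluate` is made up, and "minContains" (which would change how "contains" behaves) is left out.

```python
def matches_number(item) -> bool:
    """Stand-in for the subschema {"type": "number"} (booleans excluded)."""
    return isinstance(item, (int, float)) and not isinstance(item, bool)


def evaluate(instance: list, max_contains: int) -> dict:
    """Report "contains" and "maxContains" as independent assertions."""
    matching = sum(1 for item in instance if matches_number(item))
    return {
        "contains": matching >= 1,                # at least one match
        "maxContains": matching <= max_contains,  # no more than the limit
    }


print(evaluate([1, 2], 1))  # {'contains': True, 'maxContains': False}
```

The overall schema still fails for [1, 2], but the failure is attributed to "maxContains" rather than to "contains".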

<t>
An object instance is valid against this keyword if every property name
declared in "properties" is the name of a property in the instance, with
the exception of the property names that appear in this keyword's array.
Member

We also need clarification on what the exception means. This could be read as "the properties in the array must not be present."

Member Author

I see your point. I'll try to reword.

Member Author

I pushed some alternate wording.

@jdesrosiers
Member Author

@awwright

"requireAllExcept" is a "keyword" but it's not a validation keyword per se, it's an argument keyword: It's modifying the behavior of another (validation) keyword.

This isn't correct. requireAllExcept does its own assertion. It does not affect the validation result of properties. You could say that properties is an argument of requireAllExcept, but not the other way around.

@awwright
Member

awwright commented May 27, 2022

requireAllExcept does its own assertion

@jdesrosiers See my full explanation at #1144 (comment); by "per se," I mean "by itself." Since JSON Schema is declarative, there are multiple equivalent ways validation could actually be performed: (1) the way I'm thinking about it, where you perform the validation at the moment you're iterating through the keywords in "properties" and making sure that each property is either in the instance, or listed in "requireAllExcept"; or (2) your implementation, where you validate it at the time the "requireAllExcept" keyword is encountered.

((3) You can also write a validator that performs all the validations at the same time, and you can do so deterministically, proving that "where" the validation occurs truly doesn't matter.)

What makes a keyword an "argument keyword" is that a schema consisting only of e.g. "requireAllExcept" will never be invalid.
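A small Python sketch of the two evaluation strategies, to show they agree and that a schema holding only "requireAllExcept" never rejects an object. The function names are hypothetical, and only property presence is checked; this is an illustration of the argument, not a reference implementation.

```python
def check_while_iterating_properties(schema: dict, instance: dict) -> bool:
    """Strategy (1): decide while walking the names declared in "properties"."""
    exceptions = set(schema.get("requireAllExcept", []))
    for name in schema.get("properties", {}):
        if name not in instance and name not in exceptions:
            return False
    return True


def check_at_keyword(schema: dict, instance: dict) -> bool:
    """Strategy (2): decide when "requireAllExcept" itself is evaluated."""
    declared = set(schema.get("properties", {}))
    must_be_present = declared - set(schema.get("requireAllExcept", []))
    return must_be_present <= set(instance)


schema = {
    "properties": {"name": {"type": "string"}, "comment": {"type": "string"}},
    "requireAllExcept": ["comment"],
}
instances = [{}, {"name": "a"}, {"comment": "c"}, {"name": "a", "comment": "c"}]

# Both strategies give the same verdict for every instance.
for instance in instances:
    assert check_while_iterating_properties(schema, instance) == check_at_keyword(schema, instance)

# With no sibling "properties", nothing can ever be missing.
lone = {"requireAllExcept": ["comment"]}
print(all(check_at_keyword(lone, instance) for instance in instances))  # True
```

Whether that last result makes the keyword an "argument keyword" or just a vacuously true assertion is the point still under discussion in the thread.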

@awwright
Member

Aside: By "at the same time", I mean you can compile a validator for a good number of schemas down to a finite state machine; since common states are factored out, it doesn't make sense to consider a failure to be "from" one keyword or another (the best you can do is say this failure occurs because of the existence of a keyword in the schema, and would have been valid if not for its existence).

@karenetheridge
Member

I've seen implementations that evaluate all object-based keywords together, all array- together, etc. I'm not a fan of this approach, and I chose to implement each keyword separately in its own function, because it made it easier to selectively enable/disable individual keywords depending on which version of the specification was active -- but what matters is the outcome. As long as the validation result and emitted errors/annotations are correct, do whatever makes sense for your mental model and your choice of language/architecture.

This is, FWIW, why I think we have such divisive arguments about evaluation behaviour sometimes -- the implementation choices and mental models vary quite widely, and this informs our beliefs about how new features ought to work.

Comment on lines 516 to 517
An object instance is valid against this keyword if every property name
declared in "properties" within the same schema object is the name of
Member

We should consider the advantages/disadvantages of having this keyword use annotations from properties, which would allow it to use property definitions from other schemas via composition.

I could see this pattern being quite useful:

  $defs:
    model.user:
      type: object
      properties:
        id:
          type: string
        name:
          type: string
        type:
          enum: [foo, bar]
        comment:
          type: [string, 'null']
    create_operation:
      $ref: '#/$defs/model.user'
      requireAllExcept: [id, comment]
    get_operation:
      $ref: '#/$defs/model.user'
      requireAllExcept: []

Contributor

[briefly de-lurks]
@karenetheridge it's worth considering that these might be two separate use cases. "require all of these properties right here" is a straightforward thing that would become hard if this always used annotations from applicators in subschemas. And trying to make it optional within the keyword would introduce more complex syntax.
[re-lurks]

Member Author

All good points. It's worth discussing.

@jdesrosiers jdesrosiers force-pushed the require-all-except branch from b862023 to b9c2e9f Compare June 1, 2022 20:06
@jdesrosiers
Member Author

@awwright

What makes a keyword an "argument keyword" is that a schema consisting only of e.g. "requireAllExcept" will never be invalid.

I see what you're trying to say, but I don't think requireAllExcept has that behavior because it's an argument keyword. I think it's an accidental convergence of behavior. By your definition, additionalProperties would not be an argument keyword and requireAllExcept would be, even though the two keywords work exactly the same way.

I think it would be fair to call something an argument keyword if it has no meaning outside the presence of another keyword (Examples: minContains, then). This is not the case for requireAllExcept. Using it in a schema by itself still has meaning, just not a very useful meaning. It's equivalent to { "required": [] }. The schema would always validate, but not because it's ignored. There just happens to not be a value that will make the assertion false.


Let me take a step back and acknowledge that I share your concerns about adding a new keyword that breaks keyword independence. In fact, I hate it. I'd rather be finding ways to remove or redefine all such keywords, but weighing the pros and cons, I think requireAllExcept is the least bad option right now. You can look at the same set of tradeoffs and come to a different conclusion just by weighing things differently than I do. That's fine. Here's a list of the things I've taken into account before championing this approach (I'm sure I'm forgetting some, but these should be the most important points).

  1. The problem of maintaining schemas with a long list of required properties is one of the most common criticisms we get about JSON Schema.
  2. requireAllExcept doesn't solve the problem of duplicating property names declared in properties in all cases, but it does reduce the burden. requiredProperties doesn't require duplicating property names.
  3. requiredProperties would create two ways to define properties.
  4. requiredProperties does more than one thing and is both an applicator and an assertion.
  5. requiredProperties would require changing the definitions of keywords that depend on properties such as additionalProperties while requireAllExcept can be added without modifying the behavior of any existing keywords.
  6. requireAllExcept breaks keyword independence, but does so in a very familiar way (works just like additionalProperties).

@jdesrosiers jdesrosiers changed the base branch from draft-next to main July 8, 2022 15:32
@jdesrosiers
Member Author

The draft-next branch has been merged and is now closed. The merge target for this PR has been changed to main. Here are the recommended steps to get your branch rebased properly.

  1. Make sure your remote for the json-schema-org/json-schema-spec repo is up-to-date. (Example: git fetch upstream).
  2. Rebase your commits onto main. (Example: git rebase --onto upstream/main abcd123~1 (replace abcd123 with the commit hash of the first commit in your PR)).
  3. Force push the rebased branch to your fork. (Example: git push --force origin my-branch).

@jdesrosiers jdesrosiers force-pushed the require-all-except branch from 66d4638 to e8c3583 Compare July 8, 2022 15:42
@jdesrosiers jdesrosiers force-pushed the require-all-except branch from e8c3583 to 37eacba Compare July 8, 2022 15:44
@gregsdennis
Member

gregsdennis commented Aug 1, 2022

The problem I have with how requireAllExcept is being described comes when trying to fit this into @handrews' behaviors model, which aims to prevent keywords from "looking into" other keywords. properties generates an annotation of the names of the properties it evaluates (which is the intersection of the keys in its value and the keys in the instance).

(from @awwright's examples above)

{
   "properties": {
      "name": {"type": "string"},
      "comment": {"type": "string"}
   },
   "requireAllExcept": [
      "comment"
   ]
}

That means that for { "name": "foo", "bar": "baz" }, properties in this schema would generate [ "name" ]. This means that requireAllExcept must look into properties in order to see which properties are defined that aren't in the annotation.

A possible solution to this would be to have properties emit a different form of annotation that includes all of the defined property names plus information on which ones were found. Something like { "name": true, "comment": false } would work, but other forms could do. Then requireAllExcept would just need to look at the annotation of properties, find all of the false values, and verify that all of those are in its own list. (additionalProperties, etc, would only need to look at the keys of this new output from properties.) This, then, fits into the model where annotations are the sole communication lines between keywords.

Also with this approach additionalProperties, etc, would need to be updated for the new properties annotation shape, but I think this is a minimal change.
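A minimal Python sketch of this annotation-based idea, under stated assumptions: the annotation shape { name: found? } is the one floated in this comment, not anything the specification defines, and the function names are hypothetical. A declared name that was not found in the instance is acceptable only if it appears in the exception list; "patternProperties" and subschema validation are ignored.

```python
def properties_annotation(schema: dict, instance: dict) -> dict:
    """Hypothetical annotation: every declared name mapped to whether it was found."""
    return {name: name in instance for name in schema.get("properties", {})}


def require_all_except(exceptions: list, annotation: dict) -> bool:
    """Uses only the annotation: each missing name must be a listed exception."""
    missing = [name for name, found in annotation.items() if not found]
    return all(name in exceptions for name in missing)


def additional_property_names(annotation: dict, instance: dict) -> list:
    """What "additionalProperties" would apply to: keys absent from the annotation."""
    return [name for name in instance if name not in annotation]


schema = {
    "properties": {"name": {"type": "string"}, "comment": {"type": "string"}},
    "requireAllExcept": ["comment"],
}
instance = {"name": "foo", "bar": "baz"}

annotation = properties_annotation(schema, instance)
print(annotation)  # {'name': True, 'comment': False}
print(require_all_except(schema["requireAllExcept"], annotation))  # True
print(additional_property_names(annotation, instance))  # ['bar']
```

Neither consumer reads the "properties" keyword value directly, which is the property the behaviors model is after.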

@jdesrosiers
Member Author

@gregsdennis You're right, this should probably be defined based on annotations (although I still think that approach needs to be revisited) and the current annotation behavior of properties is insufficient for requireAllExcept to make its assertion.

From a classification point of view, this keyword should work exactly like additionalProperties except that it's not an applicator.

@gregsdennis
Member

This PR needs to be rewritten as a proposal document. See #1450 for an example.

Closing this.

Successfully merging this pull request may close these issues.

Suggestion: optional keyword to complement required
7 participants