Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow true and false for all schemas (except maybe the root schema) #101

Closed
handrews opened this issue Oct 17, 2016 · 33 comments
Closed

Allow true and false for all schemas (except maybe the root schema) #101

handrews opened this issue Oct 17, 2016 · 33 comments

Comments

@handrews
Copy link
Contributor

Currently, additionalItems and additionalProperties allow the booleans true and false as values, with the following equivalences:

true is equivalent to {}
false is equivalent to {"not": {}}

These are helpful because they more clearly indicate the intent of "this can't exist" and "this can exist in any form", particularly when it comes to false. They are purely syntactical convenience, but they become very convenient when doing more complex schema algebra to work with complicated schema re-use.

We should allow true and false anywhere a schema is allowed (although requiring an object schema for the root would be reasonable, if we don't want booleans on their own to be legal schema documents).

This will allow expressing things like "all properties except..." and "only these properties with any value..." very clearly. In particular, false is much more immediately understandable than {"not": {}}. Allowing true in all schema places also means that we can say {"not": true} == false.

{
    "properties": {
        "foo": false,
        "bar": false
    }
}
{
    "properties": {
        "foo": true,
        "bar": true,
    }
    "additionalProperties": false
}

It also makes it a bit more clear to build a schema that forbids additional properties based on two sets of properties combined (I've inlined the constituent schemas instead of using $ref which would be more normal):

{
    "definitions": {
        "firstSet": {
            "properties": {
                 "a": {...},
                 "b": {...},
                 "c": {...},
             }
        },
        "lastSet": {
             "properties": {
                 "x": {...},
                 "y": {...},
                 "z": {...}
             }
        }
    },
    "allOf": [
        {"$ref": "#/definitions/firstSet"},
        {"$ref": "#/definitions/lastSet"},
    ]
    "properties": {
        "a": true,
        "b": true,
        "c": true,
        "x": true,
        "y": true,
        "z": true
    },
    "additionalProperties": false
}

This reads a bit more clearly as "allow only these things, which have their real schemas defined in the allOf".

Of course this is just a syntax of convenience, but that's true of the existing true and false use as well. It's just as convenient in other places.

@Relequestual
Copy link
Member

It looks like you're wanting to add a different route to existing functionality. As you say, for convenience. Is that the case? Surely what's needed is better documentation in that case?

@handrews
Copy link
Contributor Author

Actually I'm trying to make the behavior consistent. You can use booleans as schemas in some places, and I was surprised when it turned out I could not specify a schema of false elsewhere. "{not: {}}" isn't incredibly intuitive no matter how much you document it- I assume that's why "additionalProperties" and "additionalItems" allow it. Consistency here is both nice for usability and cheap (in terms of implementation/performance).

@handrews
Copy link
Contributor Author

When you're messing with more complex schema algebra for refactoring or figuring out how to combine and then afterwards forbid additional properties, it makes things more clear. Also when negating various combinations of "properties", "patternProperties", "additionalProperties" (this comes up when trying to do the above in an automated fashion as a pre-processor as described on a thread on the google group a while back- I was calling it "expands" at the time although I will eventually file it as "combine" after I've gotten various schema algebra things filed to support it).

@handrews
Copy link
Contributor Author

Another way to put it: consistency is a good design quality. We should only be inconsistent when there is a compelling reason to do so (complexity of implementation, performance impact).

Currently we have two ways of specifying child schemas- if the keyword starts with "additional" you can use booleans, otherwise you can't, even though the convenience afforded by booleans is equally applicable everywhere else. Let's just have one definition of what a child schema is.

@Relequestual : What compelling reason do you see to not be consistent?

@Relequestual
Copy link
Member

It makes sense to seperate values that are not allowed to a seperate property to my mind.

A not on additionalProperties. I think the intension, when it's an object, is to allow for faster processing in terms of the way the library will validate, looking at https://github.com/json-schema/json-schema/wiki/Additionalproperties. I could be wrong.

I do see your point though, it is inconsistent to not allow for boolean properties.

Maybe the additionalProperties property was created to allow defining values which don't have a specific data type. I don't have a problem with that, as it encouridges defining of data types for properties. If you're not interested in the types of your json values, then it's questionable what you're doing, or even when you need json-schema.

It took me a little while to understand exactly what you were explaining, and why it's an issue of consistency from your perspective.

@awwright any thoughts on if the current state of this is by design or just an artifact of development over time?

@handrews
Copy link
Contributor Author

If a boolean on additionalProperties is intended to speed up processing, why is that benefit not applicable elsewhere? You happen to do different things with the result of additionalProperties, but the speed/readability benefits of false over {"not": {}} are identical anywhere you can have a child schema. There's no difference at all that I can think of (an implementation can internally safely translate {"not": {}} to false so it's not like it actually has to run negated validation).

It makes sense to seperate values that are not allowed to a seperate property to my mind.

Well, we don't have that property right now. Feel free to propose it :-)

Right now, the way to say that a property (or anything) is forbidden is to give it an impossible schema. false is the impossible schema that most clearly communicates that intent, so forbidding it in some places but not all is just weird.

@Relequestual
Copy link
Member

It makes sense to seperate values that are not allowed to a seperate property to my mind.

Well, we don't have that property right now. Feel free to propose it :-)

I meant, not. I was saying not makes sense to me as a property name.

In terms of speed benefit. It looks like according to https://github.com/json-schema/json-schema/wiki/Additionalproperties#sample-algorithm, the streaming validation means that once it's valid, further checks (like additionalProperties) is not looked at. Only if initial schema validation fails, additionalProperties is looked at. I could be wrong, but that's my take on the (possible) justification.

@handrews
Copy link
Contributor Author

By "initial schema validation" do you mean validation of the object itself (meaning its property names) without regard for child value? If so, additionalProperties is checked first, because unless it is set to an impossible schema, validation of the object always succeeds.

false is just shorthand for "intentionally impossible schema"

Anywhere we can use false (and true) allows the validator to skip examining the child value. Because of the nature of additionalProperties there are further implications to that, but as far as looking at the child schema the benefits are identical.

I'm not 100% sure what you mean by not as a property name? It's already a schema keyword? My point with false is that it is more clear and simpler to implement than anything involving schemas negated with not.

@Relequestual
Copy link
Member

not is already a key word: http://json-schema.org/latest/json-schema-validation.html#anchor91

Regarding your comment on my comment: I linked to the sample algorithm. It reads to me like, properties are checked first, and if it isn't valid after that, then you check additionalProperties and patternProperties. I could be wrong, but read that, it made me feel that maybe this was the design decision arround having additionalProperties as something more than just allowing a seperation of an objects properties. From what I read, it also allows you to define properties without type. JSON Schema assumes that you'll normally want to define types, which I think is a good thing to encouridge.

@handrews
Copy link
Contributor Author

We're having a lot of trouble communicating today :-(

Admittedly, one sentence was really misleading: When I said "It's already a schema keyword?" I did not mean "Is it already a schema keyword?" I meant "It's already a schema keyword, what do you want to do differently?"

We are both in agreement that not is already a keyword. What I don't understand is what you want to do with it other than what it already does, and how that relates to this proposal. Or even what things that it already does that you think affect this proposal.

It reads to me like, properties are checked first, and if it isn't valid after that, then you check additionalProperties and patternProperties.

That's the algorithm for verifying child schemas, which is why I asked what "initial schema validation" meant. The algorithm for verifying object instances is in section 5.18 except that now I see that @awwright changed it and broke it's independence :-(

It used to say that if additionalProperties was not false, validation of the keyword always succeeds. additionalProperties should not have any dependencies for object validation, only for child validation.

I'm going to file a separate issue on that as I think it is a problem. I am hoping we can have a more thorough and detailed review before the next draft is published. This went through a ton of changes and then was shipped a few days later- I'm still finding surprises like this.

@Relequestual
Copy link
Member

I think another part of our bad communication is possibly due to my lack of understanding of the spec.

I wasn't suggesting doing anything more with not.
I think I'm too jet lagged to really focus on this any more. I'll see if I can work out anything further next week! =/ sorry.

@awwright
Copy link
Member

I don't see any obvious problems with this, it might help remove some redundant language. Just need to think through the possible effects of allowing true/false anywhere a schema is.

@awwright awwright added this to the draft-6 milestone Oct 19, 2016
@handrews
Copy link
Contributor Author

I think I'm too jet lagged to really focus on this any more. I'll see if I can work out anything further next week! =/ sorry.

No worries- thanks for your patience, hope I wasn't too annoying. Sleep well!

@epoberezkin
Copy link
Member

👍

@Relequestual
Copy link
Member

On reflection, I think I'm OK with this

@epoberezkin
Copy link
Member

epoberezkin commented Oct 31, 2016

I can repeat that I like it too - we allow booleans where the schema is possible already, so allowing it everywhere where a schema is required means increasing consistency, which is a good thing. It would allow to simplify description of additionalItems and additionalProperties not to mention booleans at all and always talk about schemas. And instead we add a general note saying that where the schema is required true can be used instead of {} and false instead of {not:{}}

@handrews
Copy link
Contributor Author

This might qualify as "rough consensus" now :-) I can write a pull request, although it would be good to get PR #111 sorted first as adding the true/false thing will affect some of the same wording.

@DimDragon
Copy link

Consistency is worthy goal, however I'm bit scared by this statement:

We should allow true and false anywhere a schema is allowed

How we should treat this for validation?

"dependencies": false

I also find it hard to justify schema like:

{
  "type": "array",
  "items": false
}

Please also note that if some of merge/patch semantics are adopted, this provides needed support for restricting/expanding properties (second sample)

@handrews
Copy link
Contributor Author

"dependencies": false

dependencies doesn't take a schema, it takes an object which maps property names to either schemas or other property names. So you could have "dependencies": {"foo": false} which means "if this schema has a property of "foo", it fails validation". There's no reason you would do this, as you could get the same effect more clearly with "properties": {"foo": false}. But there are plenty of things you can do in JSON Schema that don't make sense, so that's not a reason to avoid this. It's just a reason not to use it with dependencies.

Similarly, with your items example, why is that a problem? That schema just says "this thing can't be an array" (because items only applies to arrays). I'd rather write that as {"not": {"type": "array"}}, but there are lots of other absurd ways to write it whether you can put false there or not.

So, what is the real problem here? In all cases, you can just replace false with {"not": {}}, so there is no functional difference with any of these. They're just easier to read. They're just as useful, useless, or nonsensical as they are with {"not": {}}.

@epoberezkin
Copy link
Member

@handrews good explanation.

Although the reason to use "dependencies": {"foo": false} could be the different reporting, as in this case the whole object fails validation and in case "properties": {"foo": false} property foo fails validation, so the error would be reported on a different level. In some cases this distinction can be useful.

@epoberezkin
Copy link
Member

epoberezkin commented Oct 31, 2016

And currently the same distinction can be achieved too by using {"not": {}} instead of false.

@epoberezkin
Copy link
Member

@handrews the implication of this change is that you would have to add on the high level that schema is either an object or a boolean value (in two degenerate cases - always passes/fails). I don't mind that, that's just an observation.

@DimDragon
Copy link

@handrews : sorry wrong sample indeed on dependencies. I meant to say "dependencies": {"foo": false} and as @epoberezkin noted, it is not really clear what the goal is in term of validation in that case.

Also my array sample you read as:

'this thing can't be an array',

where I would read it more programmatically by saying: 'this is an array, which allows no items'. So what kind of 'array' is this. I accept your note that you can write the same now with not semantics.

Don't get me wrong, I'm not going against the proposal, just trying to cover other angles, which may cause misinterpretation.

@handrews
Copy link
Contributor Author

Don't get me wrong, I'm not going against the proposal, just trying to cover other angles, which may cause misinterpretation.

Sounds good. We want to get the phrasing right and this helps. Thank you for clarifying that you are not against the proposal- it can be hard to get these resolved otherwise as I think we all tend to assume that questions are at least potentially objections.

@epoberezkin
Copy link
Member

'this is an array, which allows no items'

{"items": false} is the same as {"items": [], "additionalItems: false} or {"maxItems": 0}.
The first is arguably more expressive :)

Who needs empty arrays though :)

@Relequestual
Copy link
Member

I need empty arrays sometimes! Shows you've looked for a thing but have none, rather than omitting.

handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 8, 2016
@jschnaitmann
Copy link

Just for some clarification on my side: would that whole 'false' functionality not be unnecessary if the behavior for an absent additionalProperties property would just default to 'false'? What is the rationale behind additionalProperties defaulting to true? Doesn't this kind of contradict the whole 'schema definition' idea (just allowing everything by default)?

I.e. no additionalProperties means false, additionalProperties: {} means true?

Consequently, the same would be true for an array - { "type": "array" } would define that empty array, in opposition to { "type": "array", "items": { "type": "string" } } and, the broadest definition { "type": "array", "items": {} }.

@handrews
Copy link
Contributor Author

One key principle is of JSON Schema is that the blank schema {} validates everything. Therefore additionalProperties must default to a blank schema as well. The use of true and false don't change anything about the behavior of JSON Schema, just its readability and clarity of intent (and in a few cases, makes it easier to optimize implementations).

@jschnaitmann
Copy link

What's the rationale behind this principle? I do not see any reference to this anywhere in the specification. In fact, the specification, IMO, says something different, as the word 'may' is used?

 If "additionalProperties" is absent, it may be considered present with an empty schema as a value.

https://tools.ietf.org/html/draft-wright-json-schema-validation-00#section-5.18
It is however mentioned directly on json-schema.org, without any reasoning however.

It seems like this is a very basic question - should {} validate anything, or nothing (i.e. only empty objects)? Besides pros and cons for/against one or the other - in a very general sense:

  • the former approach makes sense if JSON schemas are mostly used to describe some common sub-sets of otherwise different, larger data-sets
  • the latter makes more sense if JSON schemas are mostly used to describe whole data-sets, with unstructured data, if present, normally being confined to sub-properties (having additionalProps set to {}/true).

Without having data about how JSON schema is actually used, I'd still argue that the second approach is more intuitive - on the one hand, because it matches the 'traditional' schema definitions (e.g. for a SQL database) more closely, and secondly because schemas are more expressive/explicit. If e.g. additionalProperties is specifically set to true/{}, someone not familiar with JSON schema will probably understand the meaning more easily - or at least, could explicitly search for an explanation of that keyword.

@handrews
Copy link
Contributor Author

It seems like this is a very basic question - should {} validate anything, or nothing

I could have sworn this is described somewhere, but I can't find it right now. If it is really missing, then it falls under #55 (Clearly document validation principles) and I will make a PR for it shortly to add it to section 4 of the validation spec (I just woke up so I might be missing where it's specified right now).

It is definitely the case that {} validates everything. The principle here is that JSON Schema is a constraint-based system where each keyword further constrains the set of instances that will successfully validate against the schema. This principle has been referenced many times in JSON Schema discussions, which is probably why I thought it was in the spec.

The "may" wording (note that lower-case "may" does not have the same formal definition as upper-case "MAY") has been discussed a bit, along with the nature of {}, in #131 , which tells me that we really should clean up that wording. If it's been confusing enough to be filed twice in the past two weeks, it's probably confusing more people than are bothering to show up and file an issue.

@Relequestual
Copy link
Member

@handrews Great discussion. Please mention this issue in your relating PR for comments. I may have some ammendments or suggestions, but no point in laying out my ideas if you'll already cover them, imo.

handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 15, 2016
This addresses issue json-schema-org#55 plus concerns raised in the comments of
issue json-schema-org#101.

I replaced "linearity" with "independence" as I think it is
more general and intuitive.

The general considerations section has been reorganized
to start with the behavior of the empty schema, then explain
keyword independence, and finally cover container vs child
and type applicability, both of which flow directly from
keyword independence.

In draft 04, the wording obscured the connection between
keyword independence and container/child independence.
When we rewrote the array and object keywords to explicitly
classify each keyword as either validating the container
or the child, keyword independence became sufficient to
explain container/child independence.

The list of non-independent keywords has been updated, and
exceptions to the independence of parent and child schemas
have been documented.  Finally, I added a comprehensive example
of the frequently-confusing lack of connection between
type and other keywords.
handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 16, 2016
handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 16, 2016
This addresses issue json-schema-org#55 plus concerns raised in the comments of
issue json-schema-org#101.

I replaced "linearity" with "independence" as I think it is
more general and intuitive.

The general considerations section has been reorganized
to start with the behavior of the empty schema, then explain
keyword independence, and finally cover container vs child
and type applicability, both of which flow directly from
keyword independence.

In draft 04, the wording obscured the connection between
keyword independence and container/child independence.
When we rewrote the array and object keywords to explicitly
classify each keyword as either validating the container
or the child, keyword independence became sufficient to
explain container/child independence.

The list of non-independent keywords has been updated, and
exceptions to the independence of parent and child schemas
have been documented.  Finally, I added a comprehensive example
of the frequently-confusing lack of connection between
type and other keywords.
handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 17, 2016
This paritally addresses issue json-schema-org#55 plus concerns raised in
the comments of issue json-schema-org#101.

I replaced "linearity" with "independence" as I think it is
more general and intuitive.

The general considerations section has been reorganized
to start with the behavior of the empty schema, then explain
keyword independence, and finally cover type applicability.

In draft 04, the wording obscured the connection between
keyword independence and container/child independence.
I thought I needed this primitive type vs child validation
section even with the rewritten keywords, but going over
it now based on feedback, I agree that it is superfluous.

The list of non-independent keywords has been updated to
include minimum/maximum and their "exclusive" booleans.
handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 17, 2016
This paritally addresses issue json-schema-org#55 plus concerns raised in
the comments of issue json-schema-org#101.

I replaced "linearity" with "independence" as I think it is
more general and intuitive.

The general considerations section has been reorganized
to start with the behavior of the empty schema, then explain
keyword independence, and finally cover type applicability.

In draft 04, the wording obscured the connection between
keyword independence and container/child independence.
I thought I needed this primitive type vs child validation
section even with the rewritten keywords, but going over
it now based on feedback, I agree that it is superfluous.

The list of non-independent keywords has been updated to
include minimum/maximum and their "exclusive" booleans.
handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 17, 2016
This implements issue json-schema-org#101

Also add JSON Reference schema for subSchemas.

Since "$ref" is now only allowed as a JSON Reference where a schema
is acceptable, it can now be described in JSON Schema and included
in the "anyOf" that defines legal subschemas.
handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 20, 2016
This implements issue json-schema-org#101

Also add JSON Reference schema for subSchemas.

Since "$ref" is now only allowed as a JSON Reference where a schema
is acceptable, it can now be described in JSON Schema and included
in the "anyOf" that defines legal subschemas.
handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 21, 2016
This paritally addresses issue json-schema-org#55 plus concerns raised in
the comments of issue json-schema-org#101.

I replaced "linearity" with "independence" as I think it is
more general and intuitive.

The general considerations section has been reorganized
to start with the behavior of the empty schema, then explain
keyword independence, and finally cover type applicability.

In draft 04, the wording obscured the connection between
keyword independence and container/child independence.
I thought I needed this primitive type vs child validation
section even with the rewritten keywords, but going over
it now based on feedback, I agree that it is superfluous.
handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 21, 2016
This implements issue json-schema-org#101

Also add JSON Reference schema for subSchemas.

Since "$ref" is now only allowed as a JSON Reference where a schema
is acceptable, it can now be described in JSON Schema and included
in the "anyOf" that defines legal subschemas.
handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 22, 2016
This implements issue json-schema-org#101

Also add JSON Reference schema for subSchemas.

Since "$ref" is now only allowed as a JSON Reference where a schema
is acceptable, it can now be described in JSON Schema and included
in the "anyOf" that defines legal subschemas.
handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 25, 2016
This implements issue json-schema-org#101, including root schemas.

Also add JSON Reference objects as legal schemas in the meta-schema.

Since "$ref" is now only allowed as a JSON Reference where a schema
is acceptable, it can now be described in JSON Schema and included
in the "anyOf" that defines legal subschemas.
@handrews
Copy link
Contributor Author

Posted PR #167 as an alternative to #128 at @awwright's request. The only difference is that #167 supports booleans for root schemas as well as subschemas. See the PR for further details.

handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 29, 2016
This is one alternative for issue json-schema-org#101.

If accepted, the meta-schema change will come separately after
PR json-schema-org#168 is merged.
handrews added a commit to handrews/json-schema-spec that referenced this issue Nov 29, 2016
This is an option for implementing issue json-schema-org#101, this time not just
for subschemas but for all schemas including the root schema.

The meta-schema changes will be added after PR json-schema-org#168 is approved,
at which point the changes for this will be much simpler.
@handrews
Copy link
Contributor Author

handrews commented Dec 5, 2016

#167 is merged which resolves this.

@handrews handrews closed this as completed Dec 5, 2016
handrews added a commit to handrews/json-schema-spec that referenced this issue Dec 5, 2016
This paritally addresses issue json-schema-org#55 plus concerns raised in
the comments of issue json-schema-org#101.

I replaced "linearity" with "independence" as I think it is
more general and intuitive.

The general considerations section has been reorganized
to start with the behavior of the empty schema, then explain
keyword independence, and finally cover type applicability.

In draft 04, the wording obscured the connection between
keyword independence and container/child independence.
I thought I needed this primitive type vs child validation
section even with the rewritten keywords, but going over
it now based on feedback, I agree that it is superfluous.
handrews added a commit to handrews/json-schema-spec that referenced this issue Dec 12, 2016
This paritally addresses issue json-schema-org#55 plus concerns raised in
the comments of issue json-schema-org#101.

I replaced "linearity" with "independence" as I think it is
more general and intuitive.

The general considerations section has been reorganized
to start with the behavior of the empty schema, then explain
keyword independence, and finally cover type applicability.

In draft 04, the wording obscured the connection between
keyword independence and container/child independence.
I thought I needed this primitive type vs child validation
section even with the rewritten keywords, but going over
it now based on feedback, I agree that it is superfluous.
handrews added a commit to handrews/json-schema-spec that referenced this issue Dec 13, 2016
These are the leftover bits of Issue json-schema-org#55 and some clarifications
requested in a comment on issue json-schema-org#101 that have not already been
added in some other PR for some other issue.

These specific chagnes were previously approved in json-schema-org#143, but so
many other things have changed since json-schema-org#143 that most of it was
no longer relevant, so I closed it and started these changes over.

In particular, explaining {} and {"not": {}} is no longer needed
as they are covered while introducing "true" and "false" schemas
in the core specification, so that is no longer repeated in this
change.

Likewise, the parent/child validation descriptions have been
modified in several PRs and no longer has the problems that were
previously a concern.
handrews added a commit to handrews/json-schema-spec that referenced this issue Dec 27, 2016
These are the leftover bits of Issue json-schema-org#55 and some clarifications
requested in a comment on issue json-schema-org#101 that have not already been
added in some other PR for some other issue.

These specific chagnes were previously approved in json-schema-org#143, but so
many other things have changed since json-schema-org#143 that most of it was
no longer relevant, so I closed it and started these changes over.

In particular, explaining {} and {"not": {}} is no longer needed
as they are covered while introducing "true" and "false" schemas
in the core specification, so that is no longer repeated in this
change.

Likewise, the parent/child validation descriptions have been
modified in several PRs and no longer has the problems that were
previously a concern.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants