Split Overview into the two specific use cases #1370
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -125,57 +125,51 @@ | |
</t> | ||
</section> | ||
|
||
<section title="Overview"> | ||
<!-- JSON Schema accomplishes two objectives, which each get their own section. --> | ||
<section title="Validation"> | ||
<t> | ||
This document proposes a new media type "application/schema+json" to identify a JSON | ||
Schema for describing JSON data. | ||
It also proposes a further optional media type, "application/schema-instance+json", | ||
to provide additional integration features. | ||
JSON Schemas are themselves JSON documents. | ||
This, and related specifications, define keywords allowing authors to describe JSON | ||
data in several ways. | ||
A JSON Schema document describes a validator (also known as a "recognizer" or "acceptor") which classifies a provided JSON document as "accepted" or "rejected." | ||
Does the schema describe a validator? I would expect people to think of the "validator" as the implementation, not the document.

Yeah that makes sense... There's a sense in which these two uses are actually the same: the "validator implementation" is just a generic form of validator that is configurable. If I have a schema, it makes no difference whether the program is written or compiled to work only with that schema, or is generic and configured at runtime. Is there a better name for "the program that tests an input against some specific schema"?

I don't think you understand my point. Colloquially, the "validator" is the implementation, not the schema. I think we need to stick with this. Saying the schema itself is the validator will be confusing. A validator evaluates JSON against a schema. The schema is no more than configuration.

I believe I see the point you're making, but I would add that this is similar to how we discuss compilers and interpreters. You're pointing out a definition of "validator" that functions like an interpreter: there's a library that reads the schema (the source code), then uses this interpretation to validate JSON. But you can also compile source code to a program, and run the program directly. In this paradigm, there is no interpreter (what is usually called the validator), but the compiled program is still a "validator" (a thing that performs validation). It just has no concept of a schema (any more than a compiled C program can parse C). So with JSON Schema, the schema is not the validator (as such), but I think you can say it describes a validator.

I see where you're coming from, but never in my experience with this project have we used "validator" that way. It has always been used to mean the implementation. At best, this reads weird.

@Julian If I compile a schema or curry away the schema argument, leaving an executable that only reads an instance, what terminology should we use for the compiler, and for the program/function it outputs? I ask because, in my opinion, the function that accepts the instance would be the "validator", not the compiler. And I argue this usage is entirely consistent with most "validator" libraries that are more like interpreters (they both parse the schema and validate instances, in a single package).

I don't believe we need terminology for such a concept in the spec at all (and certainly not at this point in time). What we use today is fine: "implementation", which refers to the executable program capable of doing things with schemas.

This I agree with, but it doesn't follow from this that the schema is a validator. The schema is still just "configuration" (if you want to call it that). It still goes through a library/application, and you get an output. It's just that your example also produces an intermediate output of an executable function that represents a specific schema. The system is inputting the JSON Schema (most likely as JSON or YAML text) and an instance and getting out whether the instance is valid according to that schema. That "compile" step is an intermediate implementation detail that doesn't need to be covered in the spec. The spec needs to concern itself with one thing:
Anything an implementation does to get from input to output is necessarily beyond the scope of the spec.

I see, this isn't what I intended to convey. By saying "the schema describes a validator" I think that would disconnect the schema (the description) from the validator (the actual process). Is a different word in order here, or some additional explanation ("the schema describes the behavior of a validator")?

I don't think it's necessary to say that at all. A schema describes a set of constraints and annotations that can be applied to an instance. That's it. There's no need to bring in implementations of any form.
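The two usages debated in this thread can be sketched in code. The following is only an illustration under stated assumptions, not part of the PR and not the API of any real JSON Schema library: `evaluate` is a hypothetical generic function that takes both a schema and an instance (the "interpreter" style), and `compile_schema` is a hypothetical helper that fixes the schema argument and returns a function of the instance alone. The sketch handles only the `type` and `required` keywords, and only a few type names.

```python
from functools import partial
from typing import Any, Callable

# Hypothetical toy subset: maps a few JSON Schema type names to Python types.
_TYPES = {"object": dict, "array": list, "string": str, "boolean": bool}


def evaluate(schema: dict, instance: Any) -> bool:
    """Generic evaluation: both the schema and the instance are inputs."""
    expected = schema.get("type")
    if expected is not None and not isinstance(instance, _TYPES[expected]):
        return False
    # "required" only constrains objects; other instance types pass it.
    if isinstance(instance, dict):
        for name in schema.get("required", []):
            if name not in instance:
                return False
    return True


def compile_schema(schema: dict) -> Callable[[Any], bool]:
    """Fix the schema argument, leaving a function of the instance alone."""
    return partial(evaluate, schema)


person_schema = {"type": "object", "required": ["name"]}
accepts_person = compile_schema(person_schema)

print(evaluate(person_schema, {"name": "Ada"}))  # generic form, schema passed in
print(accepts_person({"name": "Ada"}))           # schema argument curried away
print(accepts_person({"age": 36}))               # "name" is missing
```

In both forms the observable behavior is the same: a schema and an instance go in, and a result comes out. Which of these pieces should be called the "validator" is the open question in the thread.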
||
It supports "structural validation" (context-free grammars), and certain more complicated conditions. | ||
Validation follows JSON semantics, so two documents that are value-equal, but vary only by character escapes, property ordering, or whitespace, will validate with the same result. | ||
</t> | ||
<t> | ||
JSON Schema uses keywords to assert constraints on JSON instances or annotate those | ||
instances with additional information. Additional keywords are used to apply | ||
assertions and annotations to more complex JSON data structures, or based on | ||
some sort of condition. | ||
With respect to a given schema, an input document accepted by that schema is called an "instance." | ||
A JSON Schema may be used to specify sets of JSON documents, by referring to the set of all possible instances of that schema. | ||
</t> | ||
<t> | ||
To facilitate re-use, keywords can be organized into vocabularies. A vocabulary | ||
consists of a list of keywords, together with their syntax and semantics. | ||
A dialect is defined as a set of vocabularies and their required support | ||
identified in a meta-schema. | ||
A condition for accepting a document is called an "assertion". | ||
Assertions impose constraints that instances must conform to. | ||
Given a schema and an instance, the schema "accepts" an input whenever all the assertions are met, | ||
and the schema "rejects" when any of the assertions fail. | ||
"rejects" needs an object, i.e. what is being rejected?

The input JSON document, as was mentioned in 'the schema "accepts" an input whenever...'

Yes, but grammatically, you need to repeat the object.
||
Schemas without any assertions accept all JSON documents. | ||
</t> | ||
<t> | ||
JSON Schema can be extended either by defining additional vocabularies, | ||
or less formally by defining additional keywords outside of any vocabulary. | ||
Unrecognized individual keywords simply have their values collected as annotations, | ||
while the behavior with respect to an unrecognized vocabulary can be controlled | ||
when declaring which vocabularies are in use. | ||
Assertions are encoded into a JSON Schema using "keywords," described below. | ||
</t> | ||
</section> | ||
|
||
<section title="Annotation"> | ||
<t> | ||
This document defines a core vocabulary that MUST be supported by any | ||
implementation, and cannot be disabled. Its keywords are each prefixed | ||
with a "$" character to emphasize their required nature. This vocabulary | ||
is essential to the functioning of the "application/schema+json" media | ||
type, and is used to bootstrap the loading of other vocabularies. | ||
A schema may also describe an "annotator," a way to read an instance and output a set of "annotations." | ||
Is the schema describing an annotator? (same as "validator" above)

Yeah, similar situation: I have a schema, and I want to use it to compile a program that takes a JSON input and returns an output format. It's not otherwise configurable; maybe this is an HTTP service. What do I call that program?
||
Annotations can be any output metadata about that instance. | ||
</t> | ||
<t> | ||
Additionally, this document defines a RECOMMENDED vocabulary of keywords | ||
for applying subschemas conditionally, and for applying subschemas to | ||
the contents of objects and arrays. Either this vocabulary or one very | ||
much like it is required to write schemas for non-trivial JSON instances, | ||
whether those schemas are intended for assertion validation, annotation, | ||
or both. While not part of the required core vocabulary, for maximum | ||
interoperability this additional vocabulary is included in this document | ||
and its use is strongly encouraged. | ||
For example, you can document the meaning of a property, | ||
suggest a default value for new instances, | ||
generate a list of hyperlinks from the instance, | ||
or declare relationships between data. | ||
Applications may make use of annotations to query for arbitrary information; | ||
for example, to extract a list of names from a document with a known structure. | ||
Annotations may also describe values within the instance in a standard way; | ||
for example, extracting a common type of hyperlink from many different types of documents, using a different schema for type. | ||
</t> | ||
<t> | ||
Further vocabularies for purposes such as structural validation or | ||
hypermedia annotation are defined in other documents. These other | ||
documents each define a dialect collecting the standard sets of | ||
vocabularies needed to write schemas for that document's purpose. | ||
Like assertions, the instructions for producing annotations are encoded in a schema using keywords. | ||
Output is only defined over valid instances, | ||
so annotations are not returned until the input has been validated. | ||
However, not all valid input is meaningful or true to a given application. | ||
That is, if you process an arbitrary instance with nonsense data, | ||
the resulting annotations may not necessarily be true, even though the input is valid. | ||
Comment on lines +170 to +172

The use of "true" here is odd. What does it mean for an input to be "true" to an application?

Yeah, I struggled a bit with how to phrase this. I'm trying to explain the phenomenon of "garbage in garbage out" and that the assertions don't have to be 100% completely defined.

I think dropping "true" and sticking with "meaningful" is the right way here.
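The "garbage in, garbage out" point can be illustrated with a small sketch. Everything here is an assumption made for illustration: the schema, the `birthYear` property, and the `annotate` function are hypothetical, and the code covers only an `integer` type assertion and a `title` annotation rather than real JSON Schema evaluation.

```python
from typing import Any, Optional

# Hypothetical schema: one assertion ("type") and one annotation ("title").
schema = {
    "type": "object",
    "properties": {
        "birthYear": {"type": "integer", "title": "Year of birth"},
    },
}


def is_integer(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def annotate(schema: dict, instance: Any) -> Optional[dict]:
    """Return {property: title} for a valid instance, or None if invalid."""
    if not isinstance(instance, dict):
        return None
    titles = {}
    for name, subschema in schema["properties"].items():
        if name not in instance:
            continue
        if subschema.get("type") == "integer" and not is_integer(instance[name]):
            return None  # an assertion failed, so no annotations are returned
        titles[name] = subschema["title"]
    return titles


# Valid and plausible: the annotation describes a real year of birth.
print(annotate(schema, {"birthYear": 1990}))
# Still valid, because the only assertion is "integer", but the data is nonsense.
print(annotate(schema, {"birthYear": -40000}))
# Invalid: a string is not an integer, so no annotations are produced.
print(annotate(schema, {"birthYear": "1990"}))
```

The second call is the case under discussion: the instance passes every assertion the schema makes, so annotations are returned, yet they are not meaningful to an application that expects a plausible year.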
||
</t> | ||
</section> | ||
|
||
|
@@ -394,6 +388,42 @@ | |
</t> | ||
</section> | ||
<section title="Schema Vocabularies"> | ||
<t> | ||
To facilitate re-use, keywords can be organized into vocabularies. A vocabulary | ||
consists of a list of keywords, together with their syntax and semantics. | ||
A dialect is defined as a set of vocabularies and their required support | ||
identified in a meta-schema. | ||
</t> | ||
<t> | ||
JSON Schema can be extended either by defining additional vocabularies, | ||
or less formally by defining additional keywords outside of any vocabulary. | ||
Unrecognized individual keywords simply have their values collected as annotations, | ||
while the behavior with respect to an unrecognized vocabulary can be controlled | ||
when declaring which vocabularies are in use. | ||
</t> | ||
<t> | ||
This document defines a core vocabulary that MUST be supported by any | ||
implementation, and cannot be disabled. Its keywords are each prefixed | ||
with a "$" character to emphasize their required nature. This vocabulary | ||
is essential to the functioning of the "application/schema+json" media | ||
type, and is used to bootstrap the loading of other vocabularies. | ||
</t> | ||
<t> | ||
Additionally, this document defines a RECOMMENDED vocabulary of keywords | ||
for applying subschemas conditionally, and for applying subschemas to | ||
the contents of objects and arrays. Either this vocabulary or one very | ||
much like it is required to write schemas for non-trivial JSON instances, | ||
whether those schemas are intended for assertion validation, annotation, | ||
or both. While not part of the required core vocabulary, for maximum | ||
interoperability this additional vocabulary is included in this document | ||
and its use is strongly encouraged. | ||
</t> | ||
<t> | ||
Further vocabularies for purposes such as structural validation or | ||
hypermedia annotation are defined in other documents. These other | ||
documents each define a dialect collecting the standard sets of | ||
vocabularies needed to write schemas for that document's purpose. | ||
</t> | ||
<t> | ||
A schema vocabulary, or simply a vocabulary, is a set of keywords, | ||
their syntax, and their semantics. A vocabulary is generally organized | ||
|
@@ -1357,7 +1387,7 @@ | |
specification and the companion Validation specification. | ||
</t> | ||
</section> | ||
<section title="Non-inheritability of vocabularies "> | ||
<section title="Non-inheritability of vocabularies"> | ||
<t> | ||
Note that the processing restrictions on "$vocabulary" mean that | ||
meta-schemas that reference other meta-schemas using "$ref" or | ||
|
The "accept"/"reject" terminology is new. I see you use it later in the PR as well, but it's not used throughout the document.

It's new to this spec, but it is used widely outside JSON Schema and may help new readers understand what is going on. I'm going to suggest we should use accept/reject more often (it greatly simplifies the phrasing of many sentences), but that'll be an issue for later.

Can you remove that language from this PR and open an issue for that change, please?
I'm not opposed to it, but I think vernacular should be an agreed-upon change, not something that's just snuck in.

Well my point is there's a certain segment who may see our language as new, and "accepts" is the existing term they're familiar with. I think we should use a variety of language to introduce and define the concepts, and then we can use our choice of term for the rest of the document. Is there a problem with this line of thinking?

I don't have a problem with introducing them, but this PR doesn't seem the place for it. I'd like to get the opinions of the other maintainers.

Ok, though my argument is that not every part of the intro has to be helpful to everyone; it has to be written so that the widest possible audience will understand what JSON Schema accomplishes for them.
The two biggest audiences, I think, will be application developers ("I want a DSL for checking JSON, instead of doing it in code") and people familiar with formal grammars ("I know what ABNF and DTDs are, I want this for JSON").
I think you'll find that other similar technology uses technical terms much more heavily than I'm suggesting we do.
I looked at the introduction for ABNF, which I found far too technical for most people to understand. It says in technical terms that it's a formal syntax, but doesn't really describe why you'd want to use it at all, or use it over other languages.
XML DTD also talks about formal grammars, validators, and uses the accepts/rejects terminology; but it too is somewhat technical and it's not immediately obvious to me who the target audience is.
So what I'm looking for is (1) should the formal grammar audience be accommodated in the introduction? (Since ABNF and DTDs both seem to be written exclusively for this audience, I would suggest this is important.)
And (2) if we should accommodate the formal grammar audience, is there a better way to write it so that it's more helpful for them, and less confusing to others?

This is a code review, where saying "I don't find it helpful" is to say "I believe you should not add this, it isn't helpful to a wider audience", not simply offering my own anecdote about my personal reading.
Section 2.8 of a document is wildly different from being literally the first paragraph of the actual content of the document. I also don't see the "accepts/rejects" terminology in the section you linked. It uses "valid", as we already do.
You already have my own opinion, now three times: no, we should not.

Ok, I ask because saying "I don't find it helpful" is suggestive of a personal opinion without projecting what others will think; saying "I don't believe this will be helpful" is a general observation of the sort I'm looking for.
I'm going to have to think about what else to say, if it's not immediately obvious that formal grammars are related here, as that's the formal study of what JSON Schema is fundamentally doing.
XML does not use the term "validates" (in the third person singular) to refer to an outcome (and actually it doesn't use it in that form at all). It uses "validate" to describe a process, "accept"/"matches" and "reject" to describe outcomes of that process, and "valid" to describe documents that have been accepted by the process, but nothing like "validates successfully" as we do.

At the risk of quoting myself, the comment I left before that was quite clear on which I was intending, please don't ignore it:

I'm going to bow out of this PR as well, as I believe I've communicated that I'm -1 on the changes in their current form, and that there might be smaller changes that I'm more positive on, but that they're sufficiently far away from this PR in its current state that it's not a matter of rewording a small bit here and there. It bears repeating, I suppose, that that's just my vote, and others may disagree of course, though obviously I've landed on this PR after it sounds like Greg was expressing similar doubts.