Restructuring of output formatting #973
Recently, we worked on formalizing the process for embedding schemas of different drafts than the parent schema. If the output format is changed, how would output for embedded schemas work? Currently, it's not a problem to just use 2019-09 for everything because none of the other drafts specify an output format. I think it should definitely not change output format mid-validation. The output format of the top level schema should be used for the whole validation. I don't think this is controversial, but it is a detail that we'll have to address as we make changes to the output format. |
Not sure if I should be piling on here or creating another issue, but this seems like a good time to bring up something that bothers me about the output format. Currently I can see the value of this keyword for human debugging, so I'm not proposing throwing it out entirely, but it should at least be optional and not a URI (just a pointer). In my implementation, I don't include |
I don't think the output format should trim anything. Nodes that are superfluous in one context might be helpful in another. Producing human-readable output should be a process that takes the output as an argument and transforms it into something nice. https://github.com/atlassian/better-ajv-errors is an example of this concept, except it uses the ajv proprietary output format rather than the standard one. I'm hoping they will come out with a version that supports the standard output format so I don't have to make my own. If we trim the output results to improve one way of consuming the output, we might be limiting what can be done for other ways of consuming output. For example, imagine a regex101.com-like online schema validator. It might find some of those "redundant" nodes useful to build visualizations, where they are just noise for text-based human reporting. Therefore, I don't think the spec should be defining an algorithm to reduce returned nodes. We can leave that to third-party tools to decide what works in their domain. It's definitely a problem we as implementors need to work on solving, but not at the spec level. |
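As a rough illustration of the "take the output as an argument and transform it" idea, here is a minimal sketch in Python. It assumes the 2019-09 property names (`valid`, `errors`, `error`, `instanceLocation`). The function names and the sample data are made up for illustration and are not part of any library.

```python
from typing import Any, Dict, Iterator, List


def iter_error_nodes(node: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every node in the output tree that carries an "error" message."""
    if "error" in node:
        yield node
    for child in node.get("errors", []):
        yield from iter_error_nodes(child)


def to_report(output: Dict[str, Any]) -> List[str]:
    """Turn a full, untrimmed result into one readable line per error."""
    if output.get("valid", False):
        return ["valid"]
    lines = []
    for node in iter_error_nodes(output):
        # An empty instanceLocation is the document root.
        where = node.get("instanceLocation") or "(root)"
        lines.append(f"{where}: {node['error']}")
    return lines


# Hypothetical validator output, used only to exercise the functions.
sample_output = {
    "valid": False,
    "keywordLocation": "",
    "instanceLocation": "",
    "errors": [
        {
            "valid": False,
            "keywordLocation": "/minItems",
            "instanceLocation": "",
            "error": "Expected at least 3 items but found 2",
        },
        {
            "valid": False,
            "keywordLocation": "/items/$ref/required",
            "instanceLocation": "/1",
            "error": "Required property 'y' not found.",
        },
    ],
}

for line in to_report(sample_output):
    print(line)
```

The validator output itself is left untouched, so a different consumer (a visualizer, say) can still read the nodes that this reporter ignores.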
This is correct as is. You will always have a location relative to the root schema, which is what Additionally,
This is why we're discussing it. A lot of times, you just want the problem. Other times, you'd like the details. We need to isolate the rules around these scenarios. @ssilverman has some logic that he's implemented in his library already that may serve as a starting point. |
In my implementation, I omit absoluteKeywordLocation from the error/annotation output iff it is identical to keywordLocation (that is, if no $ref was traversed while getting to this location, AND there is no canonical URI provided for the schema) -- I agree that it is helpful to have both pieces of information. |
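The rule described above (drop `absoluteKeywordLocation` only when it is identical to `keywordLocation`) could be sketched as a post-processing step like the following. This is not the commenter's actual code. The function name is hypothetical, and the sketch assumes child nodes live under `errors` or `annotations`.

```python
from typing import Any, Dict

# Assumed names of the properties that hold child output units.
CHILD_KEYS = ("errors", "annotations")


def drop_redundant_absolute(node: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the node without absoluteKeywordLocation
    wherever it is identical to keywordLocation."""
    out: Dict[str, Any] = {}
    for key, value in node.items():
        if key in CHILD_KEYS and isinstance(value, list):
            # Recurse into child units; leave non-dict entries untouched.
            out[key] = [
                drop_redundant_absolute(c) if isinstance(c, dict) else c
                for c in value
            ]
        else:
            out[key] = value
    if out.get("absoluteKeywordLocation") == out.get("keywordLocation"):
        out.pop("absoluteKeywordLocation", None)
    return out
```

Because the function copies rather than mutates, the complete output remains available to callers that want both locations on every node.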
I think you're missing the point. I'm saying that the output format should be like an assembly language. It's a standardized, low-level representation of the validation result. Starting from that assembly language, people can build whatever they need that is appropriate for their domain. Since it's impossible for us to predict every way someone might want to use the output format, I think this is the only feasible path.
This sounds like a great argument for why an id should not be optional (not necessarily with I'd love to see |
That your scenario requires an ID doesn't imply that every scenario will. Relative location has its importance in how you got to the keyword, not just where the keyword is. This is why
I'm just saying that it needs to be discussed. I'm open to discussion, and some of your points are valid. Let's please keep the tone of this conversation open. If you have structural suggestions, please share them. Also keep in mind the months of conversational background that led to the current state. There are requirements that we were dealing with that your use case may not have.
This is precisely the approach we took. What is now in the spec evolved from that. |
Why is PS. @gregsdennis in your original problem description above, can you clarify which output format you are discussing? |
The point is that the output format needs to be a common denominator for any application that someone might want to use it for. If my use case needs it, others will too. If the output format doesn't require it, my use case and many others won't be compatible with standard validators.
I'm not sure what you mean by that, but if any of what I've written has come across as rude or aggressive or inappropriate in any way, I apologize. It truly was not my intention. I know I can be stubborn, but I only persist when I'm not getting my point across. If we understand each other and still don't agree on a resolution, I'm happy to drop it.
I followed along with the conversation and I have the utmost respect for all the effort everyone put in. At the time, I wasn't in a place to contribute to that particular conversation. I'm possibly the first person to try to do something with the output other than reporting validation results. I bring up my use case not to make this about me, but to show an example of where the current output format is insufficient. I think that's valuable feedback, but if it's not helpful, I'm happy to drop it. |
This question isn't saying that I'm asking for an overhaul of the output. I think the structure of the node as well as the tree structure (of each format) of the output should be revisited. Question 1 looks at the node structure, whereas questions 2 & 3 look more at the tree structure. |
I don't think this is true in the general case. For instance, in case of a recursive schema, just knowing the absolute location (definition site) of the keyword which failed doesn't let you know at which depth of recursion it was reached. This could be perhaps inferred from the tree structure, but this seems much more error-prone (and non-serializable, meaning you can't just pass a node somewhere else in the program without losing this information). In this vein you might just as well argue that the I'd definitely leave the |
|
Throwing thoughts in:
|
I like the nomenclature of "terminal" and "non-terminal". Your second point is my first in the original post. |
In a Slack discussion with @karenetheridge, we figured out that we also need a way to identify when a Normally, it's keywords that are generating validation errors. So pointers to the error source generally end in keywords, e.g. {
"properties": {
"falseProp": false
}
} There is no keyword here. The source of the error is the @karenetheridge suggested using a pointer to the property and a message like "subschema is false." It should be fairly clear that it's the In JsonSchema.Net, I've used a special I don't particularly have strong feels on how, but this is a special case that the spec needs to clarify. |
Regarding false schemas ... In my implementation, validation is implemented as just another keyword. It therefore has a node in the validation output just like any other keyword. The top level node is a "validate" keyword and so is any sub-schema that's involved in the validation. Therefore, it is a keyword that generates the validation error even when the schema is Here's the output from the "falseProp" schema from the previous comment. Notice that the thing that fails is validating the schema but its errors array is empty because there are no keywords in that schema. {
"keyword": "https://json-schema.org/draft/2019-09/schema#validate",
"absoluteKeywordLocation": "https://json-schema.hyperjump.io/schema1#",
"instanceLocation": "#",
"valid": false,
"errors": [
{
"keyword": "https://json-schema.org/draft/2019-09/schema#properties",
"absoluteKeywordLocation": "https://json-schema.hyperjump.io/schema1#/properties",
"instanceLocation": "#",
"valid": false,
"errors": [
{
"keyword": "https://json-schema.org/draft/2019-09/schema#validate",
"absoluteKeywordLocation": "https://json-schema.hyperjump.io/schema1#/properties/falseProp",
"instanceLocation": "#/falseProp",
"valid": false,
"errors": []
}
]
}
]
} This approach is also useful with the { "not": false } {
"keyword": "https://json-schema.org/draft/2019-09/schema#validate",
"absoluteKeywordLocation": "https://json-schema.hyperjump.io/schema1#",
"instanceLocation": "#",
"valid": true,
"errors": [
{
"keyword": "https://json-schema.org/draft/2019-09/schema#not",
"absoluteKeywordLocation": "https://json-schema.hyperjump.io/schema1#/not",
"instanceLocation": "#",
"valid": true,
"errors": [
{
"keyword": "https://json-schema.org/draft/2019-09/schema#validate",
"absoluteKeywordLocation": "https://json-schema.hyperjump.io/schema1#/not",
"instanceLocation": "#",
"valid": false,
"errors": []
}
]
}
]
} |
@jdesrosiers, I think you and @karenetheridge are following the same concept regarding identifying the |
I think it's worth defining a practice for the |
Regarding schemas that are {
"properties": {
"falseProp": false
}
} The location of that schema is (using JSON pointer notation) "/properties/falseProp". Yes, "falseProp" isn't a keyword, but it's certainly the JSON location of that schema. i.e. it's the "schema container name". |
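To make the "point at the schema container" idea concrete, here is a deliberately tiny sketch that only understands `properties` whose subschemas are the boolean `false`. Everything else is ignored, so it is not a validator. The function names are hypothetical, and the "subschema is false" message is the wording suggested earlier in the thread.

```python
from typing import Any, Dict, List


def escape_pointer_token(token: str) -> str:
    """Escape a property name for use in a JSON Pointer (RFC 6901)."""
    return token.replace("~", "~0").replace("/", "~1")


def false_property_errors(schema: Dict[str, Any],
                          instance: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Report an error for each present property whose subschema is false.

    The reported location stops at the property name (the schema container),
    since a false schema has no keyword to point at.
    """
    errors = []
    for name, subschema in schema.get("properties", {}).items():
        if subschema is False and name in instance:
            token = escape_pointer_token(name)
            errors.append({
                "valid": False,
                "keywordLocation": f"/properties/{token}",
                "instanceLocation": f"/{token}",
                "error": "subschema is false",
            })
    return errors


print(false_property_errors({"properties": {"falseProp": False}},
                            {"falseProp": 1}))
```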
@ssilverman I wasn't saying that we do need it. That's what my implementation does to identify that the error is coming from the It seems that the consensus is that we merely need to report up to but not including the |
Some more food for thought: Instead of calling the error output locations "keywords", why not just use "schemaLocation" and "absoluteSchemaLocation" instead? This way, we won't get stuck on these locations actually pointing to "keywords". For example, we'd have: "schemaLocation": "/properties/falseProp",
"absoluteSchemaLocation": "scheme://host/example.json#/properties/falseProp" From: {
"properties": {
"falseProp": false
}
} I know I was stuck on restricting my output to only keywords, but in fact, it turns out that's not useful, especially for non-keywords wrapping a schema (in this example; "falseProp" isn't a keyword, yet it's useful to describe the error location). [Update] |
I suppose my solution is a middle ground. Every node in the output represents a keyword, but in the case of the pseudo-keyword "validate", the name of the keyword doesn't match the location. "falseProp" is treated like an alias for "validate". So, |
FWIW, I have a 'normalize_evaluation_result' method in my $work application that does exactly that -- the structure returned to the user in error responses uses the properties "data_location", "schema_location" and "absolute_schema_location" :) |
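A renaming pass of that kind might look roughly like the sketch below. The actual method is not shown in the thread, so the mapping from the 2019-09 property names to the three names mentioned above is an assumption, as is the function name.

```python
from typing import Any, Dict

# Assumed mapping from standard property names to the application's names.
RENAMES = {
    "instanceLocation": "data_location",
    "keywordLocation": "schema_location",
    "absoluteKeywordLocation": "absolute_schema_location",
}


def rename_locations(node: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of an output unit with location properties renamed."""
    out: Dict[str, Any] = {}
    for key, value in node.items():
        if key in ("errors", "annotations") and isinstance(value, list):
            # Child units get the same treatment.
            out[key] = [
                rename_locations(c) if isinstance(c, dict) else c
                for c in value
            ]
        else:
            out[RENAMES.get(key, key)] = value
    return out
```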
I think it's worth summarizing the tweaks we have so far (or at least what I have in my head):
Also, the algorithms to determine which nodes need to be kept for the In my implementations, it was easiest to build the full verbose output structure, then pare down using a set of rules. I had tried to summarize those rules in the spec, but I think it's a bit hard to follow outside of the context of reading my code.
* The fact that we have a special case in the behavior of |
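The "build the full verbose structure, then pare down" approach can be sketched as a single recursive pass. The rule used here (keep only failing nodes, and drop an emptied `errors` array) is a placeholder chosen for illustration. It is not the rule set from the spec or from any implementation mentioned in this thread.

```python
from typing import Any, Dict, Optional


def prune_to_failures(node: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return a copy of a verbose output tree containing only failed nodes,
    or None if the node itself passed."""
    if node.get("valid", True):
        return None
    kept = []
    for child in node.get("errors", []):
        pruned = prune_to_failures(child)
        if pruned is not None:
            kept.append(pruned)
    out = {k: v for k, v in node.items() if k != "errors"}
    if kept:
        out["errors"] = kept
    return out
```

Keeping the pruning outside the validator means the same verbose tree can feed several different reductions.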
It was suggested in Slack by @ssilverman that following the schema structure is not necessarily useful in cases where authoring is complete and the schema is being used to validate instances. He suggests that following the instance structure may be more useful in these cases. Arguably these cases are more common, since authorship ideally occurs once. The primary benefit of such a structure would be that all errors and annotations that pertain to a specific instance location would be collected together. I want to review our original decision to follow the schema rather than the instance, but I think it was motivated from a point of view of authorship and debugging of the schema itself, rather than repeatedly validating varying instances as would happen in a live system and as one might expect from an online validator. |
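One low-cost way to get the "collected by instance location" benefit is to regroup a flat list of output units after validation. The sketch below assumes each unit has an `instanceLocation` property. The function name is hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def group_by_instance_location(
    units: Iterable[Dict[str, Any]],
) -> Dict[str, List[Dict[str, Any]]]:
    """Collect output units under the instance location they pertain to."""
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for unit in units:
        location = unit["instanceLocation"]
        # The location becomes the key, so it is dropped from each entry.
        grouped[location].append(
            {k: v for k, v in unit.items() if k != "instanceLocation"}
        )
    return dict(grouped)
```

This produces a flat map rather than a hierarchy that mirrors the instance. Whether a hierarchy is needed is a separate question.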
In #1065 @jdesrosiers suggests fully identifying keywords in the output format, and I suggest doing so by using the keyword name as a fragment on the URI of the vocabulary. @gregsdennis mentioned that this should get noted here, so... noted! |
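As a sketch of what "keyword name as a fragment on the vocabulary URI" could produce, assuming the fragment is simply appended after `#` (the exact URI shape is not settled in this thread, and the function name is made up):

```python
from urllib.parse import quote


def keyword_uri(vocabulary_uri: str, keyword: str) -> str:
    """Identify a keyword by appending its name as a URI fragment."""
    # "$" is legal in a fragment, so keywords such as "$ref" stay readable.
    return f"{vocabulary_uri}#{quote(keyword, safe='$')}"


print(keyword_uri("https://json-schema.org/draft/2020-12/vocab/validation",
                  "minItems"))
```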
I'm starting to look into actually updating this section, and one of the first things I'd like to do is work out an instance-based format. This is what I've come up with so far. For convenience, this is the example in the spec: // schema
{
"$id": "https://example.com/polygon",
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$defs": {
"point": {
"type": "object",
"properties": {
"x": { "type": "number" },
"y": { "type": "number" }
},
"additionalProperties": false,
"required": [ "x", "y" ]
}
},
"type": "array",
"items": { "$ref": "#/$defs/point" },
"minItems": 3
}
// instance
[
{
"x": 2.5,
"y": 1.3
},
{
"x": 1,
"z": 6.7
}
] Errors are:
Instance-based output would look something like this: [
{ "valid": true },
{
"x": { "valid": true },
"y": {
"keywordLocation": "/items/$ref/required",
"absoluteKeywordLocation": "https://example.com/polygon#/$defs/point/required",
"error": "Required property 'y' not found."
},
"z": {
"keywordLocation": "/items/$ref/additionalProperties",
"absoluteKeywordLocation": "https://example.com/polygon#/$defs/point/additionalProperties",
"error": "Additional property 'z' found but was invalid."
}
},
{
"valid": false,
"keywordLocation": "/minItems",
"instanceLocation": "",
"error": "Expected at least 3 items but found 2"
}
] So the idea here is that any values will be replaced by the error(s). It's not shown here, but a single location can have multiple errors, and these would be collected into an array, whereas a single error doesn't necessarily need an enclosing array. Optionally the There's also an added item for I think a format like this could be quite useful to anything that uses schema to generate or validate input from forms. Thoughts? |
@gregsdennis I was working on something similar a while ago. I can't remember why, but I found trying to mimic the structure of the instance to be problematic. What I came up with instead was a simple 2D Map from location pointer to keyword to results. Given an instance location, you can easily get all the annotations that apply to that location or all the annotations for a certain keyword that apply to that location. This approach isn't fully vetted either and might have problems as well, but I'm sharing in case it's useful. Here's what your example would look like. {
"#": {
"https://json-schema.org/draft/2020-12/schema#validate": [{ "valid": false }],
"https://json-schema.org/draft/2020-12/schema#type": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#items": [{ "valid": false }],
"https://json-schema.org/draft/2020-12/schema#minItems": [{ "valid": false }]
},
"#/0": {
"https://json-schema.org/draft/2020-12/schema#validate": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#type": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#properties": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#additionalProperties": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#required": [{ "valid": true }]
},
"#/0/x": {
"https://json-schema.org/draft/2020-12/schema#validate": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#type": [{ "valid": true }]
},
"#/0/y": {
"https://json-schema.org/draft/2020-12/schema#validate": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#type": [{ "valid": true }]
},
"#/1": {
"https://json-schema.org/draft/2020-12/schema#validate": [{ "valid": false }],
"https://json-schema.org/draft/2020-12/schema#type": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#properties": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#additionalProperties": [{ "valid": false }],
"https://json-schema.org/draft/2020-12/schema#required": [{ "valid": false }]
},
"#/1/x": {
"https://json-schema.org/draft/2020-12/schema#validate": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#type": [{ "valid": true }]
},
"#/1/z": {
"https://json-schema.org/draft/2020-12/schema#validate": [{ "valid": false }]
}
} I'm using URIs to identify keywords rather than just the name to distinguish between different drafts. I've used a simple |
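For readers who want to experiment with this shape, the two-level map can be built from a flat list of results in a few lines. The input property names (`instanceLocation`, `keyword`, `valid`) and the function name are assumptions made for this sketch. They are not taken from the implementation described above.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List

LocationMap = Dict[str, Dict[str, List[Dict[str, Any]]]]


def build_location_map(results: Iterable[Dict[str, Any]]) -> LocationMap:
    """Index results by instance location, then by keyword URI."""
    table: Dict[str, Dict[str, List[Dict[str, Any]]]] = defaultdict(
        lambda: defaultdict(list)
    )
    for result in results:
        location = result["instanceLocation"]
        keyword = result["keyword"]
        table[location][keyword].append({"valid": result["valid"]})
    # Convert to plain dicts so missing keys raise instead of being created.
    return {loc: dict(by_keyword) for loc, by_keyword in table.items()}
```

Looking up `location_map["#/1"]` then gives everything recorded for that location, and adding a keyword URI as a second key narrows it to one keyword.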
I've found this, too. In mimicking any structure, we have to end up modifying it to inject the error data. It's awkward at best. I like what you have. I think the keywords as URIs is unnecessary, but I understand where you're coming from and I think we need a separate issue for it. I don't see how a single keyword can generate multiple errors for a single instance location, though. And what is the Also, this seems more of a variation on the current "basic" output, which is a flat list of errors. In conversations with people who wanted an instance-based structure when we were initially considering a formalized output, it was clear they wanted a hierarchical format, just that it followed the instance. |
I don't mean to derail the conversation with my extensions. This was not a proposal. I'm just showing what I've done. Feel free to incorporate as little or as much as you like into your proposal.
See #1065.
Not sure if you missed the explanation or it wasn't clear ...
I think you can safely ignore it. It's really useful for me, but not standard. |
Oh, no. At this point, I'm open to discussing how else we can represent errors. What we have is the best we could come up with before, and it's received.... criticism... about its usefulness in certain scenarios. I just want to address that. Keep the ideas coming.
I see your point, but this illustrates to me that maybe it's not necessary to group by keyword within this format. I'd see these two "title" keywords as separate because they're in different subschemas of the
🤦 So this is like a summary for this instance location, then? |
Not entirely. There will be a result for each schema validated against that location. For example, {
"allOf": [
{ "title": "foo" },
{ "title": "bar" }
]
} {
"#": {
"https://json-schema.org/draft/2020-12/schema#validate": [{ "valid": true }, { "valid": true }, { "valid": true }],
"https://json-schema.org/draft/2020-12/schema#allOf": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#title": [{ "valid": true }, { "valid": true }]
}
} This has three |
I don't understand the purpose of having that. It seems like a lot of repeated information. |
Nothing is repeated, it's just really thorough. Here's a slightly more useful example. {
"allOf": [
{ "type": "number" },
{ "maximum": 5 }
]
} 6 {
"#": {
"https://json-schema.org/draft/2020-12/schema#validate": [{ "valid": false }, { "valid": true }, { "valid": false }],
"https://json-schema.org/draft/2020-12/schema#allOf": [{ "valid": false }],
"https://json-schema.org/draft/2020-12/schema#type": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#maximum": [{ "valid": false }]
}
}
Here's an example of how I was finding the {
"type": "object",
"properties": {
"foo": { "type": "string" }
}
} { "foo": "bar" } {
"#": {
"https://json-schema.org/draft/2020-12/schema#validate": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#type": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#properties": [{ "valid": true }]
},
"#/foo": {
"https://json-schema.org/draft/2020-12/schema#validate": [{ "valid": true }],
"https://json-schema.org/draft/2020-12/schema#type": [{ "valid": true }]
}
} We know the instance is valid. If we then modify the "foo" property to "baz" and want to revalidate, it would be nice if we didn't have to revalidate everything. We just want to revalidate what has changed. I can use the results to identify the sub-schema to run validation against the new value ("baz"). |
Hey everyone. I've been reading through all of the comments in this and linked issues, compiling a list of topics. It seems that we have consensus on these things:
These things still require decisions (but I've proposed some things):
And this isn't directly about output, but supports it:
Annotations
Annotations need to be part of the node, not rendered as child validation nodes. I don't think there's any argument here. I've been thinking around this for a couple hours now, and it seems we have three orthogonal attributes that we need to consider.
1. Does the keyword provide validation?
Not all keywords provide validation. For example, the spec says nothing about validation for My implementation handles this by always passing validation for these keywords, which works to provide the correct pass/fail outcome, but it also means that they produce an output node. So I get something like this as a node: {
"valid": true,
"keywordLocation": "#/description",
"schemaLocation": "https://example.com/mySchema#/description",
"instanceLocation": "",
"annotation": "description"
}
Question: should pure-annotation keywords produce an output node or simply be listed in parent nodes? I can see arguments for both sides, and I think I have a pretty easy tweak to my implementation that would hide these, so I'm not invested either way.
2./3. Does the keyword produce and/or collect annotations?
These are actually two different ideas, but I think I often conflate them, so I'm going to consider them distinct here. A keyword can produce annotations, and it can propagate annotations from its children. Most keywords do one or the other, but some do both.
Regardless of how an implementation manages annotations internally, I believe it's worthwhile keeping these distinctions separate when reporting them to the user. It can be useful to know whether a keyword produced an annotation or is just passing a message. I'd like to propose two more output node properties to cover each of these ideas:
The downside to listing collected annotations is duplication of annotation values. But on the upside, you can:
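To illustrate the produced-versus-collected distinction, here is a minimal sketch that gathers the annotations produced by passing descendants of a node, keyed by their schema location, and stops descending wherever a subschema failed. It uses the property names `nested`, `annotation`, and `schemaLocation` from this proposal. The function name is hypothetical, and collecting from sibling keywords is deliberately left out.

```python
from typing import Any, Dict


def collect_annotations(node: Dict[str, Any]) -> Dict[str, Any]:
    """Gather annotations produced by passing descendants of a node."""
    collected: Dict[str, Any] = {}
    for child in node.get("nested", []):
        if not child.get("valid", False):
            # Annotations under a failed subschema are dropped.
            continue
        if "annotation" in child and "schemaLocation" in child:
            collected[child["schemaLocation"]] = child["annotation"]
        collected.update(collect_annotations(child))
    return collected
```

A node's own `annotation` is never copied into its own collected set, which keeps "produced here" and "passed along from below" distinguishable in the output.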
Examples for existing output formats
These examples take the spec scenario and show what the output would look like if we changed all of this, including my suggestions on the open discussion topics. For reference, here's the spec scenario:
// schema
{
"$id": "https://example.com/polygon",
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$defs": {
"point": {
"type": "object",
"properties": {
"x": { "type": "number", "description": "x coordinate" },
"y": { "type": "number", "description": "y coordinate" }
},
"additionalProperties": false,
"required": [ "x", "y" ]
}
},
"type": "array",
"items": { "$ref": "#/$defs/point" },
"minItems": 3
}
// instance
[
{
"x": 2.5,
"y": 1.3
},
{
"x": 1,
"z": 6.7
}
]
Flag (no change) 🎉
{ "valid": false }
Basic
{
"valid": false,
"validationPath": "",
"schemaLocation": "https://example.com/polygon#",
"instanceLocation": "",
"nested": [
{
"valid": false,
"keyword": "required",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/required",
"schemaLocation": "https://example.com/polygon#/$defs/point/required",
"instanceLocation": "/1",
"error": "Required property 'y' not found."
},
{
"valid": false,
"keyword": "additionalProperties",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items/$ref/additionalProperties",
"schemaLocation": "https://example.com/polygon#/$defs/point/additionalProperties",
"instanceLocation": "/1/z",
"error": "Additional property 'z' found but was invalid."
},
{
"valid": false,
"keyword": "minItems",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/minItems",
"schemaLocation": "https://example.com/polygon#/minItems",
"instanceLocation": "",
"error": "Expected at least 3 items but found 2"
}
]
}
(I never really liked having a top-level node here, but I don't know how else to collect the nodes. I do like consistently having an object at the root, as opposed to an array.)
Detailed (schema-based)
{
"valid": false,
"validationPath": "",
"schemaLocation": "https://example.com/polygon#",
"instanceLocation": "",
"nested": [
{
"valid": false,
"keyword": "$ref",
"dialect": "https://json-schema.org/draft/2020-12/vocab/core",
"validationPath": "/items/$ref",
"schemaLocation": "https://example.com/polygon#/$defs/point",
"instanceLocation": "/1",
"nested": [
{
"valid": false,
"keyword": "required",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/required",
"schemaLocation": "https://example.com/polygon#/$defs/point/required",
"instanceLocation": "/1",
"error": "Required property 'y' not found."
},
{
"valid": false,
"keyword": "additionalProperties",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items/$ref/additionalProperties",
"schemaLocation": "https://example.com/polygon#/$defs/point/additionalProperties",
"instanceLocation": "/1/z",
"error": "Additional property 'z' found but was invalid."
}
]
},
{
"valid": false,
"keyword": "minItems",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/minItems",
"schemaLocation": "https://example.com/polygon#/minItems",
"instanceLocation": "",
"error": "Expected at least 3 items but found 2"
}
]
}
Verbose (schema-structured) (this one's pretty big)
{
"result": {
"valid": false,
"validationPath": "",
"schemaLocation": "https://example.com/polygon#",
"instanceLocation": "",
"nested": [
{
"valid": true,
"keyword": "type",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/type",
"schemaLocation": "https://example.com/polygon#/type",
"instanceLocation": ""
},
{
"valid": false,
"keyword": "minItems",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/minItems",
"schemaLocation": "https://example.com/polygon#/minItems",
"instanceLocation": "",
"error": "Value has fewer than 3 items"
},
{
"valid": false,
"keyword": "items",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items",
"schemaLocation": "https://example.com/polygon#/items",
"instanceLocation": "",
"nested": [
{
"valid": true,
"keyword": "items",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items",
"schemaLocation": "https://example.com/polygon#/items",
"instanceLocation": "/0",
"nested": [
{
"valid": true,
"keyword": "$ref",
"dialect": "https://json-schema.org/draft/2020-12/vocab/core",
"validationPath": "/items/$ref",
"schemaLocation": "https://example.com/polygon#/items/$ref",
"instanceLocation": "/0",
"nested": [
{
"valid": true,
"validationPath": "/items/$ref",
"schemaLocation": "https://example.com/polygon#/$defs/point",
"instanceLocation": "/0",
"nested": [
{
"valid": true,
"keyword": "type",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/type",
"schemaLocation": "https://example.com/polygon#/$defs/point/type",
"instanceLocation": "/0"
},
{
"valid": true,
"keyword": "properties",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items/$ref/properties",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties",
"instanceLocation": "/0",
"annotation": [
"x",
"y"
],
"collectedAnnotations": {
"https://example.com/polygon#/$defs/point/properties/x/description": "x coordinate",
"https://example.com/polygon#/$defs/point/properties/y/description": "y coordinate"
},
"nested": [
{
"valid": true,
"validationPath": "/items/$ref/properties/x",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/x",
"instanceLocation": "/0/x",
"nested": [
{
"valid": true,
"keyword": "type",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/properties/x/type",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/x/type",
"instanceLocation": "/0/x"
},
{
"valid": true,
"keyword": "description",
"dialect": "https://json-schema.org/draft/2020-12/vocab/meta-data",
"validationPath": "/items/$ref/properties/x/description",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/x/description",
"instanceLocation": "/0/x",
"annotation": "x coordinate"
}
]
},
{
"valid": true,
"validationPath": "/items/$ref/properties/y",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/y",
"instanceLocation": "/0/y",
"nested": [
{
"valid": true,
"keyword": "type",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/properties/y/type",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/y/type",
"instanceLocation": "/0/y"
},
{
"valid": true,
"keyword": "description",
"dialect": "https://json-schema.org/draft/2020-12/vocab/meta-data",
"validationPath": "/items/$ref/properties/y/description",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/y/description",
"instanceLocation": "/0/y",
"annotation": "y coordinate"
}
]
}
]
},
{
"valid": true,
"keyword": "additionalProperties",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items/$ref/additionalProperties",
"schemaLocation": "https://example.com/polygon#/$defs/point/additionalProperties",
"instanceLocation": "/0",
"annotation": [],
"collectedAnnotations": {
"https://example.com/polygon#/$defs/point/properties": [
"x",
"y"
]
}
},
{
"valid": true,
"keyword": "required",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/required",
"schemaLocation": "https://example.com/polygon#/$defs/point/required",
"instanceLocation": "/0",
"collectedAnnotations": {
"https://example.com/polygon#/$defs/point/properties": [
"x",
"y"
],
"https://example.com/polygon#/$defs/point/additionalProperties": []
}
}
]
}
]
}
]
},
{
"valid": false,
"keyword": "items",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items",
"instanceLocation": "/1",
"nested": [
{
"valid": false,
"keyword": "$ref",
"dialect": "https://json-schema.org/draft/2020-12/vocab/core",
"validationPath": "/items/$ref",
"schemaLocation": "https://example.com/polygon#/items/$ref",
"instanceLocation": "/1",
"nested": [
{
"valid": false,
"validationPath": "/items/$ref",
"schemaLocation": "https://example.com/polygon#/$defs/point",
"instanceLocation": "/1",
"nested": [
{
"valid": true,
"keyword": "type",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/type",
"schemaLocation": "https://example.com/polygon#/$defs/point/type",
"instanceLocation": "/1"
},
{
"valid": true,
"keyword": "properties",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items/$ref/properties",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties",
"instanceLocation": "/1",
"annotation": [
"x"
],
"collectedAnnotations": {
"https://example.com/polygon#/$defs/point/properties/x/description": "x coordinate"
},
"nested": [
{
"valid": true,
"validationPath": "/items/$ref/properties/x",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/x",
"instanceLocation": "/1/x",
"nested": [
{
"valid": true,
"keyword": "type",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/properties/x/type",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/x/type",
"instanceLocation": "/1/x"
},
{
"valid": true,
"keyword": "description",
"dialect": "https://json-schema.org/draft/2020-12/vocab/meta-data",
"validationPath": "/items/$ref/properties/x/description",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/x/description",
"instanceLocation": "/1/x",
"annotation": "x coordinate"
}
]
}
]
},
{
"valid": false,
"keyword": "additionalProperties",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items/$ref/additionalProperties",
"schemaLocation": "https://example.com/polygon#/$defs/point/additionalProperties",
"instanceLocation": "/1",
"nested": [
{
"valid": false,
"validationPath": "/items/$ref/additionalProperties/$false",
"schemaLocation": "https://example.com/polygon#/$defs/point/additionalProperties/$false",
"instanceLocation": "/1/z",
"error": "All values fail against the false schema"
}
]
},
{
"valid": false,
"keyword": "required",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/required",
"schemaLocation": "https://example.com/polygon#/$defs/point/required",
"instanceLocation": "/1",
"error": "Required properties [y] were not present"
}
]
}
]
}
]
}
]
}
]
}
}

Note the inclusion of the annotations from the subschemas which passed validation and how they're no longer included once we move up through a subschema that failed validation.

Instance-based structure examples

I think I've been able to marry our two approaches. Considerations:
I do need to add a

Verbose (instance-based)

{
"valid": false,
"validationPath": "",
"schemaLocation": "https://example.com/polygon#",
"instanceLocation": "",
"results": [
{
"valid": true,
"keyword": "type",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "#/type",
"schemaLocation": "https://example.com/polygon#/type"
},
{
"valid": false,
"keyword": "minItems",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "#/minItems",
"error": "Value has fewer than 3 items"
},
{
"valid": false,
"keyword": "items",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "#/items"
}
],
"nested": {
"/0": {
"valid": true,
"results": [
{
"valid": true,
"keyword": "items",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items",
"schemaLocation": "https://example.com/polygon#/items"
},
{
"valid": true,
"keyword": "$ref",
"dialect": "https://json-schema.org/draft/2020-12/vocab/core",
"validationPath": "/items/$ref",
"schemaLocation": "https://example.com/polygon#/items/$ref"
},
{
"valid": true,
"keyword": "type",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/type",
"schemaLocation": "https://example.com/polygon#/$defs/point/type"
},
{
"valid": true,
"keyword": "properties",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items/$ref/properties",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties",
"annotation": [
"x",
"y"
],
"collectedAnnotations": {
"https://example.com/polygon#/$defs/point/properties/x/description": "x coordinate",
"https://example.com/polygon#/$defs/point/properties/y/description": "y coordinate"
}
},
{
"valid": true,
"keyword": "additionalProperties",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items/$ref/additionalProperties",
"schemaLocation": "https://example.com/polygon#/$defs/point/additionalProperties",
"annotation": [],
"collectedAnnotations": {
"https://example.com/polygon#/$defs/point/properties": [
"x",
"y"
]
}
},
{
"valid": true,
"keyword": "required",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/required",
"schemaLocation": "https://example.com/polygon#/$defs/point/required",
"collectedAnnotations": {
"https://example.com/polygon#/$defs/point/properties": [
"x",
"y"
],
"https://example.com/polygon#/$defs/point/additionalProperties": []
}
}
],
"nested": {
"/0/x": {
"valid": true,
"results": [
{
"valid": true,
"validationPath": "/items/$ref/properties/x",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/x"
},
{
"valid": true,
"keyword": "type",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/properties/x/type",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/x/type"
},
{
"valid": true,
"keyword": "description",
"dialect": "https://json-schema.org/draft/2020-12/vocab/meta-data",
"validationPath": "/items/$ref/properties/x/description",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/x/description",
"annotation": "x coordinate"
}
]
},
"/0/y": {
"valid": true,
"results": [
{
"valid": true,
"validationPath": "/items/$ref/properties/y",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/y"
},
{
"valid": true,
"keyword": "type",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/properties/y/type",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/y/type"
},
{
"valid": true,
"keyword": "description",
"dialect": "https://json-schema.org/draft/2020-12/vocab/meta-data",
"validationPath": "/items/$ref/properties/y/description",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/y/description",
"annotation": "y coordinate"
}
]
}
}
},
"/1": {
"valid": false,
"results": [
{
"valid": false,
"keyword": "items",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items"
},
{
"valid": false,
"keyword": "$ref",
"dialect": "https://json-schema.org/draft/2020-12/vocab/core",
"validationPath": "/items/$ref",
"schemaLocation": "https://example.com/polygon#/items/$ref"
},
{
"valid": false,
"validationPath": "/items/$ref",
"schemaLocation": "https://example.com/polygon#/$defs/point"
},
{
"valid": true,
"keyword": "type",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/type",
"schemaLocation": "https://example.com/polygon#/$defs/point/type"
},
{
"valid": true,
"keyword": "properties",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items/$ref/properties",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties",
"annotation": [
"x"
],
"collectedAnnotations": {
"https://example.com/polygon#/$defs/point/properties/x/description": "x coordinate"
}
},
{
"valid": false,
"keyword": "additionalProperties",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items/$ref/additionalProperties",
"schemaLocation": "https://example.com/polygon#/$defs/point/additionalProperties"
},
{
"valid": false,
"keyword": "required",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/required",
"schemaLocation": "https://example.com/polygon#/$defs/point/required",
"error": "Required properties [y] were not present"
}
],
"nested": {
"/1/x": {
"valid": true,
"results": [
{
"valid": true,
"validationPath": "/items/$ref/properties/x",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/x"
},
{
"valid": true,
"keyword": "type",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/properties/x/type",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/x/type"
},
{
"valid": true,
"keyword": "description",
"dialect": "https://json-schema.org/draft/2020-12/vocab/meta-data",
"validationPath": "/items/$ref/properties/x/description",
"schemaLocation": "https://example.com/polygon#/$defs/point/properties/x/description",
"annotation": "x coordinate"
}
]
},
"/1/z": {
"valid": false,
"results": [
{
"valid": false,
"validationPath": "/items/$ref/additionalProperties/$false",
"schemaLocation": "https://example.com/polygon#/$defs/point/additionalProperties/$false",
"error": "All values fail against the false schema"
}
]
}
}
}
}
}

Detailed (instance-based)

Initially I was only considering the verbose format for instance-based, but in building that example, I realized how to pare down the fluff. The rules I'm following for this are:
This second rule removes pass-through nodes like the kind you'd get having just resolved a

{
"valid": false,
"validationPath": "",
"schemaLocation": "https://example.com/polygon#",
"instanceLocation": "",
"results": [
{
"valid": false,
"keyword": "minItems",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "#/minItems",
"error": "Value has fewer than 3 items"
}
],
"nested": {
"/1": {
"valid": false,
"results": [
{
"valid": false,
"keyword": "required",
"dialect": "https://json-schema.org/draft/2020-12/vocab/validation",
"validationPath": "/items/$ref/required",
"schemaLocation": "https://example.com/polygon#/$defs/point/required",
"error": "Required properties [y] were not present"
}
],
"nested": {
"/1/z": {
"valid": false,
"results": [
{
"valid": false,
"keyword": "additionalProperties",
"dialect": "https://json-schema.org/draft/2020-12/vocab/applicator",
"validationPath": "/items/$ref/additionalProperties",
"schemaLocation": "https://example.com/polygon#/$defs/point/additionalProperties/$false",
"error": "All values fail against the false schema"
}
]
}
}
}
}
} |
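To make the relationship between the hierarchical and instance-based examples concrete, here is a minimal Python sketch of one way to regroup a hierarchical result tree by instance location. It is only an illustration under stated assumptions, not the algorithm used to produce the examples above: the function name `group_by_instance` is hypothetical, every unit is assumed to carry `valid` and `instanceLocation`, child units are assumed to sit in a `nested` array, and a location is treated as valid only if all of its results are valid. Unlike the example above, the root unit's own fields are placed in `results` rather than at the top level.

```python
from typing import Any


def group_by_instance(root: dict[str, Any]) -> dict[str, Any]:
    """Regroup a hierarchical result tree by instance location (illustrative only)."""
    groups: dict[str, list[dict[str, Any]]] = {}

    def visit(unit: dict[str, Any]) -> None:
        # Copy every field except the child list and the location, which becomes the key.
        result = {k: v for k, v in unit.items() if k not in ("nested", "instanceLocation")}
        groups.setdefault(unit["instanceLocation"], []).append(result)
        for child in unit.get("nested", []):
            visit(child)

    visit(root)

    # Make sure every ancestor location exists so the tree can be linked up.
    for location in list(groups):
        while location:
            location = location.rsplit("/", 1)[0]
            groups.setdefault(location, [])

    # Assumption: a location is valid only if every result recorded for it is valid.
    nodes = {
        location: {"valid": all(r["valid"] for r in results), "results": results}
        for location, results in groups.items()
    }
    # Attach each location to its parent, keyed by the full instance pointer.
    for location, node in nodes.items():
        if location:
            parent = nodes[location.rsplit("/", 1)[0]]
            parent.setdefault("nested", {})[location] = node
    return nodes[""]
```

Keying `nested` by the full instance pointer mirrors the `"/1"` and `"/1/z"` keys in the example; whether that or relative keys is preferable is left open here.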
@gregsdennis That's a lot! I'm going to avoid pulling on too many threads at once. So, I'll start with the consensus list. Here are a couple of notes. For any items I don't mention, you can assume I'm in agreement.
I don't know what this means. Can you provide a little more explanation?
I don't think |
Fair point. I can go with that. I'll update above. |
Working branch is draft-next-output. |
Unless I'm missing some contradicting wording somewhere else,
|
🤔 yeah, I see what you're getting at. I'll do some digging to see where we decided that. Do you have an opinion on this requirement? |
Looks like the most relevant conversations are: |
I don't have a strong opinion. I'd suggest that implementations MUST support |
Given that I'm expanding I do like requiring |
I agree that we probably shouldn't prefer one over the other. So, I'd suggest: MUST support |
I haven't had a chance to go over all of this in detail and respond to all the points that I want to respond to -- but I wonder if it might be worthwhile moving to a community discussion so we can have threaded conversations about the various points? Many of them are not directly related, and some require a deep dive to understand motivations and to examine tradeoffs. (We might need an ADR for these, too!) |
Sounds good. I'll copy this to a discussion, making edits for the conversation @jdesrosiers and I have had so far. We can create threads for each item. |
We don't have GH discussions on in this repo so I opened one in the Community repo with the General topic: json-schema-org/community#63 |
I think this is basically resolved at this point. Closing for now. |
While writing Section 10 on output, I was implementing the formatting in my library, Manatee.Json. In that library, I don't collect annotations in the same way as described in the spec. As a result the annotation output of a successful validation was not fully explored.
I have recently been working on a new validator that does follow the recommendations of the spec in regard to annotation collection and have reached the same questions that many other implementors have asked me regarding annotations in the output format, to which I naively responded that annotation output is analogous to error output.
I aim to fix this. This issue is a compilation of questions and issues I've experienced trying to reimplement output formatting as well as potential solutions to some of them.
1. Annotations vs Nested Results
With the current output, annotations are to be represented as their own output unit within the `annotations` collection, including the `valid` property and all JSON pointers indicating where the annotation came from and to what part of the instance it applies. Additionally, any nested results were also to be in this array. This conflates annotations with nested results and the only way to identify which were annotations and which were nested results was via the location properties.

To resolve this, we should split these collections. Nested results should remain as they are, but the property name needs to change.

The proposed fix adds (or hijacks, rather) a single property, `annotations`, to the output unit. It's only present for `valid = true` and when there are annotations (no empty arrays). The value is an array of all annotations for this node.

For instance, `properties` requires that it adds the names of any validated properties as annotations. To this end, the `annotations` on the `properties` node will be an array of these property names.

I'm not sure how this fits in with collecting unknown keywords as annotations since that would effectively require a node for each unknown keyword, but maybe that's fine (the location pointer would indicate the keyword, and the annotation value would be the unknown keyword's value).

This idea also may invert the intent behind dropping combining annotations, but I haven't thought through that completely, yet.

This is also in line with @ssilverman's comment regarding how his implementation handles annotations internally. (My new validator uses a similar mechanism.)
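
As a rough illustration of the split proposed in this section, the following Python sketch builds a single output unit in which `annotations` holds only annotation values and child results live under a separate property. It is a sketch under assumptions rather than spec text: the helper name `make_output_unit` is hypothetical, `keywordLocation` and `instanceLocation` are borrowed from the 2019-09 output format, and `nested` is used as a placeholder name for the child results.

```python
from typing import Any, Optional


def make_output_unit(
    valid: bool,
    keyword_location: str,
    instance_location: str,
    annotations: Optional[list[Any]] = None,
    nested: Optional[list[dict[str, Any]]] = None,
) -> dict[str, Any]:
    """Build one output unit with annotations kept apart from nested results."""
    unit: dict[str, Any] = {
        "valid": valid,
        "keywordLocation": keyword_location,
        "instanceLocation": instance_location,
    }
    # Annotations are only reported for passing nodes, and never as an empty array.
    if valid and annotations:
        unit["annotations"] = list(annotations)
    # Child results go under their own property ("nested" is a placeholder name).
    if nested:
        unit["nested"] = list(nested)
    return unit
```

For a passing `properties` node that validated `x` and `y`, `make_output_unit(True, "/properties", "", annotations=["x", "y"])` would carry `"annotations": ["x", "y"]` and no `nested` key.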
2. Nested Nodes
Having the nested result nodes under `annotations` for validated results and `errors` for invalidated results was, admittedly, me trying to be clever about things. Practically, it's difficult to deal with.

The proposal here is to have these always under a consistent name, e.g. `nested`, `results`, etc. I'm not too fussed about what the name is, but it should always be that regardless of the value of `valid`.

3. Too Many ~~Secrets~~ Nodes

In trying to pare down the list of nodes for the `detailed` and `basic` formats, I found that a lot of nodes were kept that may not be necessary.

For example, in an error scenario, is it necessary to list passing subschemas of an `allOf`? To a person just trying to figure out what's wrong, these are just noise to be filtered out in order to pinpoint the one subschema that failed.

The algorithm for reducing returned values could be adjusted, and maybe needs to be dynamic based on the scenario. Maybe if a schema is valid, I don't care about any of the details and I just want a ✔️. But for failures I want to know only those nodes that failed, but I want detailed info about them, and maybe in a format that mimics the schema (as `detailed` is intended to do). The current formats (and the algorithms to generate them) do not support this.

(Fake internet points for anyone who gets my reference in the title of this section.)
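
One possible reading of the scenario-dependent reduction described above can be sketched in Python as follows. This is an assumption-laden illustration, not an algorithm from the spec: the function names `prune_for_errors` and `summarize` are hypothetical, and child results are assumed to live in a `nested` array on each unit. A passing result collapses to a bare `{"valid": true}`, while a failing result keeps only its failing nodes along with all of their details.

```python
from typing import Any, Optional


def prune_for_errors(unit: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return a copy of `unit` without passing nodes, or None if `unit` itself passed."""
    if unit.get("valid", False):
        return None
    # Keep every detail of the failing node except its child list, which is rebuilt below.
    pruned = {key: value for key, value in unit.items() if key != "nested"}
    kept = []
    for child in unit.get("nested", []):
        pruned_child = prune_for_errors(child)
        if pruned_child is not None:
            kept.append(pruned_child)
    if kept:
        pruned["nested"] = kept
    return pruned


def summarize(unit: dict[str, Any]) -> dict[str, Any]:
    """Collapse passing results to a flag; keep only failing nodes otherwise."""
    pruned = prune_for_errors(unit)
    return {"valid": True} if pruned is None else pruned
```

Because the input is copied rather than mutated, a tool could still hand the full output to a different consumer (for example, a visualizer) that wants the passing nodes.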