Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

resolve ambiguity regarding annotation inclusion in failed validation results #939

Closed
karenetheridge opened this issue May 24, 2020 · 29 comments
Assignees
Labels

Comments

@karenetheridge
Copy link
Member

karenetheridge commented May 24, 2020

The spec says (https://json-schema.org/draft/2019-09/json-schema-core.html#rfc.section.7.7.1.2):

Schema objects that produce a false assertion result MUST NOT produce any annotation results, whether from their own keywords or from keywords in subschemas.

However, this is contradicted by other parts of the spec
(https://json-schema.org/draft/2019-09/json-schema-core.html#rfc.section.10.4.4):

This includes sub-schema validation results that would otherwise be removed (e.g. annotations for failed validations, successful validations inside a not keyword, etc.).

So, it would appear there are cases where annotations should be included in a result even if the
overall validation result is false. (e.g. if one branch of an allOf was successful and produced annotations, but another branch failed, this means that the subschema at allOf fails, so it should discard all annotations collected below that point and simply return its valid: "failed" and errors: [..] results back up to the next-higher node. But in the end, annotations are still not returned in the final result.

This (section 10.4.4) also implies that an implementation cannot simply discard
all collected annotations at a schema position where validation fails: it must continue to collect
them in case the "verbose" output method is requested.


Now, another usecase:

I have a desire to include annotations in failed validation result, so as to provide more
information as to the nature of the failure (or perhaps offer extra guidance as to how to make
validation succeed).·

(cross-ref: https://github.com/karenetheridge/JSON-Schema-Draft201909/issues/22)

I can achieve this with a simple annotation-only keyword (e.g. "description", or a new one that acts
like "description" by returning data at the position it is evaluated), except for this item of the
spec that says that annotations are discarded on failed results.

Is there a way through this? At the very least, it seems as if there is some contradictory language
in the spec that could be clarified.

@ssilverman
Copy link
Member

In my validator, I separate the concept of "collected annotations" from "error collection". I think the "annotations" referred to by section 10.4.4 are in the second, "error collection" set, where there are both "error" and "annotation" annotations representing "false" and "true" results, respectively. At least that's how I'm reading this part.

I agree the language could use work.

@handrews
Copy link
Contributor

My intention was that failed validation always drops annotations, which means you can abandon processing once a schema object fails validation, and ignore its subschemas.

I did not write section 10, so you'll have to ask @gregsdennis about it (if I was involved in that point, I have since forgotten about it).

It's not feasible to fully specify annotations that might-have-been because you get into very complex questions about conditional schema application and exactly how much needs to be faked in order to find might-have-been annotations. That's one reason I didn't require it, in addition to the performance issues.

@handrews
Copy link
Contributor

Section 7.7.2.1 paragraph one is correct as written, any change to the behavior it requires it would require substantial discussion beyond just harmonizing wording.

@karenetheridge
Copy link
Member Author

karenetheridge commented May 25, 2020

It's not feasible to fully specify annotations that might-have-been because...

completely agreed! I think this means that the "verbose" output format might need to be reworked, or moved into the "not-required-but-some-implementations-may-support-it" category.

edit: #679 (comment) shows that at one point the core team did grasp what was going on here and did not see any contradictions in it :) Perhaps there is no issue here that needs resolving. I hope to determine that for myself soon when I get to implementing the "detailed" and "verbose" result formats.

@Relequestual
Copy link
Member

Looking at 10.4.4, I don't see any examples of annotations being passed through to output.
It's not unreasonable to expect annotations of schemas that were applied but failed to be collected for errors.
I'd think of this as "error info collection" as opposed to "annotation collection", which is clear cannot happen for failed validation (and makes a lot of sense).

In terms of what @handrews said...

It's not feasible to fully specify annotations that might-have-been because you get into very complex questions about conditional schema application

Yeah, I believe the verbose error reporting can ONLY produce error results for schema parts which were actually APPLIED to the instance.

@ssilverman Can I see some verbose output examples from you when you have failing subschemas with annotations please? Also for allOf.

I haven't poked at @jdesrosiers hyperjump, but the web based version does fast failing, so you only get the first failure.

@ssilverman
Copy link
Member

ssilverman commented Aug 1, 2020

@Relequestual I'm not completely clear what the request is, so I created the following example to start. Let's start here and modify until it's what you'd like to see.

Schema:

{
  "$id": "http://github.com/json-schema-org/json-schema-spec/issues/939",
  "allOf": [
    {
      "not": true,
      "title": "The Title"
    }
  ]
}

Instance:

{

}

Basic errors:

{
    "valid": false,
    "errors": [
        {
            "keywordLocation": "",
            "absoluteKeywordLocation": "https://github.com/json-schema-org/json-schema-spec/issues/939",
            "instanceLocation": "",
            "error": "schema didn't validate"
        },
        {
            "keywordLocation": "/allOf",
            "absoluteKeywordLocation": "https://github.com/json-schema-org/json-schema-spec/issues/939#/allOf",
            "instanceLocation": "",
            "error": "invalid subschemas: 0"
        },
        {
            "keywordLocation": "/allOf/0/not",
            "absoluteKeywordLocation": "https://github.com/json-schema-org/json-schema-spec/issues/939#/allOf/0/not",
            "instanceLocation": "",
            "error": "\"not\" didn't validate"
        }
    ]
}

Annotations when not collecting annotations for failed schemas:

{
    "annotations": []
}

Annotations when collecting annotations for failed schemas (note the "valid=false"):

{
    "annotations": [
        {
            "instanceLocation": "",
            "keywordLocation": "/allOf/0/title",
            "absoluteKeywordLocation": "https://github.com/json-schema-org/json-schema-spec/issues/939#/allOf/0/title",
            "annotation": {
                "name": "title",
                "valid": false,
                "value": "The Title"
            }
        }
    ]
}

@jdesrosiers
Copy link
Member

Hyperjump JSV doesn't collect annotations, so I'm not going to be any help.

@Relequestual
Copy link
Member

Thanks @ssilverman
I think what I was implying was, I want to see what you expect to happen when there are multiple branches of the allOf.

Regardless, I've considered...

@karenetheridge Do you think there's any merit in collecting non-annotations (I'm calling them non-annotations because they are never attached to an instance location) for failed validation processes?

Given the failures provide a schema location, anyone wanting such non-annotations could look up in the schema to see what they are (based on the schema location attribute).

If you agree there's no point, then I'm going to suggest we remove the part where it talks about "annotations for failed validations", unless @gregsdennis pipes up to explain what this meant...

@gregsdennis
Copy link
Member

gregsdennis commented Aug 5, 2020

That verbiage only pertains to the Verbose output format. The intent was that, if the client is requesting verbose, they probably want as much info as possible, so we give them everything, even annotations that would otherwise be removed.

Maybe this can be an implementation prerogative. If so, we should update the text as such.

@gregsdennis
Copy link
Member

[@handrews] very complex questions about conditional schema application and exactly how much needs to be faked in order to find might-have-been annotations.

From an implementor's point of view, collecting annotations on a node you're already evaluating is trivial, unless you have shortcutting logic to quickly fail. But if they're requesting verbose output, performance is an unlikely concern as the output content take priority.

@Relequestual
Copy link
Member

OK. I think we're getting there.

@ssilverman I think in your example, the validproperty was supposed to be on the node object as opposed to the annotation object? (up a level).


Addressing...

[@handrews] It's not feasible to fully specify annotations that might-have-been because you get into very complex questions about conditional schema application and exactly how much needs to be faked in order to find might-have-been annotations.

I wonder @handrews if you're thinking about subschemas which are not actually applied as part of processing? Obviously that would be insane. It's impractical to suggest that the verbose ouput somehow magically works out how to apply all subschemas (which might not even be possible with the instance provided).

I think, as @gregsdennis says here, if the user is wanting verbose output, the annotation collection would be enhanced, and collected regardless of the validation result, but STILL only based on applicability, and without short circiting validation (because verbose).

@gregsdennis @handrews Are we on the same page here?


I do have a follow up though...

It's actually a bit of a sidetrack... but I'd like to keep it here anyway Given a schema and instance such as...
{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "not": {
    "allOf":[
      {
        "title": "Something",
        "required": ["a", "b"]
        },{
          "title": "Something Else",
          "required": ["c", "d"]
      }
    ]
  }
}

and instance...

{
  "c": 1,
  "d": 1
}

How does the verbose result look?

The subschema allOf[1] is valid, but overall the schema applied to the instance is not valid.

@ssilverman
Copy link
Member

ssilverman commented Aug 5, 2020

@Relequestual The "annotations" output I have there is just my own thing and I chose to use "valid" to indicate whether the annotation was the result of a valid collection or the result of a failed validation (there's an option in my API to collect them). That "valid" property you mention is for the various "errors" outputs. (Mostly because my understanding is that all the output formats only show "errors" and not "annotations". Please correct me if I'm wrong; there's no examples to the contrary in the spec.)

@Relequestual
Copy link
Member

@ssilverman The verbose example at https://json-schema.org/draft/2019-09/json-schema-core.html#rfc.section.10.4.4

I hadn't considered that you'd collect annotations separate from the verbose output structure.

I'd have expected it as part of the verbose output structure as with successful validation, as allowed per

"annotations": {

@ssilverman
Copy link
Member

ssilverman commented Aug 5, 2020

@Relequestual Here's what I get for the basic output:

{
    "valid": true,
    "errors": []
}

@gregsdennis Something I just realized I've been doing for basic output is excluding all failing errors if the schema passes. Is this correct? There needs to be lots more examples in the schema spec for all possible cases.

Output with everything included, including "pruned" and non-failing errors (I'm just dumping all the errors, I didn't feel like creating the tree structure):

{
    "valid": true,
    "errors": [
        {
            "keywordLocation": "",
            "absoluteKeywordLocation": "file:schema-ann2.json",
            "instanceLocation": "",
            "valid": true
        },
        {
            "keywordLocation": "/$schema",
            "absoluteKeywordLocation": "file:schema-ann2.json#/$schema",
            "instanceLocation": "",
            "valid": true
        },
        {
            "keywordLocation": "/not",
            "absoluteKeywordLocation": "file:schema-ann2.json#/not",
            "instanceLocation": "",
            "valid": true
        },
        {
            "keywordLocation": "/not/allOf",
            "absoluteKeywordLocation": "file:schema-ann2.json#/not/allOf",
            "instanceLocation": "",
            "error": "invalid subschemas: 0",
            "valid": false,
            "pruned": true
        },
        {
            "keywordLocation": "/not/allOf/0/required",
            "absoluteKeywordLocation": "file:schema-ann2.json#/not/allOf/0/required",
            "instanceLocation": "",
            "error": "required properties not found: \"a\", \"b\"",
            "valid": false,
            "pruned": true
        },
        {
            "keywordLocation": "/not/allOf/0/title",
            "absoluteKeywordLocation": "file:schema-ann2.json#/not/allOf/0/title",
            "instanceLocation": "",
            "valid": true
        },
        {
            "keywordLocation": "/not/allOf/1/required",
            "absoluteKeywordLocation": "file:schema-ann2.json#/not/allOf/1/required",
            "instanceLocation": "",
            "valid": true
        },
        {
            "keywordLocation": "/not/allOf/1/title",
            "absoluteKeywordLocation": "file:schema-ann2.json#/not/allOf/1/title",
            "instanceLocation": "",
            "valid": true
        }
    ]
}

@ssilverman
Copy link
Member

ssilverman commented Aug 5, 2020

@Relequestual there's no annotations in that link you mentioned (https://json-schema.org/draft/2019-09/json-schema-core.html#rfc.section.10.4.4). (I'm not referring to valid/invalid as "annotations". When I think "annotations", I think "collected annotations, for example 'format' or 'title', or even 'properties.")

Update: Oh, I see what you mean. Does the spec even mention "annotations" with an example outside of that .json file? (Yes, I know it calls "passing validation" "annotation", but there's no examples, so I just presumed to make my own format.)

@gregsdennis
Copy link
Member

I didn't include an annotations example in the spec. I assumed it would be trivial given the errors example. Probably a wrong assumption...

@ssilverman what do you mean by "non-failing errors?" Is this like an anyOf where some of the items fail?

@relequestral for your example, I would expect errors for the missing a and b plus both titles in a full verbose, but excluding the a/b title would probably be acceptable. Would definitely have the title for the c/d subschema even though the whole thing fails.

@ssilverman
Copy link
Member

ssilverman commented Aug 6, 2020

@gregsdennis Since "errors" is used in all the examples, even for "valid=true" errors, my understanding that an "error" is the result of a schema (or keyword) validation. It's a little confusing because "annotation" is supposed to be the name of a "passing schema" "error annotation". I've always wondered about this redundancy/apparent contradiction.

TLDR:
Here's the object model in my implementation:

  • An "annotation" is an object containing:
    • A name
    • Some value
    • An instance/schema location
    • A validity flag
  • An annotation's validity is true if it should be included, and false if it should be discarded, for example for a failing subschema. Note that "validity" here indicates processing context and not "schema validation result". This concept is not related to the "valid" property under "errors" from the specification.
  • I've split error collection and annotation collection because it made more sense to me. "Errors" are the result of a schema validation, internally implemented as an "annotation" object having a "validation result" as its value.
  • A "validation result" is either true or false with an accompanying message. For example, if the application of a "required" keyword in a schema failed, then the message would be something like, "required properties not found: \"a\", \"b\"". (See one of the pruned results in resolve ambiguity regarding annotation inclusion in failed validation results #939 (comment).)
  • An error's name is "annotation" if the schema result is true and "error" if the schema result is false, per the specification. This is what I think confuses things.

Through some experimentation with real-world examples, I sought to include relevant errors and exclude errors that weren't relevant. With annotations, I sought to do the same thing, but following the spec’s suggestions that subschema annotation results needn't be included.

Here's my logic of how I collect annotations:

  • Annotations are collected when a keyword needs to collect them. For example, "title", "format", "properties", etc.
  • When the validation result of a schema is false, then all annotations applied to the current instance location, and from any dynamic keyword location whose path starts with the current dynamic keyword location, are either invalidated ("validity", above) or removed, depending on the options settings (eg. https://github.com/ssilverman/snowy-json/blob/master/README.md#option-collect_annotations_for_failed).
  • Errors are collected for all keywords. An "error", recall, is an internal annotation object having a "validation result" as a value.
  • When the validation result of a schema is true, then all errors applied to the current instance location plus all instance location children, and from any dynamic keyword location whose path starts with the current dynamic keyword location, and whose validation result is false, are invalidated. The reasoning behind this is that if a schema (or subschema) passes, and if a developer needs to debug, they may not need to drill down into all the failing things inside that schema. This logic made it way easier, at least for my examples, to find schema errors because there's lots less noise. The information is still there, though; no errors are discarded, just invalidated. In a future post, or if one of you reaches out to me, I can demonstrate with an example why I arrived at this logic. i.e. Maximum information and minimum noise.

To summarize: I think of an "annotation" as extra information and an "error" as a schema or keyword validation result. Internally, I implement an "error" as an "annotation" having a "validation result" object. Splitting errors and annotations up this way made more sense to me and helped me implement things. I really believe "annotation" and "error result" need to be thought of as separate, even though they may be implemented the same way or with the same structure. I believe my implementation was successful much quicker once I saw this difference. I'm curious if other implementers feel the same way or have the same thought process.

Code reference: https://github.com/ssilverman/snowy-json/blob/099c0a97bc6d128ad371cfe1906660712755fcf8/src/main/java/com/qindesign/json/schema/ValidatorContext.java#L1232

@Relequestual
Copy link
Member

@relequestral for your example, I would expect errors for the missing a and b plus both titles in a full verbose, but excluding the a/b title would probably be acceptable. Would definitely have the title for the c/d subschema even though the whole thing fails. - @gregsdennis

As I understand, you would need both, because *Of keywords apply all the subschemas in order to get all annotations.

@Relequestual
Copy link
Member

@ssilverman I'm reading the rest of your comment, but I was trying to say, "validity" is NOT an annotation. It's a possible property of the node.

@Relequestual
Copy link
Member

Annotations are ONLY ever attached to an instance. If it's not attached to the instance, it's not an annotation (yet).
The annotations are telling you something about the instance at that location.

Anything the "annotations" collected for nodes (of the output) where validation failed, are "here's the annotations you would have gotten if this node had validated successfully".

@ssilverman you can do anything you want internally, but the output formats are defined. They were designed so there would be uniformity across implementations. There will be tests (which you will currently fail) if you don't include the annotations in the verbose output, and if you include annotations which shouldn't be there.

I'd really like to stay on topic here.

@Relequestual
Copy link
Member

I'm going to summarise.

The spec says...

Schema objects that produce a false assertion result MUST NOT produce any annotation results, whether from their own keywords or from keywords in subschemas.

It does not say you MUST NOT collect the annotations, just that you must not produce them (in the output).

I would furthe argue that the annotations key for a node which has valid: false are NOT actually annotations results.

We COULD change the key for this to make it clearer.
OR we could just make the fact that, for valid: false, the values in annotations are NOT "annotation results attached to instance locations".

Thoughts @karenetheridge ?

@ssilverman
Copy link
Member

ssilverman commented Aug 6, 2020

@Relequestual yes, that was long-winded in my post. I thought it would be useful to see though process around how it’s possible to think about errors, annotations, and their collection. I’ve added “TLDR” after the first paragraph. I hope the point of the first paragraph wasn’t lost, though. :)

Another question I had was about the output for non-simple annotations. Since, for example, the annotation for “properties” (and related) is a set; how should that look? Right now, I’m just adding a JSON array. Alternatively, it could be mandated that the output is the string form of the annotation contents, but then it’s harder to keep implementations uniform. Or other values that are some implementation-defined non-simple type.

Bottom line is that I believe that more examples covering all the output cases would be helpful for implementers. That might help resolve these ambiguities. (To state the obvious.) But maybe this is all about what needs to go in those examples and I have some catching up to do... :)

@karenetheridge
Copy link
Member Author

karenetheridge commented Aug 11, 2020

@karenetheridge Do you think there's any merit in collecting non-annotations (I'm calling them non-annotations because they are never attached to an instance location) for failed validation processes?

If they're not associated with an instance location, then what are they? all we have is a path to the annotation keyword in the schema itself. Sure, we could collect them as we traverse the schema during evaluation, and not collect the same one twice if we happen to traverse the same schema path multiple times with different instance locations... however, even if don't short-circuit (e.g. don't traverse the other allOf branches once we hit the first failing branch) we can't be sure to collect them all -- as some subschemas are only traversed for certain data types (consider subschemas under "properties" when the instance is an array type), or even conditionals under an "if", "not" etc. But even then... I could maybe see an application having a use for such a thing... shrug? As an application developer I would probably just traverse the schema myself and extract all the annotation keywords I cared about.

f you agree there's no point, then I'm going to suggest we remove the part where it talks about "annotations for failed validations", unless @gregsdennis pipes up to explain what this meant...

I'd be in favour of this, as holding on to annotations after a subschema has been evaluated as false would make implementations much more complicated. It is difficult enough properly implementing a configurable short-circuit mode.

I haven't yet implemented the 'verbose' output format because it would require restructuring the entire logic of error and annotation collection (keeping everything no matter the result, and marking whether errors were part of an successful evaluation path, or annotations from a failed path). The 'detailed' format, by contrast, can be constructed after the fact, as it is just a matter of walking the flat list of errors or annotations and creating nesting using the path components in keywordLocation as a guide. I structured my API such that the output format could be requested after evaluation had taken place (although if output_format="flag" is specified up front, short-circuiting might be enabled which will limit the number of errors in the result). tldr I'd be happy to see the verbose format sent to the bitbucket in the sky.


@Relequestual there's no annotations in that link you mentioned (https://json-schema.org/draft/2019-09/json-schema-core.html#rfc.section.10.4.4). (I'm not referring to valid/invalid as "annotations". When I think "annotations", I think "collected annotations, for example 'format' or 'title', or even 'properties.")

I agree with @gregdennis that the similarity between annotations and errors are fairly clear. The accompanying output schema backs this up. It would certainly help if the test suite had some output examples to work from, but we can add a few more to the spec as well.

My take: Once a subschema, or even a particular keyword within a subschema, has a failing result, all annotations collected below that point are moot and thrown away. Errors are easier, because once a failing result has been collected at this keyword (or below) it can't become successful again, with the exception of 'contains'+'minContains' or 'oneOf', where even when we have a failing result we need to hold out for the possibility that the overall result may be true and the collected errors at this point aren't relevant.

I implement my 'error' and 'annotation' classes very similarly - they have instance and schema locations attached to them, and the actual error/annotation value themselves (where with error, it's always a string, and with annotations the value could be anything). There may be additional data attached to the error/annotation internally (e.g. I also collect the keyword itself, even though it can be derived from keywordLocation, because the caller can use it when searching for annotations by instance location + keyword later on when provided with the results in object format rather than serialized to json).

Using the allOf+not+title example provided earlier, this is what my implementation does:

$ perl -MJSON::Schema::Draft201909 -MJSON::MaybeXS -wle'print JSON::MaybeXS->new(canonical=>1,pretty=>1)->encode(JSON::Schema::Draft201909->new(collect_annotations=>1, output_format=>"basic")->evaluate(
{}, 
{ 
  "\$id"=>"http://github.com/json-schema-org/json-schema-spec/issues/939", 
  allOf=>[{not=>JSON::PP::true, title=>"The Title"}] 
}, 
)->TO_JSON)'
{
   "errors" : [
      {
         "absoluteKeywordLocation" : "http://github.com/json-schema-org/json-schema-spec/issues/939#/allOf/0/not",
         "error" : "subschema is valid",
         "instanceLocation" : "",
         "keywordLocation" : "/allOf/0/not"
      },
      {
         "absoluteKeywordLocation" : "http://github.com/json-schema-org/json-schema-spec/issues/939#/allOf",
         "error" : "subschema 0 is not valid",
         "instanceLocation" : "",
         "keywordLocation" : "/allOf"
      }
   ],
   "valid" : false
}

and, flipping that 'not' condition so it results in a true result:

$ perl -MJSON::Schema::Draft201909 -MJSON::MaybeXS -wle'print JSON::MaybeXS->new(canonical=>1,pretty=>1)->encode(JSON::Schema::Draft201909->new(collect_annotations=>1, output_format="basic")->evaluate(
{}, 
{ 
  "\$id"=>"http://github.com/json-schema-org/json-schema-spec/issues/939", 
  allOf=>[{not=>JSON::PP::false, title=>"The Title"}] 
}, 
)->TO_JSON)'
{
   "annotations" : [
      {
         "absoluteKeywordLocation" : "http://github.com/json-schema-org/json-schema-spec/issues/939#/allOf/0/title",
         "annotation" : "The Title",
         "instanceLocation" : "",
         "keywordLocation" : "/allOf/0/title"
      }
   ],
   "valid" : true
}

Another question I had was about the output for non-simple annotations. Since, for example, the annotation for “properties” (and related) is a set; how should that look?

A set is an unordered list. So for this evaluation:

perl -MJSON::Schema::Draft201909 -MJSON::MaybeXS -wle'print JSON::MaybeXS->new(canonical=>1,pretty=>1)->encode(JSON::Schema::Draft201909->new(collect_annotations=>1)->evaluate(
{"hello",1,"prop2"=>1}, 
{ 
  "\$id"=>"http://github.com/json-schema-org/json-schema-spec/issues/939", 
  properties=>{"hello"=>{title=>"a title"},"prop2"=>{}} 
}, 
)->TO_JSON)'
{
   "annotations" : [
      {
         "absoluteKeywordLocation" : "http://github.com/json-schema-org/json-schema-spec/issues/939#/properties/hello/title",
         "annotation" : "a title",
         "instanceLocation" : "/hello",
         "keywordLocation" : "/properties/hello/title"
      },
      {
         "absoluteKeywordLocation" : "http://github.com/json-schema-org/json-schema-spec/issues/939#/properties",
         "annotation" : [
            "hello",
            "prop2"
         ],
         "instanceLocation" : "",
         "keywordLocation" : "/properties"
      }
   ],
   "valid" : true
}

@handrews
Copy link
Contributor

handrews commented Apr 28, 2021

@Relequestual

The spec says...

Schema objects that produce a false assertion result MUST NOT produce any annotation results, whether from their own >> keywords or from keywords in subschemas.

It does not say you MUST NOT collect the annotations, just that you must not produce them (in the output).

I never anticipated anyone splitting that particular hair! "collecting" to me is stuff that happens from a given schema object down (or kind of the whole process, I never defined it precisely), and "producing" is that schema handing the collection up to its parent. Of course you do some collection even on a schema object that fails, as you might not be able to tell if it fails until after evaluating some subschemas. Once a schema fails, you "drop" the collected annotations in the sense of not passing them to the parent alongside the false validation result (a.k.a. do not "produce" them).

@handrews
Copy link
Contributor

Is any part of this just a clarification that can be put in, or does it require an actual conformance-impacting change?

@gregsdennis
Copy link
Member

However, this is contradicted by other parts of the spec
(https://json-schema.org/draft/2019-09/json-schema-core.html#rfc.section.10.4.4):

This includes sub-schema validation results that would otherwise be removed (e.g. annotations for failed validations, successful validations inside a not keyword, etc.).

A bit more clarification here: the inclusion of annotations for failed validations was not so that they could be processed later by subsequence validations but rather so that any human reader (which is the assumption of the verbose format) would be able to properly debug the validation. This inclusion is for (😏) annotation purposes only.

Contrast this with https://json-schema.org/draft/2019-09/json-schema-core.html#rfc.section.7.7.1.2:

Schema objects that produce a false assertion result MUST NOT produce any annotation results, whether from their own keywords or from keywords in subschemas.

which is specifically about the validation process itself. These two are not really in conflict as they target different uses.

However, yes, this does need to be clarified. With the latest proposed changes to the output, this language will likely come out anyway.

@handrews
Copy link
Contributor

However, yes, this does need to be clarified. With the latest proposed changes to the output, this language will likely come out anyway.

Is this specific change being tracked somewhere? I've just found myself sorting through this long conversation for about the 4th or 5th time only to realize that there doesn't seem to be anything else that needs doing. If it's not being tracked, can we file it separately and close this?

@handrews
Copy link
Contributor

@gregsdennis meant to ping you on that last comment

@gregsdennis
Copy link
Member

It appears that this text was replaced with simply

All output units are included in this format.

I've opened ☝️ to address this with the new output format.

In specific regard to annotations, any output unit that's included should retain any errors/annotations that were created as a part of that subschema's evaluation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

No branches or pull requests

6 participants