`contentSchema` tests inadequate #609

gregsdennis · 2022-11-18T00:00:17Z

I'm not sure that the contentSchema are fully testing what they intend to.

For example:

file: content.json (any draft)
case: validation of binary-encoded media type documents with schema
test: a valid base64-encoded JSON document
expected: valid

Schema:

 {
  "$schema": "https://json-schema.org/draft/next/schema",
  "contentMediaType": "application/json",
  "contentEncoding": "base64",
  "contentSchema": {
    "required": [
      "foo"
    ],
    "properties": {
      "foo": {
        "type": "string"
      }
    }
  }
}

Instance

"eyJmb28iOiAiYmFyIn0K"

(instance decodes to {"foo": "bar"})

The subschema at /contentSchema (based on the keywords present) seems to be expecting an object. However, if an implementation ignores the encoding and media type and just validates this as a string, the test incorrectly passes. (Mine does this.)

I suggest we need to add a "type": "object" restriction to the subschema to ensure that it requires an object and doesn't just validate the base64 string.

The text was updated successfully, but these errors were encountered:

Julian · 2022-11-22T17:48:16Z

Sounds reasonable to me.

gregsdennis · 2022-11-22T21:23:47Z

Actually I'm starting to question whether this needs to be in the requires test set (could be in optional).

The spec (Section 8.2) says:

Implementations MAY offer the ability to decode, parse, and/or validate the string contents automatically. However, it MUST NOT perform these operations by default, and MUST provide the validation result of each string-encoded document separately from the enclosing document. This process SHOULD be equivalent to fully evaluating the instance against the original schema, followed by using the annotations to decode, parse, and/or validate each string-encoded document.

This effectively means that contentSchema just produces the schema as an annotation, and leaves decoding the data and evaluating it against the contentSchema schema to the consumer.

Julian · 2022-11-22T21:27:59Z

Right, I thought you knew that part :) and that the point was that if an implementation for some reason does do something with contentSchema (incorrectly) that it would still pass this test because it might consider the thing to be a string.

handrews · 2022-11-22T21:41:46Z

In draft 2019-09 onwards the content* keywords are strictly annotations. There is an annoyingly confusing option to automatically run a second validation using those annotations, which few people understood and has since been removed.

In draft-07, content* (which I think did not yet include contentSchema) could optionally function as assertions, like format. This was removed because it's an absurdly enormous security problem.

The test schema and instance are correct, although a "type": "object" in the contentSchema would prevent the unexpected pass that @gregsdennis noticed.

gregsdennis · 2022-11-22T21:45:29Z

the point was that if an implementation for some reason does do something with contentSchema (incorrectly) that it would still pass this test - @Julian

But given that the requirement is "MUST NOT perform these operations by default," doesn't that mean that we need to test that the implementation isn't evaluating the schema? I'm not sure that this test does that.

I think we'd need content that doesn't pass the contentSchema schema, but the overall schema passes anyway. (For the above example, the content could decode to {"foo": 42} which doesn't pass the schema.)

Julian · 2022-11-22T22:01:24Z

I'm not sure who's responding to who with what from the last two comments even though you quoted me :D

I was trying to say the same thing Henry said though, which is there's no test we can write with a result other than valid: true (because content* does nothing in drafts other than 7) but that if you were suggesting we strengthen the test such that an implementation which incorrectly does assert using content* fails this test and not unexpectedly passes it (by adding "type": object or however you suggest doing it) that that's ok with me (though I think it's very unlikely it matters, since implementations should really have trivial implementations that do absolutely nothing, so it's hard for them to have bugs here, but I'm fine with it if you care to).

Is that different than your understanding, sounds like no now right? Basically yeah if you want to strengthen the test sure go for it, as you say, the way to do so is to have content that doesn't pass the contentSchema, and then assert that it passes anyhow..

gregsdennis · 2022-11-22T22:13:23Z

When I created the issue, I had initially misinterpreted the requirement thinking that I had to decode and validate the content (because the test has content that is valid), and the contentSchema schema is insufficient to do that as it allows non-objects.

Reading the spec more recently, I realize that I am explicitly NOT to decode and validate the content. But the content is valid against the contentSchema schema, so it's not really testing that I'm not decoding and validating the content. It's not really doing anything.

If we change the content to encode {"foo": 42} (or any other JSON data that would fail the contentSchema schema), then we can be sure that the implementation is NOT decoding and validating the content, as per the requirement.

(Sorry the direction of this issue changed after I educated myself and thought about it a bit.)

Julian · 2022-11-22T22:16:22Z

Right I think we're all saying the same thing now? Agree, change the schema, change the instance, whichever.

karenetheridge · 2022-11-23T04:17:10Z

I was trying to say the same thing Henry said though, which is there's no test we can write with a result other than valid: true (because content* does nothing in drafts other than 7)

Yes!

When I created the issue, I had initially misinterpreted the requirement thinking that I had to decode and validate the content (because the test has content that is valid), and the contentSchema schema is insufficient to do that as it allows non-objects.

Such a test would be great in optional/, where the keywords were treated as assertions as well as annotations.

gregsdennis · 2022-11-23T07:12:06Z

I noted in the PR that there's already a test that does this what I'm looking for. I added the type to be a bit more explicit.

handrews · 2022-11-23T21:11:39Z

Such a test would be great in optional/, where the keywords were treated as assertions as well as annotations.

Unfortunately the assertion behavior is only optional in draft-07 and earlier, which do not include contentSchema. In 2019-09 and later, assertion behavior is non-compliant.

Julian added the bug A test is wrong, or tooling is broken or buggy. label Nov 22, 2022

gregsdennis mentioned this issue Nov 22, 2022

added type:object to contentSchema schemas #618

Merged

Julian closed this as completed in #618 Nov 28, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`contentSchema` tests inadequate #609

`contentSchema` tests inadequate #609

gregsdennis commented Nov 18, 2022

Julian commented Nov 22, 2022

gregsdennis commented Nov 22, 2022

Julian commented Nov 22, 2022

handrews commented Nov 22, 2022

gregsdennis commented Nov 22, 2022 •

edited

Loading

Julian commented Nov 22, 2022 •

edited

Loading

gregsdennis commented Nov 22, 2022 •

edited

Loading

Julian commented Nov 22, 2022

karenetheridge commented Nov 23, 2022

gregsdennis commented Nov 23, 2022

handrews commented Nov 23, 2022

contentSchema tests inadequate #609

contentSchema tests inadequate #609

Comments

gregsdennis commented Nov 18, 2022

Julian commented Nov 22, 2022

gregsdennis commented Nov 22, 2022

Julian commented Nov 22, 2022

handrews commented Nov 22, 2022

gregsdennis commented Nov 22, 2022 • edited Loading

Julian commented Nov 22, 2022 • edited Loading

gregsdennis commented Nov 22, 2022 • edited Loading

Julian commented Nov 22, 2022

karenetheridge commented Nov 23, 2022

gregsdennis commented Nov 23, 2022

handrews commented Nov 23, 2022

`contentSchema` tests inadequate #609

`contentSchema` tests inadequate #609

gregsdennis commented Nov 22, 2022 •

edited

Loading

Julian commented Nov 22, 2022 •

edited

Loading

gregsdennis commented Nov 22, 2022 •

edited

Loading