Add output formatting tests #247

Closed
gregsdennis opened this issue Jan 7, 2019 · 20 comments · Fixed by #265
Labels: missing test (A request to add a test to the suite that is currently not covered elsewhere.)
Milestone: 2019-09

Comments

@gregsdennis
Member

gregsdennis commented Jan 7, 2019

In light of draft2019-09's output formatting, I think it might be prudent to have a set of tests that validate the output formatting.


Maybe this can be included in the draft2019-09 tests themselves (when they're added).

Something like:

Example test from draft-07's *properties.json*

{
  "description": "object properties validation",
  "schema": {
    "properties": {
      "foo": {"type": "integer"},
      "bar": {"type": "string"}
    }
  },
  "tests": [
    {
      ...
    },
    {
      "description": "one property invalid is invalid",
      "data": {"foo": 1, "bar": {}},
      "valid": false,
      "outputs": {
        "flag": {
          "valid": false
        },
        "basic":{
          "valid" : false,
          "errors" : [
            {
              "valid" : false,
              "keywordLocation" : "#/properties",
              "instanceLocation" : "#"
            },
            {
              "valid" : false,
              "keywordLocation" : "#/properties/foo/type",
              "instanceLocation" : "#/foo"
            },
            {
              "valid" : false,
              "keywordLocation" : "#/properties/bar/type",
              "instanceLocation" : "#/bar"
            }
          ]
        },
        "detailed": {
          "valid" : false,
          "keywordLocation" : "#/properties",
          "instanceLocation" : "#",
          "errors" : [
            {
              "valid" : false,
              "keywordLocation" : "#/properties/foo/type",
              "instanceLocation" : "#/foo"
            },
            {
              "valid" : false,
              "keywordLocation" : "#/properties/bar/type",
              "instanceLocation" : "#/bar"
            }
          ]
        },
        "verbose": {
          "valid" : false,
          "keywordLocation" : "#",
          "instanceLocation" : "#",
          "errors" : [
            {
              "valid" : false,
              "keywordLocation" : "#/properties",
              "instanceLocation" : "#",
              "errors" : [
                {
                  "valid" : false,
                  "keywordLocation" : "#/properties/foo",
                  "instanceLocation" : "#/foo",
                  "errors" : [
                    {
                      "valid" : false,
                      "keywordLocation" : "#/properties/foo/type",
                      "instanceLocation" : "#/foo"
                    }
                  ]
                },
                {
                  "valid" : false,
                  "keywordLocation" : "#/properties/bar",
                  "instanceLocation" : "#/bar",
                  "errors" : [
                    {
                      "valid" : false,
                      "keywordLocation" : "#/properties/bar/type",
                      "instanceLocation" : "#/bar"
                    }
                  ]
                }
              ]
            }
          ]
        }
      }
    },
    {
      ...
    }
  ]
}

Doing this on all the tests, though, seems like overkill. I think a set of tests specifically for output would be more concise.

@vearutop

I'd like to have sample outputs for *Of, if, then, else and not.

If we have a reference implementation we can easily create validation results for all test cases, which can be helpful before we have a dedicated output test suite.

I think it makes sense to stick to the basic test suite instead of a dedicated output test suite, as it is already mature and has good overall coverage.

@Relequestual Relequestual pinned this issue Jun 5, 2019
@Julian Julian reopened this Jun 30, 2019
@Julian Julian added the missing test and draft support labels Sep 27, 2019
@handrews handrews changed the title Draft-08 output formatting tests Draft-08 / 2019-09 output formatting tests Nov 11, 2019
@handrews handrews added this to the 2019-09 milestone Nov 11, 2019
@Julian Julian changed the title Draft-08 / 2019-09 output formatting tests Draft2019-09 output formatting tests Nov 29, 2019
@Julian Julian changed the title Draft2019-09 output formatting tests Add output formatting tests Jun 28, 2022
@gregsdennis
Member Author

Now that the current format has been out for a while, and we've received a lot of good feedback about it (which was the intent behind not making it a hard requirement), we're looking to update it for the next iteration of the spec. Discussion can be found here.

The question I'd like to look into here is whether this test suite needs to include tests for the 2019-09 / 2020-12 formats and the new format (whatever that ends up being). Based on the current test structure, I expect so, but it's still worth asking.

@Julian
Member

Julian commented Jul 26, 2022

I have to review that thread -- it's crossed my eye but I haven't tried to understand the change yet -- regardless, I think I agree with the point.

whether this test suite needs to include tests for the 2019-09 / 2020-12 formats and the new format (whatever that ends up being).

I think this is what you're suggesting too, but yeah we should have the 2019-09 and 2020-12 tests include output tests in the form defined "back then", and future drafts in the newer format once it's settled on in that thread.

@gregsdennis
Member Author

A nuance to this is that I'm thinking of including some language in the next iteration of the spec around implementations only needing to support output from the latest spec version. I think this reduces the burden for implementors quite a bit, but it does mean that a 2020-12 validation could output in the "draft-next" format if "draft-next" is supported.

My concern is that if the 2020-12 output is included, that means implementations which follow this advice won't pass the older format tests.

This also hints at https://github.com/orgs/json-schema-org/discussions/192 regarding deprecating older versions of the spec.

@Julian
Member

Julian commented Jul 29, 2022

That'll potentially be a problem going forward too, right? If we put draft-next output in draft-next, an implementation of draft-next-4 may eventually not be able to run them because it again changes the format. We may (or may eventually) just need to make a helper script which back- or forward-ports the output dynamically? But I'm definitely ok to start simple.

@gregsdennis
Member Author

I'm not sure how comprehensive we want to be with the test suite (e.g. do we want a test that captures whether title was collected as an annotation, or do we just want "this is what annotation output looks like?"), but if we have a 2020-12 output suite, then a 2023-XX output suite, I expect implementors could just grab the ones they conform to.

Sounds like this might also be related to deprecating spec versions and possibly even moving output to its own spec.

@Julian
Member

Julian commented Jul 29, 2022

To me probably both, and they do independent things:

"this is what annotation output looks like?"

That tests that an implementation generates the output format correctly from a known set of errors or annotations

do we want a test that captures whether title was collected as an annotation

and that tests that the implementation has properly generated annotations themselves as specified by the spec for each keyword.

I'd do the former before the latter since it should be a smaller amount of work, but yeah, it seems to me we should have both, no?

@gregsdennis
Member Author

Yeah, I agree. Start with "shape of annotations" by having e.g. title tests, but then continue to fill out other keywords over time.

@gregsdennis
Member Author

gregsdennis commented Aug 1, 2022

I'm starting to play with this a bit, and it seems like we need two kinds of tests:

  1. Output structure - Does the implementation build the expected output structure? These tests can be contrived so as to create sufficiently nested output structures so that each format can be checked. For these, we're not checking keywords, per se.
  2. Keyword output - Does each keyword produce the expected annotation or error? Because we don't specify wording for errors, we're not worried about the error message itself, just whether there should be one.

There's obviously going to be some overlap here (and with the main suite), but by separating these out, I think we can create different test files that focus on different things. It'll help keep things straight in our heads at least.

1. Output structure tests

For the output structure tests, I think we can probably have a file structure like

[
    {
        "description": "valid results",
        "schema": {
            "type": "object",
            "properties": {
                "foo": { "type": "string" }
            }
        },
        "tests": [
            {
                "description": "empty object is valid",
                "data": {},
                "flag": { "valid": true },
                "basic": {
                    ...
                },
                "detailed": {
                    ...
                },
                "verbose": {
                    ...
                }
            },
            {
                "description": "object with string foo is valid",
                "data": { "foo": "bar" },
                "flag": { "valid": true },
                "basic": {
                    ...
                },
                "detailed": {
                    ...
                },
                "verbose": {
                    ...
                }
            },
            ...
        ]
    },
    {
        "description": "invalid results",
        "schema": {
            "type": "object",
            "properties": {
                "foo": { "type": "string" }
            }
        },
        "tests": [
            ...
        ]
    }
]

"valid results" and "invalid results" are probably not the best split, but... example.

Essentially, you have multiple instances per schema that produce different annotation/error sets, and you list all of the formats for each one. This is very similar to the existing test files, so it should be familiar.
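
For illustration only, here's a rough sketch of how a harness might consume a file shaped like this. The validate(schema, instance, output_format) entry point and the set of supported formats are hypothetical stand-ins for whatever a given implementation actually exposes:

import json

# Assumption: whichever of flag/basic/detailed/verbose the implementation supports.
SUPPORTED_FORMATS = {"flag", "basic"}

def run_output_file(path, validate):
    """Run every test in an output-structure file against a hypothetical
    validate(schema, instance, output_format) entry point."""
    with open(path) as f:
        cases = json.load(f)
    for case in cases:
        for test in case["tests"]:
            for fmt in SUPPORTED_FORMATS & test.keys():
                actual = validate(case["schema"], test["data"], output_format=fmt)
                # Plain equality; error-message wording (discussed below) would
                # make this too strict for failing cases.
                assert actual == test[fmt], (
                    f"{case['description']}: {test['description']} ({fmt})"
                )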

2. Keyword output tests

For the keyword output tests, we'll again do something similar to the existing tests.

[
    {
        "description": "properties outputs annotations",
        "schema": {
            "$id": "https://json-schema.org/tests/properties",
            "properties": {
                "foo": { "type": "string" }
            }
        },
        "tests": [
            {
                "description": "empty object has no annotations",
                "data": {},
                "basic": {
                    "valid": true,
                    "keywordLocation": "",
                    "absoluteSchemaLocation": "https://json-schema.org/tests/properties#",
                    "instanceLocation": ""
                }
            },
            {
                "description": "object with string foo captures property as annotation",
                "data": { "foo": "bar" },
                "basic": {
                    "valid": true,
                    "keywordLocation": "",
                    "absoluteSchemaLocation": "https://json-schema.org/tests/properties#",
                    "instanceLocation": "",
                    "annotations": [
                        {
                            "valid": true,
                            "keywordLocation": "/properties",
                            "absoluteSchemaLocation": "https://json-schema.org/tests/properties#/properties",
                            "instanceLocation": "",
                            "annotation": [ "foo" ]
                        }
                    ]
                }
            },
            {
                "description": "object with unspecified property captures no annotations",
                "data": { "bar": "foo" },
                "basic": {
                    "valid": true,
                    "keywordLocation": "",
                    "absoluteSchemaLocation": "https://json-schema.org/tests/properties#",
                    "instanceLocation": ""
                }
            }
        ]
    },
    {
        "description": "properties outputs errors",
        "schema": {
            "$id": "https://json-schema.org/tests/properties",
            "properties": {
                "foo": { "type": "string" }
            }
        },
        "tests": [
            {
                "description": "object with number foo is invalid",
                "data": { "foo": 42 },
                "basic": {
                    "valid": false,
                    "keywordLocation": "",
                    "absoluteSchemaLocation": "https://json-schema.org/tests/properties#",
                    "instanceLocation": "",
                    "errors": [
                        {
                            "valid": false,
                            "keywordLocation": "/properties",
                            "absoluteSchemaLocation": "https://json-schema.org/tests/properties#/properties",
                            "instanceLocation": "",
                            "error": "expected a string but found a number"
                        }
                    ]
                }
            }
        ]
    }
]

Here, instead of listing out all of the possible output formats, we only use the basic format because it's the simplest format that gives us annotations & errors.

Again, the focus for these is whether the right data comes out of the keyword rather than the shape of the data.

Having an error message here is a little concerning. For one thing, we're not trying to define any error messaging (the spec explicitly avoids it), but having something here may imply that we're endorsing this messaging as a standard. Secondly, implementations that use different messaging would have a hard time verifying that the output they produce is equivalent to what's expected by the test.

@Julian
Copy link
Member

Julian commented Aug 2, 2022

Broadly all the above looks good to me. Minor bikeshedding: I'd slightly prefer an extra level of nesting, i.e.:

        "tests": [
            {
                "description": "empty object is valid",
                "data": {},
                "output": {
                    "flag": { "valid": true },
                    "basic": {
                        ...
                    },
                    "detailed": {
                        ...
                    },
                    "verbose": {
                        ...
                    }
                }
            }

to keep things a bit better grouped.

Here, instead of listing out all of the possible output formats, we only use the basic format because it's the simplest format that gives us annotations & errors.

I don't know a way around this but this effectively forces implementations into supporting the basic format to use the tests. That's not currently the case, right (I'm being lazy and not looking)? By which I mean an implementation can choose to support the detailed format and not basic? If so, there's a small argument to be made to use the most expressive format because we could always then downconvert it to basic. But I won't bikeshed this too much so ignore it if need be.
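
For what it's worth, a naive sketch of that downconversion, using the 2019-09-style structure from the example at the top of the thread: walk a hierarchical (detailed/verbose) result and collect every nested error unit into a flat, basic-style list. Whether intermediate units should be kept or dropped (and how annotations get the same treatment) is exactly the sort of detail a real converter would have to pin down, so this is purely illustrative:

def flatten(unit, collected=None):
    """Collect a hierarchical output unit and all of its descendants into a
    flat list, dropping the nesting. Naive: keeps every unit it encounters."""
    if collected is None:
        collected = []
    collected.append({k: v for k, v in unit.items() if k != "errors"})
    for child in unit.get("errors", []):
        flatten(child, collected)
    return collected

def to_basic(hierarchical):
    """Naively downconvert a detailed/verbose error result to a basic-style one."""
    return {"valid": hierarchical["valid"], "errors": flatten(hierarchical)}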

Having an error message here is a little concerning.

Definitely agree with this too. Do we have to include it? Even if so, presumably we could just set all of them to "< error message here >", just so as not to explicitly bless anything?

@gregsdennis
Member Author

gregsdennis commented Aug 2, 2022

The spec says this:

An implementation SHOULD provide at least one of the "flag", "basic", or "detailed" format and MAY provide the "verbose" format. If it provides one or more of the "detailed" or "verbose" formats, it MUST also provide the "flag" format. Implementations SHOULD specify in their documentation which formats they support.

It's not stated well, but as is, nothing is really required. Increasing that requirement is part of the change (eventually) for the next version. (The MUST requirement was basically me saying, "if you put in the effort to write output formats, you might as well also add the flag version.")

The reason I picked basic is that, from my experience, most people seem to prefer the list. It's not set in stone, but you do have a good point that using it here kind of implies that it needs to be implemented.

I'd slightly prefer an extra level of nesting

Yeah, this makes sense, but I'd want that in both test sets to reduce modelling needs (especially for strongly-typed languages 😄).

Do we have to include [the error message]?

I think it's important to identify which keywords are expected to have some sort of message, regardless of the wording. I think a placeholder could do to say that we don't endorse any specific wording, but it doesn't resolve the "equality" checking burden.
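
One hypothetical way to ease that burden is a comparison helper that checks every field literally except error, which it only checks for presence, so implementation-specific wording still matches. A sketch, not a proposal for the suite itself:

def matches(expected, actual):
    """Deep-compare expected vs. actual output, treating "error" as
    presence-only so implementation-specific wording still passes."""
    if isinstance(expected, dict):
        if not isinstance(actual, dict):
            return False
        for key, value in expected.items():
            if key == "error":
                if not isinstance(actual.get("error"), str):
                    return False
            elif key not in actual or not matches(value, actual[key]):
                return False
        return True
    if isinstance(expected, list):
        return (isinstance(actual, list)
                and len(expected) == len(actual)
                and all(matches(e, a) for e, a in zip(expected, actual)))
    return expected == actual

It still assumes errors arrive in the same order, which is part of what the schema-based idea below gets around.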

@gregsdennis
Member Author

Another thing we're considering is moving the output into its own spec. That would detach the output from the draft version.

How would that impact this suite? Would we have a separate folder for the output tests, or would it warrant a separate repo entirely?

@Julian
Member

Julian commented Aug 5, 2022

Either would be fine with me honestly I think.

@karenetheridge
Member

You're not going to be able to specify the exact structure of the expected result. There are too many ways that an exact comparison will not be correct. Even if you're only expecting a single error, the error message itself cannot be compared because every implementation will have its own. All the other fields (keywordLocation, instanceLocation, etc.) can be compared exactly though, as there is no room for ambiguity there.

I suggest you indicate the expected result structure not with literal data, but with JSON Schemas. That way you can simply say "the error property must be a string" and leave it at that, and also allow for differences in ordering between errors by having a list of "contains": {...} schemas inside an allOf.
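
A sketch of what that could look like for the failing properties case from earlier in the thread, written against 2020-12; the pointers and required fields here are placeholders rather than a settled convention:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$comment": "Illustrative only: describes an expected 'basic' result without fixing error wording or ordering.",
  "type": "object",
  "required": ["valid", "errors"],
  "properties": {
    "valid": { "const": false },
    "errors": {
      "type": "array",
      "allOf": [
        {
          "contains": {
            "required": ["keywordLocation", "instanceLocation", "error"],
            "properties": {
              "keywordLocation": { "const": "/properties/foo/type" },
              "instanceLocation": { "const": "/foo" },
              "error": { "type": "string" }
            }
          }
        },
        {
          "contains": {
            "required": ["keywordLocation", "instanceLocation", "error"],
            "properties": {
              "keywordLocation": { "const": "/properties/bar/type" },
              "instanceLocation": { "const": "/bar" },
              "error": { "type": "string" }
            }
          }
        }
      ]
    }
  }
}

That keeps the location checks exact while leaving the message wording (and the order of the errors) entirely up to the implementation.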

@gregsdennis
Member Author

Even if you're only expecting a single error, the error message itself cannot be compared because every implementation will have their own.

Yeah, we anticipate this. Not sure how to go about it, but there are some ideas above.

I like the idea of using schemas though.

@gregsdennis
Member Author

So.... The 2019/2020 output requirements are... not great. I'm not even sure that they're specified to a degree that can be tested.

@Julian what are your thoughts on giving those a pass and introducing output tests for draft-next?

@handrews
Contributor

They're SHOULDs and MAYs, what's wrong with them?

@gregsdennis
Member Author

They're SHOULDs and MAYs, what's wrong with them?

Are you suggesting that since they're not MUSTs, tests aren't necessary, and thus agreeing with me that we shouldn't include tests for them?


First and foremost, I don't think that I implemented them 100% right, and I used my implementation while writing the requirements. In trying to write even a few tests, I'm questioning the output that my implementation is giving me as a reference.

I've also been recommending (recently) that people hold off implementing output until draft-next is released.

@handrews
Contributor

No, I'm saying they're entirely testable, particularly SHOULDs which will at some point get their own directory.

We ought to test the format that is currently actually available.

@Julian
Member

Julian commented Jan 4, 2023

We should call this one closed for now by #619, yeah @gregsdennis? (Reopen, obviously, if you want to use it to track more work.)
