
Break optional/ into more granular directories. #590

Open · wants to merge 6 commits into base: main
Conversation

@Julian (Member) commented Aug 23, 2022

This PR breaks our existing optional/ directory into 4 new directories and sorts the existing tests among them:

may/ and should/ now represent explicitly recommended or allowed
behavior.

additional/ contains other files whose applicability isn't made
explicitly clear, or which require support for additional vocabularies.

alternatives/ contains "mutually contradicting" tests depending on
which choice was made by an implementation amongst a number of possible
options. In theory one could have should and may directories here, but things start to seem unnecessarily complex, so I've left this flat for now.

Each test in our previous optional/ directory is then sorted into one of
these directories.

The goal for the above is to be as objective as possible. The suite is, as I always say, not the spec, and not here to paper over things; which directory a particular file belongs in should be clear as day, or else the file belongs in the additional bucket until/unless the spec makes a recommendation about it.

In general this PR doesn't add any tests, except in the case of the alternatives directory where for 2 cases (content in draft 7 and format-assertion: false in 2020/19) we already had tests for one alternative and not the other.

Additional documentation for each file which links the file to the corresponding section of the spec (which hopefully outlines why or when it is optional) is left for a follow-up PR.

Clearly this is backwards incompatible for users of the suite unfortunately, but I've tried to keep the new layout as similar as possible while resolving some of the complaints about the word "optional" being unclear.
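One way a downstream harness can absorb a layout change like this is to classify files by directory at discovery time instead of hard-coding an optional/ prefix. The sketch below is hypothetical and not part of this PR; the directory names match the ones proposed here, but the harness, its function name, and its walking logic are illustrative only.

```python
from pathlib import Path

# Directory names proposed in this PR; everything else at the top level of
# a draft directory is treated as required. This harness is a hypothetical
# consumer, not anything the suite itself ships.
NON_REQUIRED_DIRS = ("may", "should", "additional", "alternatives")

def discover_test_files(draft_dir):
    """Yield (relative_path, kind) for every test file under one draft.

    Callers can then filter by kind ("required", "may", "should",
    "additional", "alternatives") instead of matching the hard-coded
    "optional/" prefix that this PR removes.
    """
    draft_dir = Path(draft_dir)
    for path in sorted(draft_dir.rglob("*.json")):
        rel = path.relative_to(draft_dir)
        kind = rel.parts[0] if rel.parts[0] in NON_REQUIRED_DIRS else "required"
        yield rel, kind
```

A harness written this way keeps working if files move between the four directories, and needs only a one-line change if a directory is added or renamed.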

Refs: #495 #25

@Julian requested a review from a team as a code owner, August 23, 2022 08:32
@Julian (Member, Author) commented Aug 23, 2022

Obviously, given that this PR essentially shuttles files around, and generally does the same shuffling for all drafts, I'd recommend reviewing where files land in a local checkout, and giving feedback on the directory layout itself, rather than staring at the GitHub UI.

@gregsdennis (Member):

It's a shame files can't be treated as object references, so you could avoid breaking everyone by just linking to... oh wait... ../optional/foo#tests/5/cases/2 😉

(Yeah, that wouldn't work.)

I like this change. I'll look at it in the morning.

@Julian (Member, Author) commented Aug 23, 2022

I considered symlinking everything back to where it was for a bit, but honestly our backwards compatibility policy is so poorly defined it's just not worth bothering :), even though I know you were being facetious anyhow. I wanted to get to #223 sometime before doing things like this, but I haven't had the head for writing up longer docs recently, and folks have begun to get in the way of changes to optional/, so it's time to do this regardless; maybe this is the right sort of motivation.

@handrews (Contributor):

I'd like to propose allowable rather than additional. additional just feels weirdly uninformative. allowable means "there is nothing in the spec that forbids this, but it's not a requirement either" which I think is the correct guidance. We could also go with not-forbidden 🙃 but allowable feels like the right balance of acknowledgement and non-endorsement.

@Julian (Member, Author) commented Aug 23, 2022

I specifically want something "uninformative", and disagree that what's there is simply "allowable". Take, say, a file like bignum.json, which deals with bignums: it's somewhat "obviously" correct to run in a language like Python, where the default integer type is a bignum, so why wouldn't someone use it? Whereas in a language without bignums, the spec says nothing about requiring someone to go find a bignum type to use, and it's probably equally "obviously" correct not to go change which type represents numbers.

For things the spec wants to make recommendations on, we should make them part of the spec; otherwise, I indeed don't want us making prescriptive choices here on things where the spec doesn't. It's easy enough not to pick a side.

@handrews (Contributor):

OK, I see your point about bignum. Digging a bit deeper here, it looks like additional is now doing several things:

  • environment-specific functionality (whether this test applies has more to do with the OS/programming language/runtime environment/library availability than it does with the spec):
    • bignum.json
    • float-overflow.json
    • non-bmp-regex.json
  • implementation-specific choices (spec notes the behavior is possible but MUST NOT be considered interoperable, or explicitly describes an implementation-defined area that allows the behavior):
    • cross-draft.json
    • no-schema.json
    • refOfUnknownKeyword.json
  • implementation-specific configurable choices
    • format-assertion/*.json, presumably encompassing:
      • <= draft-07: implementations with format support in the default configuration (assertions on)
      • draft 2019-09: default meta-schema (format vocabulary false) but assertion support explicitly enabled
      • draft 2020-12: default meta-schema (format-annotation vocabulary true) but assertion support explicitly enabled

And then under MAY you have dependencies-compatibility.json, which is not related to any specification language at all.

Of these, it's only dependencies-compatibility.json where I would agree with additional (and disagree with may). Unlike definitions, there is nothing at all (that I can find; let me know if I missed it) in the core or validation spec about continuing to support dependencies. Choosing to do so is a random choice by an implementor with no relationship to specification requirements. This is indeed "additional" in relationship to the specification.

As for the tests currently assigned to additional...

environmental

The environmental tests would be better off in an env-specific or similarly named directory. That would immediately convey what sort of tests they are, and why an implementation might choose to run them or skip them.

implementation-specific (non-configurable)

The implementation-specific choices are what I would expect to see in an allowable or alternatives directory. The specification allows them by mentioning the possible behavior, or explicitly noting an implementation-defined area that encompasses the behavior. The only reason I can see for not making these alternatives is that the test suite is not currently capable of handling the alternatives (namely that a reference fails to resolve, or that an implementation refuses to process a $schema-less schema or an unrecognized/unsupported $schema, both of which are runtime errors rather than validation failures).

I do not think that additional is a correct description of these. The behavior being tested is accounted for within the specification, so there is a well-defined relationship to specification requirements.

implementation-specific (configurable)

For format and content.json in draft-07 and earlier, since the assertion behavior is assumed to be on by default, those tests really fit in the implementation-specific bucket as well. I see you moved draft-07's content.json into alternatives, which seems correct. The behavior for format in draft-07 and earlier is specified the same as content in draft-07. They should go in the same location. If the concern is that we don't want to explicitly test the alternative behavior (which makes those tests similar to no-schema.json and refOfUnknownKeyword.json), I think that shows that splitting topics based on whether we can/want to test all alternatives is not ideal.

For format under additional in 2019-09 and later, these MUST fail in the default configuration (and we have required tests for that, which is great), but MAY pass if configured to support assertions. The behavior of format with the default meta-schema (for both 2019-09 and 2020-12) and assertions turned on is so vaguely specified that it's practically untestable:

When the implementation is configured for assertion behavior, it:

 SHOULD provide an implementation-specific best effort validation
 for each format attribute defined below;

 MAY choose to implement validation of any or all format attributes
 as a no-op by always producing a validation result of true;

So it's acceptable to implement a configuration option that turns on assertion behavior and then ignore it. We could leave them and assume people will run only the tests for formats that they validate, I guess, in which case they are in the same allowable/alternatives bucket as no-schema.json, etc., except that there's also an opt-in configuration involved. Otherwise, the tests are more useful with the format-assertion vocabulary. Either way, there is a well-defined relationship to specification requirements here (even if those requirements are the opposite of well-defined).

alternatives and vocabularies

For format from the format-assertion vocabulary under alternatives in 2019-09 and later, I don't think these really fit as they're not an implementation-specific choice in the same way as the ones discussed above. The format-assertion vocabulary is essentially an extension vocabulary that happens to be under the umbrella of the JSON Schema organization, in the sense that there is no requirement whatsoever in the spec to support it. We may adopt further extension vocabularies in the future.

It would be better to have a vocabularies directory, with a sub-directory per vocabulary containing tests for that vocabulary when enabled with true, as well as at least one test for enabled with false testing that keywords from the vocabulary are treated as unrecognized (either ignored or collected as annotations, and right now there's no difference as far as the test suite is concerned). This will scale better going forward, and make it clear that supporting vocabularies is different from other implementation-defined or ambiguous areas. Since vocabularies are a granularity of support, it makes sense to give them directories rather than grouping files by naming convention.

@gregsdennis (Member) left a review comment:

Overall this looks fine to me. Just the one comment, and I'll let you and @handrews work out the folder structure.

I wonder if there's a way we can notify the implementation community that all of their builds are about to break. I'll post a comment in Slack, but I'm not sure what else there is to do.

Inline review on the diff:

    "schema": {
      "$id": "https://schema/using/format-assertion/false",
Member:

Is removing the $id going to break things? (cc: @karenetheridge) (seen in several places)

@Julian (Member, Author):

I am pretty sure no, it's just an extra unneeded validator that was in these schemas.

@Julian (Member, Author) commented Aug 24, 2022

Unlike definitions, there is nothing at all (that I can find- let me know if I missed it) in the core or validation spec about continuing to support dependencies.

Probably I simply confused it with definitions, as I've done before. If it's not there, yes, agreed it should go in additional.

The environmental tests would be better off in an env-specific or similarly named directory. That would immediately convey what sort of tests they are, and why an implementation might choose to run them or skip them.

It's not yet clear to me that the extra complexity (of extra folders) is worthwhile; I'm open to the idea in theory if it turns out to be, after hashing out the rest of the comments, but otherwise, just because the reason for each file is different doesn't mean it needs to be in a different place. The interpretation of the folder for implementors should be "you need to look at each of these files and decide whether to run each one for your implementation" -- whether the reason is your programming language or additional functionality you added beyond the spec is a semantic difference, but not a hugely material practical one for that decision.
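The per-file decision described above can be made explicit in a consuming implementation's own bookkeeping. The file names below are ones discussed in this thread; the opt-in table, its True/False choices, and the helper are purely a hypothetical sketch for a Python-like implementation, not anything the suite prescribes.

```python
# Hypothetical per-implementation record of which files under additional/
# apply. Example choices for a Python-like implementation: bignum integers
# exist, float overflow behaves sanely, but the regex engine lacks full
# non-BMP support.
ADDITIONAL_OPT_IN = {
    "bignum.json": True,
    "float-overflow.json": True,
    "non-bmp-regex.json": False,
}

def should_run(filename, opt_in=ADDITIONAL_OPT_IN):
    # Files nobody has reviewed yet default to skipped, so a new file
    # appearing in additional/ forces an explicit downstream decision.
    return opt_in.get(filename, False)
```

Whether a file is skipped because of the host language or because of functionality beyond the spec, the mechanism is the same: one reviewed yes/no per file.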

I do not think that additional is a correct description of these. The behavior being tested is accounted for within the specification, so there is a well-defined relationship to specification requirements.

"Additional" means "additional tests beyond those in the top folder which everyone should run". Not additional to the spec and implying the spec doesn't mention anything about them.

For format and content.json draft-07 and earlier, since the assertion behavior is assumed to be on by default

They're not specified to be on by default; the specs say, e.g. in draft 7:

Implementations MAY support the "format" keyword as a validation assertion. Should they choose to do so [...]:

The language you quoted afterwards (about SHOULD provide an implementation-specific best effort validation for each format attribute defined below) is more or less the same language that's always been there; it just didn't mention assertions in the lead-in before draft 6.

They should go in the same location. If the concern is that we don't want to explicitly test the alternative behavior (which makes those tests similar to no-schema.json and refOfUnknownKeyword.json, I think that shows that splitting topics based on whether we can/want to test all alternatives or not is not ideal.

The reason to put them in different locations is that content is boolean, on-or-off, and if it's on, you run all the tests. Format is not, as you pointed out yourself: an author who supports some formats still gets to identify which subset to run. We aren't going to test something like that in alternatives; there's an absolutely monstrous number of combinations, as every test can be on or off individually depending on the specific way someone implemented the format.
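The combinatorial point can be made concrete with a rough count. The numbers below are illustrative, not the suite's actual format or test counts: with n individually skippable tests, a "covering" alternatives layout would need one file per subset.

```python
# Illustrative only: suppose 7 format attributes averaging 10 tests each.
formats = 7
tests_per_format = 10
total_tests = formats * tests_per_format

# Each test can independently be run or skipped, so covering every legal
# partial implementation needs one alternatives file per subset of tests.
covering_files = 2 ** total_tests  # 2**70, on the order of 1.18e21 files
```

Even these conservative toy numbers land far beyond anything a directory of files can enumerate, which is the argument for letting implementations pick tests out of per-format files instead.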

It would be better to have a vocabularies directory, with a sub-directory per vocabulary containing tests for that vocabulary when enabled with true

I considered this, and am potentially open to it in the future for additional vocabularies we may have or test, but again format is special for reasons you're aware of and mentioned -- even with it on, that doesn't mean you run all the tests. It's its own beast.

As for some other specifics:

cross-draft.json

The spec doesn't today say or encourage implementations to support multiple drafts in the first place, so while the ref behavior is well defined, whether an implementation supports the second draft or not is purely for the implementer to know.

no-schema.json

You're referring I assume to some not-yet-here file about schemas without $schema? Such a thing seems like it belongs in additional as yeah the spec doesn't offer any recommendation about whether supporting it is recommended or not. Sounds like you're also saying that?

@handrews (Contributor):

You're referring I assume to some not-yet-here file about schemas without $schema?

Oh, whoops, that was a local file, sorry about that, please ignore.

"Additional" means "additional tests beyond those in the top folder which everyone should run". Not additional to the spec and implying the spec doesn't mention anything about them.

Since the whole point of this is to have directory names that are more meaningful and consistently understood, I think it's relevant that the word you have chosen isn't intuitive enough in its meaning to be clear. This meaning of "additional" is just a renamed "optional" and in my opinion not an improvement.

They [format assertions] are not specified to be on by default

The full quote (unchanged from draft-04 to draft-07) is:

 Implementations MAY support the "format" keyword as a validation
 assertion.  Should they choose to do so:

     they SHOULD implement validation for attributes defined below;

     they SHOULD offer an option to disable validation for this keyword.

You don't add an option to disable something unless it's on by default. We specifically inverted that between draft-07 and 2019-09:

  When the vocabulary is declared with a value of false, an implementation:

       MUST NOT evaluate "format" as an assertion unless it is explicitly
       configured to do so;

(The default meta-schema for 2019-09 declares the vocabulary with false, so the default configuration of the regular meta-schema requires a configuration setting to enable it.)

The language you quoted afterwards (about SHOULD provide an implementation-specific best effort validation for each format attribute defined below;) is the same language that's always been there more or less, it just didn't use to mention assertions in the lead-in pre-draft 6.

I don't see that sort of language pre-2019-09 at all. The only similar language is they SHOULD implement validation for attributes defined below;, and the "below" sections all mandate conformance to specific RFCs. There's no allowance for partial validation.

I added that language specifically to align the specification with the reality of implementations such as your oft-stated bit about only validating "format": "email" by looking for an @ sign. Some people objected to the weakening of the requirement (leading to debates about how to properly validate email, including people linking to various regexes alleged to do so correctly, which I remember clearly because those regexes were absurd even if they were right).

The reason to put them in different locations is that content is boolean on-or-off, and if it's on, you run all the tests.
Format is not, as you pointed out yourself, where an author who supports some formats still gets to identify which subset to run. We aren't going to test something like that in alternatives, there's an absolutely monstrous number of combinations, as every test can be on or off individually depending on the specific way someone implemented the format.

That's not what I was suggesting, I'm well aware of the combinatorics involved.

Either way (content or format), you're picking specific files out of a directory to run. It's not like the alternatives directory is organized into a tree based on configuration. And in neither case do you run a partial file, since the formats are each given their own file.

Configuration options are something we have moved away from - the only remaining one mentioned by the spec AFAIK is for format, so a directory designed around config options will have even fewer relevant tests than a directory designed around environmental support. On the other hand, an allowed (or similar) directory could accommodate purely-configuration tests (content in 2019-09), configuration+non-configuration tests (format as specified by the standard meta-schemas for all drafts), and non-configuration tests (unknown keyword reference handling).

I considered this, and am potentially open to it in the future for additional vocabularies we may have or test, but again format is special for reasons you're aware of and mentioned -- even with it on, that doesn't mean you run all the tests. It's its own beast.

This would only be for the format-assertion vocabulary in 2020-12, which is the only time we have a non-mandatory vocabulary anywhere. In 2019-09 it works differently - I might see that as a vocabulary directory, but it's too late in the evening for me to try to unravel 2019-09's format again (I have to re-read the spec every time; seriously, what was I thinking?). This also only really makes sense given the idea that the assertion behavior of the format-annotation vocabulary is more or less untestable (because it's different language than draft-07 and earlier).

But this is definitely the least important concern. format is weird. I think it could be treated as less weird in 2020-12 by leveraging the fact that there are two vocabularies, but if there's anything that I'd actually like to see it would be just not stating where tests for extra vocabularies go if we don't have any that are going there other than format. We can decide that when we have something to put wherever it needs to go.

As for some other specifics:

cross-draft.json

The spec doesn't today say or encourage implementations to support multiple drafts in the first place, so while the ref behavior is well defined, whether an implementation supports the second draft or not is purely for the implementer to know.

I don't understand what this is responding to. The spec in 2019-09 and later doesn't specifically encourage it, but also doesn't disallow it, and talks specifically about controlling implementation behavior by recognizing $schema. The specific behavior being indicated is governed by the relevant drafts. So choosing to implement other drafts is within the allowance of the spec, as opposed to being not mentioned or implied at all (dependencies, and yeah I can see why you put that in MAY if you thought it was definitions), which is why I put it where I did.

It's definitions where there's language in the spec, dependencies
just appears in the metaschemas still.
@Julian (Member, Author) commented Aug 25, 2022

I disagree with a number of things you said, but I'm honestly finding it hard to tell which are relevant to this PR (and therefore what changes you're suggesting at this point) and which you've come around on.

Let's try perhaps focusing on one at a time. Can you share a concrete change you're suggesting is incorrect in the current layout (a file which is in a place different than you think it should be)? The dependencies file is now in the correct place. Any others?

@handrews (Contributor) commented Aug 25, 2022

I believe it is all relevant to this PR, but yes, one at a time sounds good. Let's start with format from draft-04, -06, and -07 and whether it does or does not belong with content from draft-07 (ignoring format from 2019-09 and later entirely). I don't have an outcome that I specifically want with this, but I do want to understand your reading of the spec and how it maps to this new layout. So this should be enjoyably low-stakes, with apologies for the inevitably argumentative feel that quoting specs at each other involves.

You stated that format (draft-04, -06, -07) is different from content (draft-07):

The reason to put them in different locations is that content is boolean on-or-off, and if it's on, you run all the tests. Format is not, as you pointed out yourself, where an author who supports some formats still gets to identify which subset to run.

with part of the reasoning being that, unlike content (draft-07), which is on by default, format is off by default. You quoted part of the spec to support the "off-by-default" reading. I quoted more of that part of the spec, specifically they SHOULD offer an option to disable validation for this keyword., and noted that you don't specify a configuration option for turning something off unless it is on by default.

  • do you still disagree with my reading of the spec that it is on by default if it is supported at all?
  • if you now agree, does that cause the format tests for those drafts to move over to alternatives alongside content?
  • if you agreed with the reading but do not think format should move over for 4, 6, and 7, is that just a "format is weird, don't bother even though it would naturally move over" thing or am I misunderstanding what is similar or different?

You mentioned how format allows for support of some but not all formats, but each format is in its own test. So a content (7) test and each format (4, 6, 7) test seem to me to be controlled by:

  1. Is the feature supported at all? Both content and format have "an implementation MAY support <keyword(s)> as a validation assertion" language in draft-07, and format has "Implementations MAY support the "format" keyword" in draft-06 and -04.
  2. Is the implementation in the default configuration (assertion support for the keyword(s) in question on) or has it been configured off?
  3. Does the implementation support the specific value for the keyword?
    • For contentMediaType, does it support application/json (which is likely, but technically someone could decide to only support XML or something)?
    • For format, does it support the format in question? Each of which is in its own test file and therefore can be run or not separately

So it seems to me that the decision process for content* and each individual value for format is the same. Why do they end up in different places? Again, that's a question. I'm not demanding that they end up in the same place, I'm explaining why that's what I would expect from reading your description of the layout, and trying to understand how you arrived at a different result.

@Julian (Member, Author) commented Aug 26, 2022

So this should be enjoyably low-stakes, with apologies for the inevitably argumentative feel that quoting specs at each other involves.

Yep, sounds good.

do you still disagree with my reading of the spec that it is on by default if it is supported at all?

Yes. What you're quoting says that if an implementation has it on by default, then it should offer an option to turn that off. But there's no need to deduce anything from that line; the spec explicitly says beforehand, in the line I quoted, that an implementation MAY turn format on as an assertion. Not SHOULD or MUST. If I had to explain that to someone, I'd say it means it's off by default, with some implementations deciding to turn it on. But this is one of the things I was referring to as not being sure is relevant to this PR; I'm not sure it has any bearing either way, it's just semantics, as yes, the point of those tests is for an implementation which is deciding to configure itself for format assertion behavior and run some subset of them.

if you now agree, does that cause the format tests for those drafts to move over to alternatives alongside content?

The only way format could go in alternatives would be if we were able to offer a fully "covering" alternative set of files. But that is impossible. There are n formats and each has some number of tests. We would need a file for each subset of those combined tests in the alternatives directory.

You mentioned how format allows for support of some but not all formats, but each format is in its own test.

Just to be sure -- in this repo for whatever reason (mostly because we were inconsistent on terminology previously, and Shawn wrote a README with these definitions which were as good as any others) we settled on "test" meaning something specific, and it's not the whole file. The definitions are here. So test files have test cases which have tests.

So someone can choose to run the date-time format's test file but, knowing they don't support leap seconds, decide not to run the tests related to leap seconds. Doing so is allowed by the specifications, and would again require a huge number of combinations to offer in alternatives. I'll elaborate, since I can't tell whether you were disagreeing with this; you seemed to say something like this changed in 2019, but not to my understanding: that was always the case. The spec always said that one SHOULD implement the following formats, but that was always understood to mean one could partially implement a format; if you did, you were basically implementing a date-time format but not precisely the one from the spec, so disregarding the SHOULD, but still implementing a thing called that. The @-for-email thing you quoted was never against the spec, it was simply disregarding the SHOULD (and in fact I recall discussing it with the spec authors way back when, not because I thought it was against the spec, I knew it wasn't; they'd already told me people did that with the old CSS formats).
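The terminology above (test files contain test cases, which contain tests) and the leap-second filtering can be sketched together. The miniature file below follows the suite's published shape (an array of cases, each with a description, schema, and tests); the filtering helper and the specific date-time strings are illustrative only.

```python
import json

# A miniature test file in the suite's shape: an array of test cases,
# each with a schema and a list of individual tests.
TEST_FILE = json.loads("""
[
  {
    "description": "validation of date-time strings",
    "schema": {"format": "date-time"},
    "tests": [
      {"description": "a valid date-time string",
       "data": "1963-06-19T08:30:06Z", "valid": true},
      {"description": "a valid date-time with a leap second",
       "data": "1998-12-31T23:59:60Z", "valid": true}
    ]
  }
]
""")

def iter_tests(test_file, skip=lambda test: False):
    """Yield (schema, data, valid) for each test, honoring a skip predicate."""
    for case in test_file:
        for test in case["tests"]:
            if not skip(test):
                yield case["schema"], test["data"], test["valid"]

# An implementation without leap-second support can skip just those tests
# while still running the rest of the date-time file:
kept = list(iter_tests(TEST_FILE,
                       skip=lambda t: "leap second" in t["description"]))
```

Filtering happens at the innermost level (individual tests), which is exactly why per-test granularity can't be expressed as a fixed set of alternative files.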

On the other hand, as you say:

For contentMediaType, does it support application/json (which is likely, but technically someone could decide to only support XML or something)?

I agree! So it seems we may indeed agree in practice, but I just think this is quite unlikely, and of much smaller cardinality (I don't expect any implementation which implements content actually doesn't support both JSON and base64, which is what we have there), so the extra benefit of having a full alternatives file seemed more useful. In other words, yes, if an implementation supported only a subset of what's in that file, they'd be in the same situation as format; but we only have JSON and base64 in that file, and it seems very unlikely that someone supports content as an assertion and not those, therefore I put it there. Purely based on the spec, though, I agree, and content could go in additional with a whole bunch of subfiles. Does that match your understanding now, too?

@handrews (Contributor):

Thanks for the detailed reply, @Julian.

Regarding format, I don't find your reasoning any more convincing than you find mine. But I picked it to discuss because it doesn't much matter to me if format's position is "weird" from my perspective as long as everything else makes sense. I don't see any upside in debating it further here.

Regarding what I classified as environment-specific tests, you have explained your position and indicated a possible openness to future change, so we're good on that point.

Regarding everything else about additional and alternatives, I continue to disagree with how you are defining and using them, but based on your comments so far I doubt I will be able to come up with any arguments that you find convincing, so I'll just state that I'm not attempting to hold up the PR and leave it at that.

@Julian (Member, Author) commented Aug 29, 2022

@jdesrosiers (who I think is OOO) / @karenetheridge / @santhosh-tekuri / any other implementers/watchers, do you follow the new directories and/or have feedback about where files belong based on your readings of the spec?

@jdesrosiers (Member):

do you follow the new directories and/or have feedback about where files belong based on your readings of the spec?

I don't run the optional tests, so I don't have any opinions rooted in experience. Honestly, I haven't been following the conversation closely enough, so the following might be irrelevant.

IMO, format is a unique case and should have its own directory. The rest I'm not really sure need to be in the test suite at all, so I don't have a strong opinion on how they are organized. Those tests were in "optional" because we can't expect all implementations to behave the same, for some reason or another. The test suite is about achieving standard behavior across implementations. The "optional" tests (with the exception of the "format" tests) by definition can't achieve that goal. Personally, I'd drop those tests, but I'm content just ignoring them if others feel they have more value than I do.

@Julian (Member, Author) commented Sep 8, 2022

The rest, I'm not really sure needs to be in the test suite at all, so I don't have a strong opinion on how they are organized. Those tests were in "optional" because we can't expect all implementations to behave the same for some reason or another. The test suite is about achieving standard behavior across implementations

The test suite is really about reducing test burden for downstream implementations as well, or at least that was the original vision. The spec says implementations should use ECMA regexes; it seems hard to justify not having tests for implementations that use them and don't want to replicate their own tests, rather than sharing with all the other implementations which do. But if you have no opinion, I'll perhaps leave that "discussion" for elsewhere.

Any other feedback from others?

4 participants