Aligning views on stability guarantees, their implications, and community perception #295

gregsdennis · 2023-01-17T21:30:58Z

gregsdennis
Jan 17, 2023
Maintainer

DECISION: After getting feedback from the community (thanks to everyone!), we have decided to move forward with option 1, which is including minimal breaking changes that allow us to promise that there will be none in subsequent versions, with the caveat that we provide as easy an upgrade path as possible, including instruction and possibly tooling.

Further discussion about dropping support for unknown keywords will continue in this discussion, which already contains several proposed solutions.

Recently I asked the core members to provide their thoughts on the stability status of the features we have in the spec. The collated results focused on keywords and revealed two primary opinions:

The specifications that have been published to date provide no stability guarantees, but we will be adding explicit guarantees with the next publication. Thus, this is our last chance to "fix something by breaking it." That is, if we see a problem with one of our keywords, and the only way to fix it is by changing it in a way that breaks some existing behaviors, now is the only opportunity that we will have to do so. This is only possible because we're just now making those stability promises.
The specifications that have been published to date provide no stability guarantees, but we will be adding explicit guarantees with the next publication. However, it seems disingenuous to promise "no breaking changes" while including breaking changes. It would be better to apply the stability guarantees starting with the most recent publication (draft 2020-12) so that the next publication contains no breaking changes. Only this stance fosters trust between the spec authors and its users.

Until this conversation, I was firmly of opinion 1. However, now I'm on the fence. This discussion aims to resolve this difference.

Not in question:

The next publication will include explicit stability guarantees.
These guarantees will include both backward- and forward- compatibility promises ("how far backward" is up for debate, i.e. is 2020-12 included?)

Julian · 2023-01-17T21:46:02Z

Julian
Jan 17, 2023
Maintainer

As a (the?) proponent of 2, I'm going to slightly rephrase the argument, though I don't terribly disagree with it as-is. But if I were to strengthen it without elaborating too much beyond a simple initial paragraph:

The specifications that have been published to date provide no explicitly enumerated stability guarantees, but we will be adding explicit guarantees with the next publication.
However, it seems disingenuous to promise "no breaking changes" while including breaking changes, and more so, users of released software (or specifications) tend to expect stability regardless of whether it is explicitly guaranteed, and will often forego upgrading to newer versions (of the software or specification) if they encounter instability or behavioral changes, big or small. JSON Schema as a community recognizes a large portion of its userbase (users and implementers) behaving in this way, so it stands to reason that a possible cause might be this instability between prior releases. It would be better then to apply the stability guarantees immediately -- starting with the most recent publication (draft 2020-12) regardless of the lack of explicit noted guarantees, and so that the next publication contains no breaking changes. Only this stance fosters trust between the spec authors and its users, gives additional runway for users to catch up to the existing document, and gives similar runway to implementers.

8 replies

jdesrosiers Jan 18, 2023
Maintainer

The OpenAPI 3.1 situation is a little more nuanced that. Although the default JSON Schema dialect is compatible with 2020-12, OpenAPI 3.1 allows users to use whatever dialect they choose. So, 3.1 users aren't stuck with 2020-12 the same way 3.0 users are stuck with draft-04(-ish). However, the argument is still valid because most people will use the default unless they have reason not to and some tooling may not support other dialects anyway. My hope is that a stable JSON Schema version is ready and made the default for OpenAPI 4. 3.1 has very low adoption right now, so hopefully most people end up skipping 3.1 and going straight for 4.

handrews Jan 18, 2023

OpenAPI 3.0 had a very low adoption rate when we were doing 3.1. Now 3.0 is reasonably broadly adopted, and 3.1 is just getting started. I expect that dynamic to repeat with 3.1 and 4.0. It took a very long time for 3.0 to pick up steam compared to 2.0, and 4.0 seems likely to be an even more dramatic change.

mwadams Jan 19, 2023
Collaborator

wrt OpenAPI 3.1; I agree that the extensibility nuance gives us pause; but for me the real question is "what do tool/library vendors do when faced with different JSON Schema dialects in OpenApi?" Most, I would guess [research required!], support whatever "the default" might be - and, with luck, whatever the defaults "used to be" in a backwards compatibility kind of way (the first thing I can think of is Kiota, for example). It is not that they can't supprot other dialects, but "bring your own" is necessarily an exercise for the reader.

karenetheridge Feb 5, 2023
Maintainer

However, it seems disingenuous to promise "no breaking changes" while including breaking changes.

I can't agree with this. It's not a contradiction to say "we're making some breaking changes between 2020-12 and this version" and "from here on out, no more breaking changes except in places where we explicitly say something is unstable".

That said, I'm not sure if we can/should promise to make no more breaking changes after draft-next, but we can certainly state that we're aware of the burden involved in doing so and therefore will keep it to a minimum, and only when clearly necessary and beneficial.

gregsdennis Feb 5, 2023
Maintainer Author

Thanks, @karenetheridge, yes we need to "officially declare" our definition somewhere. @jdesrosiers has made a couple efforts toward that, and I think we could use with some sort of "policy" document (in addition to the forthcoming charter) that declares our intentions for the specification.

jdesrosiers · 2023-01-18T20:25:24Z

jdesrosiers
Jan 18, 2023
Maintainer

Using semver semantics, the way I think about it is that we've been operating as releasing major-versions for each draft and patch-versions for clarification/bugfix updates. So, draft-07 is essentially 7.0.0 and the draft-07 update was 7.0.1. What we're talking about here is shifting from releasing major-versions to releasing minor-versions. So 2023 would be something like 10.0.0, 2024 would be 10.1.0, and clarification/bugfix updates to 2024 would be 10.1.1.

So, for option 1, it's like making one last major version release and option 2 is like starting minor-version updates from the last release. The question is, do we want to release 9.1.0 or 10.0.0. Personally, I think it makes perfect sense to go with a new major-version as we start a new way of operating, but mostly I just don't want to start off with the baggage of things we never would have included yet if we knew it would end up getting set in stone. There are known problematic issues. The biggest problem being collecting unknown keywords being incompatible with the forward compatibility guarantees we want to provide. That has to change, which means we need a new major-version anyway (option 1). So, the question isn't "do we change things", but rather, "how much should we change things".

My position is that we change anything that needs to be changed. Anything with known issues should be changed, or marked as experimental, or removed. That should be a relatively small set of changes.

22 replies

gregsdennis Jan 31, 2023
Maintainer Author

this is the one that seems the simplest to me because it does not require the use of vocabularies

Vocabularies are here to stay. This is the mechanism by which JSON Schema can be extended. A lot of work has gone into the feature by multiple people over the past two specifications, and while we're not quite done with it, it's definitely the direction we want to go.

SorinGFS Jan 31, 2023

@gregsdennis

Vocabularies are here to stay. This is the mechanism by which JSON Schema can be extended. A lot of work has gone into the feature by multiple people over the past two specifications, and while we're not quite done with it, it's definitely the direction we want to go.

That was my fear from the very beggining in my quest to decide what I should do next. I'm not a big fan of vocabularies to be honest, but don't understand that I don't appreciate your work. For my small projects, I hope not to be forced to use the vocabularies. If they existed somewhere in the background without me knowing that they exist, they wouldn't be a reason to bother me.

gregsdennis Feb 1, 2023
Maintainer Author

For my small projects, I hope not to be forced to use the vocabularies.

In practice you should only need to "use" vocabularies if you're defining custom keywords. If you're just writing schemas using what's defined in the spec, then vocabs don't impact you at all, really.

SorinGFS Feb 1, 2023

@gregsdennis

That's nice.

I see the potential of vocabularies, basically they transform a bidimensional schema into a multidimensional schema. Having backword compatibility guaranteed, through vocabularies you basically bring the forward compatibility because any added vocabulary will be basically a new version. The versioning system will basically become useless this way. The scary part is the merging system, with schemas having vocabularies and referencing other schemas which also may have vocabularies that may colide for some keywords, I didn't quite understand that part, but as you say the work is still in progress...

westurner Feb 1, 2023

Haven't looked. Re: ~ external vocabularies with XML: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html :

This attack occurs when untrusted XML input containing a reference to an external entity is processed by a weakly configured XML parser.

This attack may lead to the disclosure of confidential data, denial of service, Server Side Request Forgery (SSRF), port scanning from the perspective of the machine where the parser is located, and other system impacts. The following guide provides concise information to prevent this vulnerability

(Edit) https://en.wikipedia.org/wiki/XML_external_entity_attack :

Similar attack vectors apply the usage of external DTDs, external style sheets, external schemas, etc. which, when included, allow similar external resource inclusion style attacks.

Relequestual · 2023-01-19T10:55:59Z

Relequestual
Jan 19, 2023
Maintainer

Not in question: The next publication will include explicit stability guarantees.

Do you mean "let's not discuss this here" or "this is not in question and therefore true"?
If the latter, I'm not sure it has reached consensus.
(I can't tell who upvoted my most recent comment on the Discussion you linked which suggested "proposed stability guarantees" as opposed to "locked-in stability guarantees" for an interium release.

5 replies

gregsdennis Jan 19, 2023
Maintainer Author

If the latter, I'm not sure it has reached consensus.

My understanding is that it has. I haven't seen anyone oppose the idea of guaranteeing stability (in some form).

Relequestual Jan 19, 2023
Maintainer

#282 (comment)

It may not have been totally clear, but I objected providing guarantees for the interim release, and would rather we make signals of what we THINK is extremely likely to be for the following release. I don't feel we are getting wide enough feedback, and this would allow that without compromising on releasing sooner.

If others feel strongly opposed, I'd like to hear why, but I'd bow if that's truly the consensus.

gregsdennis Jan 19, 2023
Maintainer Author

I feel that we have consensus that we want to provide stability guarantees, even at a high level what that means (no more breaking changes), but the crux of the disagreement is when to apply them. Do we promise what we have now won't break with future iterations, or do we publish once more and promise from that point forward?

It sounds to me like you're more in the "maybe break it one last time" camp, but you also want wider feedback from the community.

Relequestual Jan 20, 2023
Maintainer

I think we're making the most educated decisions possible in terms of what we believe to be stable or require further work. I feel the team here is eminently qualified! But, I think that when we disallow additional keywords (which I think we do with the more immediate release), it's going to force people to come out of the woodwork so to speak, and then share their opinions, and that might shed light on previously unseen considerations or inperfections.
Additionally, @benjagm is going to be finding and developing relationships with those individuals and orgs that will be most effected by these types of changes, and really identifying our stakeholders. If we were to set a timeframe for when we wanted to change "potential promises" into "real promises", then I'm confident @benjagm will be able to priorise stakeholder interactions to get the sort of feedback and data we need to be confident.

benjagm Jan 28, 2023
Maintainer

Thanks for the mention @Relequestual. I'll prioritize accordingly.

m50d · 2023-01-31T00:48:41Z

m50d
Jan 31, 2023

Are the stability guarantees expected to disallow the kind of changes that were made in previous draft releases, or merely to formalize the kind of guarantees that were already implicitly being followed? I.e. does 1) mean "the next publication will contain a comparable amount of breakage to 2020-12 and previous drafts, and we commit to doing less breakage in subsequent publications", or does it mean "the next publication will contain unusually large breaking changes compared to 2020-12 and previous drafts; subsequent publications will revert to the normal level of backward and forward compatibility, and augment this with an explicit formal definition of what that means"?

6 replies

m50d Jan 31, 2023

Ideally, the next publication will contain a minimal amount of breakage.

Well that's always the aim. If course 1 was taken and the next publication was 'our last chance to "fix something by breaking it."', would that (in the worst case) mean bigger breaking changes than previous drafts?

(I think @Julian 's point about the implicit stability gurantee is a good one; my view is that making the same kind of breaking changes as in previous draft publications does not break people's implicit expectation (or at least, should not break reasonable people's implicit expectation), but allowing bigger-than-normal breaking changes to "get in before the freeze" would undermine the point of making a stability guarantee in the next publication somewhat)

We are trying to be able to commit to no breaking changes in subsequent publications.

Is this a substantial change compared to what 2020-12 and previous drafts did? Just trying to get a sense of how much this is a change versus a codification of existing practice (e.g. committing to never introducing new keywords feels substantial, but in practice how often were new keywords actually being introduced?)

gregsdennis Jan 31, 2023
Maintainer Author

Is this a substantial change compared to what 2020-12 and previous drafts did?

Currently, the only breaking change that we've identified that we would absolutely need to make is removing "unknown keyword" support, however we have several ideas that will make the transition easier.

my view is that making the same kind of breaking changes as in previous draft publications does not break people's implicit expectation (or at least, should not break reasonable people's implicit expectation)

This point of view is a good one to have. Thank you for articulating it.

Julian Jan 31, 2023
Maintainer

(I think @Julian 's point about the implicit stability gurantee is a good one; my view is that making the same kind of breaking changes as in previous draft publications does not break people's implicit expectation (or at least, should not break reasonable people's implicit expectation)

Man, in not even one sentence I went from making a good point to not being a reasonable person :)

jdesrosiers Jan 31, 2023
Maintainer

If course 1 was taken and the next publication was 'our last chance to "fix something by breaking it."', would that (in the worst case) mean bigger breaking changes than previous drafts?

Nothing is likely to change drastically or without very good reason. We're not looking at this as a chance to fix all the little things that annoy us. There will be some necessary changes to unblock us from evolving the spec without introducing breaking changes. The only impactful change that I think is likely is the one @gregsdennis mentioned, disallowing unknown keywords. But, I don't think that those necessary changes are what this discussion is really about ...

In the past, we haven't been concerned about compatibility between releases and we've included things in the specification that are experimental. The expectation was that we would get feedback from implementation and usage and adjust (including breaking changes) as necessary in later releases. These include dynamic references, the vocabulary system, annotations, and output formats. What we are trying to decide now is really about what to do with those experiments. Should they be considered stable as of their state in 2020-12, or can we continue to treat them as experimental and allow breaking changes to those experimental features in future releases until they are considered stable.

I only expect breakage in these experimental features which should be very low impact.

gregsdennis Jan 31, 2023
Maintainer Author

What we are trying to decide now is really about what to do with those experiments.

No, this discussion isn't about any particular feature or what to do with experiments. It's about whether we can avoid breaking changes for the next release and promise no breaking changes moving forward.

Is it possible for us to promise nothing will break moving forward without including any breaking changes from 2020-12 now?

I'm not concerned about identifying features as experimental or whatever. That discussion and decision can come after this one is resolved.

peterkelly · 2023-01-31T00:52:07Z

peterkelly
Jan 31, 2023

Speaking as a user, please just take whatever you currently have right now and stick with it.

The value of JSON Schema is the ability to use tools and libraries that have been designed around it. Every time a new version with breaking changes is released, this causes significant inconvenience for people who have built production systems around the spec and third-party libraries.

At this point there are no further changes that could be made to the spec whose benefits would outweigh the costs to those who are already using the spec and the ecosystem of software built around it. Declaring JSON Schema to be "complete" and promising not to make any future backwards-incompatible changes is the single most valuable thing that could be done right now.

3 replies

gregsdennis Jan 31, 2023
Maintainer Author

We hear you. This sentiment aligns with @Julian's comment.

However, we need to be cognizant of the impact current features have on our ability to promise compatibility between future releases (my comment).

Relequestual Jan 31, 2023
Maintainer

We hear you loud and clear on this!
We have not tried to do this previously, and currently there are some things which are known "broken", and fixing them is technically breaking.
I think most people agree, we want to fix the problem that tooling developers face, and be in a situation where we can make those promises. However, I'm not convinced we are quite there yet. This is still useful signal to have though, so thanks.

gregsdennis Jan 31, 2023
Maintainer Author

Declaring JSON Schema to be "complete" and promising not to make any future backwards-incompatible changes is the single most valuable thing that could be done right now.

@peterkelly it may be looking like we will be unable to make that promise without breaking some (one or very few) things now. I know it's not ideal, but it may be something we're forced into.

How do you feel understanding that if we include these changes now we will be able to make that promise with subsequent releases? What if we define a migration path and provide tooling to assist?

morphar · 2023-01-31T04:02:43Z

morphar
Jan 31, 2023

Thank you for asking this question.
I use JSON Schema daily and have had to write parsers and validators a couple of times.

As I see it, any breaking changes at this point, will offset adoption of any new standards years and years into the future.
Having to write parsers / validators for Draft 4-7 (yes, Draft 4 is still being used) is annoying as it is.
Adding later versions is a bit easier, since it's more of an addition to the code.

If you make breaking changes now, I would reconsider how I write parsers and validators - should it be one code base or multiple code bases?

One major game-changer, would be to have official schema parsers and validators, in enough languages to cover the majority of popular programming languages.
That is: a parser / validator in C, would be useful in Rust, Python, Go (though native would help adoption), etc.
Given how frontends are written today, a TypeScript version would probably cover use cases in TS / JS land - though again: rather JS as well.
I would assume a Java version would be needed as well.

Performance should of course be considered, but they wouldn't need to be the absolutely most performant libraries out there.
Being there and being correct would be what matters.
More performant libraries would pop up and would benefit from whatever test suites the "official" libraries would use.

Personal note: being forgiven if needed would probably be necessary.
E.g.: in practice: what does a JSON field set to null mean?
If schema says it's a string only, setting it to null is an error.
In many real-world scenarios, it could also mean "deleted" though.

I think that looking at how the Protocol Buffers and Docker Compose specifications works is a good place to start.
They both intoduced breaking changes between version 2 and 3, which was sometimes annoying, but at least you have 1 authoritative interpreter.
If your document works with that, it's correct.

Besides that, having clear semver adherence makes you trust that things "just works".
Go is another example to follow: never break backwards compatibility - rather add new functionality.

I am certain that you could get a lot of contributors to such a project.
I think that the Bytecode Alliance is a great example of how this could play out.
They help ensure a stable way of integrating WebAssembly in many languages.

Sorry for the rant, I guess the conclusion is, that I hope you go with one of 2 options:

Don't make more breaking changes
Create authoritative code bases, adopt semver and break things if needed, between major versions

I personally hope you go with the last option, as I think this would help the adoption of JSON Schema in geneal as well.

1 reply

Relequestual Jan 31, 2023
Maintainer

Part of the challenge around this is finances. Protobuff has huge resources. JSON Schema has more than it ever has, but it's minuscule by compariosn. And, there has been question over if we would even want to have "official" implementations.

What we ARE trying to do, is make finding and determining compliance of implementations easier: https://bowtie-json-schema.github.io/bowtie/
Further, it's easier than ever to upgrade schemas to newer versions: https://github.com/sourcemeta/alterschema

While neither of those two projects are "official" as of right now, I hope they will be.

We really really appreciate you sharing your thoughts. It's a huge challenge to get this level of feedback usually. Please, stay, continue to provide feedback and insight where you can add value =]

awwright · 2023-01-31T04:32:23Z

awwright
Jan 31, 2023
Maintainer

I'm not going to be able to write a reply to each of the posts for the next week at least, but I will quickly try to inspire an appreciation for how nuanced this problem is.

The purpose of proposing a media type through the "Internet-Draft" system is so that implementations can be written, experimented with, and used to produce iterations of the specifications. Although it is common to formalize protocols that are de-facto standards in production, it's not necessarily the case that every protocol (or format/media type) is safe to use in production.

While a feature being used in production "in the wild" generally shouldn't be changed, the only guarantee it will remain stable is the publication of a media type that defines it to work that way.

In the absence of a published media type, a specification can include all the explicit stability guarantees it wants... but if a subsequent publication can change these, then these guarantees aren't meaningful. You have to have a published media type definition.

A published media type can still evolve, though. Evolution is made possible by incorporating "interoperability requirements" — in the form of BCP 14 language that says what implementations "SHOULD" or "MUST" do. When specifications are published that describe de-facto standards in the wild, it's usually made possible by having sufficient interoperability requirements from the start, so the specification can still evolve in the draft process while still being adopted in production.

There's many mechanisms to enable this—CSS has vendor prefixes, HTTP/2 had ALPS identifiers, and HTML and ECMAScript are each powerful enough you can support all sorts of "graceful degradation" techniques. JSON Schema does not yet define requirements like this.

So, the current goal for the specifications is to add interoperability requirements that permit evolution without disrupting older implementations. The essence of this is to prohibit unknown keywords, since unknown keywords may represent an intent by the schema author to reject some set of JSON documents from validation.

It is correct that the addition of this requirement could break some implementations, but (1) it doesn't have to be this way—not all applications actually rely on this guarantee, so it doesn't have to be a requirement a such. What can happen instead is applications that require "strict" handling can specify a different default for unknown properties—instead of unknown keywords accepting by default, they can reject by default, or produce an error by default ("indeterminate result"). I wrote a whitepaper describing this technique, and I encourage everyone here to read it and incorporate the ideas there into your comments.

And (2) when a schema author adds arbitrary keywords, this implies an assumption that the validator they are using will not change. But if you are expecting to upgrade your validator and writing custom keyword names, then unfortunately you've already committed yourself to breakage at some time in the future, and there's nothing that can be done about that—what if we define a new keyword that matches what you've written? We can minimize breakage, but schemas that use nonstandard keywords are flawed from the start and there's no changes to the spec that can be made that can fix that.

Now, what do those interoperability requirements look like?

Here's the top priorities, and then I have several more edits to circulate after these... but first things come first:

I would appreciate comments on these issues.

4 replies

awwright Jan 31, 2023
Maintainer

And to speak to your point @peterkelly, how JSON Schema got to this point is people took the draft and implemented it as a stand-alone library that wasn't intended to evolve. This means that stand-alone applications begun using JSON Schema, but Web/hypermedia/Internet applications that rely on a media type definition have not come unto widespread use yet (though some are in limited aspects, e.g. OpenAPI).

While it's understandable to want to lock compatibility for stand-alone uses, this doesn't satisfy the needs of the Internet community at large, as there's no way to safely exchange JSON Schema documents between validators (by safely, I mean, a schema author can be confident that two validators won't disagree on the validity of a given JSON document).

Relequestual Jan 31, 2023
Maintainer

I think this comment from the posting on Hacker News supports an idea that @awwright previously proposed, specifically allowing for additional keywords outside of those defined ONLY if they include a specific prefix.

SorinGFS Jan 31, 2023

@Relequestual
This would be better than nothing, but ugly one.... 😄
Personally I would prefer the oposite: instead of marking extra keywords like this:

{
    "type": "string",
    "@foo": "bar"
}

to mark keywords acording what they are, like this:

{
    "@type": "string",
    "foo": "bar"
}

or even better, to invent the term operator and to wrap required params into every specific operator, for example validation would become:

{
    "validate": {
        "type": "string",
        "format": "email"
    },
    "foo": "bar"
}

... which after validation process would become:

{
    "type": "string",
    "format": "email",
    "foo": "bar"
}

Btw, I would prefer the term attribute instead of keyword in terminology, because keyword is too generic. The term attribute would introduce a distinction between varius use cases of the generic term keyword.

gregsdennis Jan 31, 2023
Maintainer Author

The idea of a prefix for unknown keywords was presented in https://github.com/orgs/json-schema-org/discussions/241, and I'd prefer that discussion stayed there.

We need to focus on whether we are able to promise no more breaking changes without first breaking something.

Relequestual · 2023-01-31T14:03:17Z

Relequestual
Jan 31, 2023
Maintainer

This comment on HN (and some others) suggest that people are less concerned with breaking changes if there is tooling to easily upgrade their schemas. @jviotti created and @gregsdennis collaborated on AlterSchema, which does just that.

If we break things one last time, providing an upgrade path for schema authors will ease that pain.

4 replies

jviotti Jan 31, 2023
Collaborator

Alterschema has some edgy limitations (like not fixing up references to parts of the schema that do not exist anymore) but I'm working on a revamp that rethinks some of the approach to make it even more robust.

jviotti Jan 31, 2023
Collaborator

Also, while I agree that users of JSON Schema are fine as long as we provide an automated upgrade path, it's tooling makers that suffer the most. Updating non-trivial JSON Schema tooling to support another version of the specification is often not an easy ride.

gregsdennis Jan 31, 2023
Maintainer Author

Updating non-trivial JSON Schema tooling to support another version of the specification is often not an easy ride.

This strikes at the heart of the issue. By promising that there will be no more breaking changes, we make the process to support newer versions exponentially simpler.

The question for this discussion, then, is can we (is it possible to) make that promise without first including one last set of breaking changes?

Relequestual Feb 1, 2023
Maintainer

I think we're coming (I know I am) to the conclusion that we cannot. I think the only way we can is by making one last set of breaking changes.

westurner · 2023-02-01T12:47:37Z

westurner
Feb 1, 2023

On Tue, Jan 31, 2023, 3:32 PM Greg Dennis ***@***.***> wrote: Updating non-trivial JSON Schema tooling to support another version of the specification is often not an easy ride. This strikes at the heart of the issue. By promising that there will be no more breaking changes, we make the process to support newer versions exponentially simpler. The question for this discussion, then, is *can* we (is it possible to) make that promise without first including one last set of breaking changes?

No, that's a pipe dream. Specs tend to change over time. When you have breaking changes, there's a deprecation schedule for the unfortunately-namespace-versioned API, and you increment the MAJOR field of a.n. e.g. SemVer MAJOR.MINOR.PATCH so that people can tell *from the version number* that there could be breaking changes which requires fixing their implementation. For example, SPARQL 1.2 etc. are revised to support RDFstar and SPARQLstar years later; after having been thought frozen. w3c/rdf-star-wg#4 (comment) "Call for breaking changes before this funded major revision" would be more realistic IMHO Because e.g. https://github.com/lexiq-legal/pydantic_schemaorg isn't sufficient to describe all of the Preconditions and Postconditions, it's probably safe to say that developers will need unspecified custom keywords or that they need yet another schema document with the same data shapes.

2 replies

Relequestual Feb 1, 2023
Maintainer

Are you saying "No, there must be one more set of breaking changes", or "No, you can't make forever compatability promises"?

gregsdennis Feb 1, 2023
Maintainer Author

Having a deprecation schedule is one of the changes we're looking at making. But it doesn't make sense to do so outside of some sort of stability guarantee.

Yes, we want to never have a breaking change between releases again.

sirosen · 2023-02-01T16:47:55Z

sirosen
Feb 1, 2023

Something relevant but not covered here: How frequently are new drafts going to be published?

If I were a spec author, I would be concerned about any future-compatible promise which I am not, as an author, ever allowed to break. Especially because it can be hard to define what a compatible change is.

What about making the future-compatibility guarantee only ever one or two steps into the future?
Because JSON Schema drafts are relatively infrequent, I'm suggesting language akin to the following

The following features of JSON Schema are guaranteed to be future-compatible across all future spec versions:

"$schema" interpretation when "$schema" is a string

...

All other features of JSON Schema are guaranteed to be maintained in the next spec version. They may change in future specs.

2 replies

sirosen Feb 1, 2023

By the way, this suggestion is informed by my experience working/living with CPython, which is currently being very good, IMO, about having a clear deprecation schedule which is legible to developers. I always know when things are being removed from the stdlib at least two releases ahead of when it happens.

Of course, that's software and this is a spec. So there are major and important differences.

But one thing I've noticed is that if you promise me compatibility with tomorrow's version, it's still worth quite a lot.

gregsdennis Feb 1, 2023
Maintainer Author

If I were a spec author, I would be concerned about any future-compatible promise which I am not, as an author, ever allowed to break.

We, as spec authors, are aware of the seriousness of what we're trying to promise. Making this promise will force us to very carefully consider when a feature is considered stable. Please have a read through json-schema-org/json-schema-spec#1368 and json-schema-org/json-schema-spec#1348, which ultimately led to this discussion. We're definitely considering as much as possible.

There are many examples of technologies which progress without breaking changes, and we don't think that this is an unreachable goal.

karenetheridge · 2023-02-05T00:27:08Z

karenetheridge
Feb 5, 2023
Maintainer

Unless I missed it, the definition of "breaking change" hasn't been clearly defined. Is it a breaking change if $dynamicRef in the draft-next vocabulary works slightly differently than the one in draft-2020-12? After all, they are different vocabularies; no one is claiming that they are the same thing.

Is it a breaking change if draft-next's base metaschema includes a new vocabulary that defines a keyword that might overlap with a keyword that appears in someone's schema where they were assuming that it's purely annotative and otherwise ignored? After all, if they're requesting evaluation under "$schema": ".../draft-next", that's not the same as under "$schema": "..2020-12". Should they expect that when the $schema keyword changes there are some behavioural differences?

We're already making breaking changes in draft-next: for example, the introduction of propertyDependencies which someone could have been using as an annotation, but will now start producing errors on some content.

Extending that thought some more -- I don't think we can possibly consider the addition of new keywords as "breaking", even if/when we declare that unknown keywords are prohibited by default. The way to handle unknown keywords in a future draft would be to declare a vocabulary that defines them (with a schema of true, indicating that all values are valid, and which will satisfy the unevaluatedProperties: false in the base metaschema), but if we add that same keyword property to an existing vocabulary that's part of the default metaschema, there can still be errors produced by that vocabulary -- all evaluation is additive, and we cannot take away an error that something else has already produced.

Therefore - we need to be clear about what constitutes a "breaking change".

And after that, I think I am in favour of declaring that we'll (try to) avoid making such changes, but I want to know exactly what I'm agreeing to, first.

4 replies

gregsdennis Feb 5, 2023
Maintainer Author

Unless I missed it, the definition of "breaking change" hasn't been clearly defined. Is it a breaking change if $dynamicRef in the draft-next vocabulary works slightly differently than the one in draft-2020-12? After all, they are different vocabularies; no one is claiming that they are the same thing.

The complaints that we've received is that features don't work the same across JSON Schema releases. This means that users generally do consider them to be the same thing. (We have a deeper understanding of how JSON Schema works, so we see the technical nuance of them being different, but most users don't.) So for a keyword to change how it works from one version of the vocabulary to the next version of the vocabulary is indeed considered a breaking change.

What we're defining a breaking change to be is reasonable compatiblity between specification releases. If an instance validates against a schema under version X rules, then it should validate against that same (unchanged) schema under version X+1 rules. Historically, this hasn't been the case.

We do have some ideas for feature deprecation that we need to work out (elsewhere), and there are a few caveats around supporting unforeseen or edge-case scenarios, but on the whole we want this instance-validation stability across spec versions.

As an example of the "reasonable compatibility" mentioned above, $dynamicRef is losing its bookending requirement. Technically this is a breaking change since there exist schemas that would change behavior, but from questions we've received about the keyword, we can say that those schemas would change to the expected behavior, so we're considering it more of a bug fix.

We're already making breaking changes in draft-next: for example, the introduction of propertyDependencies which someone could have been using as an annotation, but will now start producing errors on some content.

Yes, this has been addressed by the "unknown keyword" arguments in the discussion above.

I don't think we can possibly consider the addition of new keywords as "breaking", even if/when we declare that unknown keywords are prohibited by default.

Yes, this is the idea. Conversely, if we don't prohibit unknown keywords, then we absolutely must consider the addition of new keywords as "breaking."

The discussion of how to prohibit them without causing too much chaos is happening here.

jdesrosiers Feb 6, 2023
Maintainer

Unless I missed it, the definition of "breaking change" hasn't been clearly defined.

The following was my attempt to define compatibility. (json-schema-org/json-schema-spec#1368)

Compatibility is defined with respect to the true/false validation result of a schema. If an instance is valid or invalid against a schema according to one release, all other releases including future releases MUST define the same validation result or define the result to be indeterminate. An indeterminate result is neither valid nor invalid.

Is it a breaking change if $dynamicRef in the draft-next vocabulary works slightly differently than the one in draft-2020-12? After all, they are different vocabularies; no one is claiming that they are the same thing.

Agreed. Technically, no changes have been breaking changes because every release is a distinct new version, not a change to the previous version. You can't have breaking changes if nothing ever changes. The idea is that going forward instead of releasing a whole new version with every release, releases are an update to the long-lived stable version. A keyword in the 2024 specification is the same keyword in the same vocabulary in the same dialect as it was in 2023. That's why any changes need to be compatible. The question being posed in this discussion is, do we start doing updates starting from 2020-12 or do we introduce one last new version (2023) and start doing updates from there.

Should they expect that when the $schema keyword changes there are some behavioural differences?

Yes, I think they should. I also don't think dialect URI should change with spec updates, only new versions. If changes are always compatible, it shouldn't be a problem to update schemas rather than release a whole new set with each release. I believe it's possible for us to never have another breaking change release after 2023, but if we absolutely have to break something, that's the only time I think the dialect URI should need to change.

jdesrosiers Feb 6, 2023
Maintainer

This means that users generally do consider them to be the same thing.

Actually, I think users do know that features aren't necessarily the same across versions. They know that upgrading from one version to the next is at best a chore and and at worst liable to break their code. Moving to releases that update in a compatible way rather than create a new potentially breaking version allows for a safe and painless way for users to make use of new features and stay up-to-date.

gregsdennis Feb 7, 2023
Maintainer Author

I think users do know that features aren't necessarily the same across versions.

I agree that they know they're not the same, and that's the source of the complaints. They expect them to be the same across versions, and they're not.

This comment was marked as off-topic.

Sign in to view

This comment was marked as off-topic.

Sign in to view

This comment was marked as off-topic.

Sign in to view

This comment was marked as off-topic.

Sign in to view

This comment has been hidden.

Sign in to view

This comment has been hidden.

Sign in to view

This comment was marked as off-topic.

Sign in to view

This comment has been hidden.

Sign in to view

This comment has been hidden.

Sign in to view

This comment was marked as off-topic.

Sign in to view

This comment has been hidden.

Sign in to view

This comment has been hidden.

Sign in to view

Aligning views on stability guarantees, their implications, and community perception #295

gregsdennis Jan 17, 2023 Maintainer

Replies: 15 comments · 65 replies

Julian Jan 17, 2023 Maintainer

jdesrosiers Jan 18, 2023 Maintainer

mwadams Jan 19, 2023 Collaborator

karenetheridge Feb 5, 2023 Maintainer

gregsdennis Feb 5, 2023 Maintainer Author

jdesrosiers Jan 18, 2023 Maintainer

gregsdennis Jan 31, 2023 Maintainer Author

gregsdennis Feb 1, 2023 Maintainer Author

Relequestual Jan 19, 2023 Maintainer

gregsdennis Jan 19, 2023 Maintainer Author

Relequestual Jan 19, 2023 Maintainer

gregsdennis Jan 19, 2023 Maintainer Author

Relequestual Jan 20, 2023 Maintainer

benjagm Jan 28, 2023 Maintainer

This comment was marked as off-topic.

This comment was marked as off-topic.

gregsdennis Jan 31, 2023 Maintainer Author

Julian Jan 31, 2023 Maintainer

jdesrosiers Jan 31, 2023 Maintainer

gregsdennis Jan 31, 2023 Maintainer Author

gregsdennis Jan 31, 2023 Maintainer Author

Relequestual Jan 31, 2023 Maintainer

gregsdennis Jan 31, 2023 Maintainer Author

Relequestual Jan 31, 2023 Maintainer

awwright Jan 31, 2023 Maintainer

awwright Jan 31, 2023 Maintainer

Relequestual Jan 31, 2023 Maintainer

gregsdennis Jan 31, 2023 Maintainer Author

This comment was marked as off-topic.

This comment has been hidden.

This comment was marked as off-topic.

This comment has been hidden.

This comment was marked as off-topic.

This comment has been hidden.

Relequestual Jan 31, 2023 Maintainer

jviotti Jan 31, 2023 Collaborator

jviotti Jan 31, 2023 Collaborator

gregsdennis Jan 31, 2023 Maintainer Author

Relequestual Feb 1, 2023 Maintainer

Relequestual Feb 1, 2023 Maintainer

gregsdennis Feb 1, 2023 Maintainer Author

gregsdennis Feb 1, 2023 Maintainer Author

karenetheridge Feb 5, 2023 Maintainer

gregsdennis Feb 5, 2023 Maintainer Author

jdesrosiers Feb 6, 2023 Maintainer

jdesrosiers Feb 6, 2023 Maintainer

gregsdennis Feb 7, 2023 Maintainer Author

gregsdennis
Jan 17, 2023
Maintainer

Replies: 15 comments 65 replies

Julian
Jan 17, 2023
Maintainer

jdesrosiers Jan 18, 2023
Maintainer

mwadams Jan 19, 2023
Collaborator

karenetheridge Feb 5, 2023
Maintainer

gregsdennis Feb 5, 2023
Maintainer Author

jdesrosiers
Jan 18, 2023
Maintainer

gregsdennis Jan 31, 2023
Maintainer Author

gregsdennis Feb 1, 2023
Maintainer Author

Relequestual
Jan 19, 2023
Maintainer

gregsdennis Jan 19, 2023
Maintainer Author

Relequestual Jan 19, 2023
Maintainer

gregsdennis Jan 19, 2023
Maintainer Author

Relequestual Jan 20, 2023
Maintainer

benjagm Jan 28, 2023
Maintainer

gregsdennis Jan 31, 2023
Maintainer Author

Julian Jan 31, 2023
Maintainer

jdesrosiers Jan 31, 2023
Maintainer

gregsdennis Jan 31, 2023
Maintainer Author

gregsdennis Jan 31, 2023
Maintainer Author

Relequestual Jan 31, 2023
Maintainer

gregsdennis Jan 31, 2023
Maintainer Author

Relequestual Jan 31, 2023
Maintainer

awwright
Jan 31, 2023
Maintainer

awwright Jan 31, 2023
Maintainer

Relequestual Jan 31, 2023
Maintainer

gregsdennis Jan 31, 2023
Maintainer Author

Relequestual
Jan 31, 2023
Maintainer

jviotti Jan 31, 2023
Collaborator

jviotti Jan 31, 2023
Collaborator

gregsdennis Jan 31, 2023
Maintainer Author

Relequestual Feb 1, 2023
Maintainer

Relequestual Feb 1, 2023
Maintainer

gregsdennis Feb 1, 2023
Maintainer Author

gregsdennis Feb 1, 2023
Maintainer Author

karenetheridge
Feb 5, 2023
Maintainer

gregsdennis Feb 5, 2023
Maintainer Author

jdesrosiers Feb 6, 2023
Maintainer

jdesrosiers Feb 6, 2023
Maintainer

gregsdennis Feb 7, 2023
Maintainer Author