removes the use of integer type from registries format #4015

Merged: 7 commits, Aug 27, 2024

baywet
Contributor

@baywet baywet commented Aug 13, 2024

ralfhandl
ralfhandl previously approved these changes Aug 13, 2024
@ralfhandl ralfhandl requested a review from a team August 13, 2024 17:03
Member

@handrews handrews left a comment

integer is not a type in the JSON data model. This is explicitly addressed in JSON Schema, and is why integer is never supposed to be present in base_type.

Both keyword and format applicability work with JSON types, not the type keyword values (type: integer is more like a format, except that it's actually reliable).

@baywet
Contributor Author

baywet commented Aug 13, 2024

@handrews I did come across this section of the specification; this other one seems to contradict it, though. (Link in my original post)

String values MUST be one of the six primitive types ("null", "boolean", "object", "array", "number", or "string"), or "integer" which matches any number with a zero fractional part.

Not sure which one is correct, but it seems at least some descriptions in the wild are using this type.

Also sf-integer in the registry here mentions this type.

@handrews
Member

@baywet sf-integer shouldn't do that, I must have missed it when reviewing.

The sentence you quoted does not contradict anything: it explicitly states six primitive types, then lists the six, and then says "or integer which matches any number with a zero fractional part", meaning that both 1 and 1.0 are integers.

The JSON RFC does not define integers as distinct from numbers, so JSON parsers do not necessarily make any distinction (in part because JavaScript does not either).
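
As a rough illustration of the point above (a sketch, not part of the PR), the Python snippet below checks whether a parsed JSON value counts as an integer in the "zero fractional part" sense. The helper name `is_schema_integer` is hypothetical. Python's `json` module happens to parse `1` and `1.0` into different Python types, and the check deliberately ignores that distinction.

```python
import json


def is_schema_integer(value):
    """Hypothetical helper: True if value is a number with a zero fractional part."""
    # bool is a subclass of int in Python, but a JSON boolean is not a number.
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        return True
    if isinstance(value, float):
        return value.is_integer()
    return False


# "1" and "1.0" are both integers in this sense; "1.5" and the string "1" are not.
for text in ["1", "1.0", "1.5", '"1"']:
    print(text, is_schema_integer(json.loads(text)))
```

The textual form of the number (with or without a decimal point) plays no role in the result; only the numeric value does.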

This was a significant point of discussion at one point in JSON Schema's history and I am very certain of what it means.

@ralfhandl
Contributor

Before approving this PR I wondered whether integer is an allowed value for type, then I found in OAS 3.1.0 Section 4.4 Data Types that

Note that integer as a type is also supported and is defined as a JSON number without a fraction or exponent part.

and

The formats defined by the OAS are:

type     format  Comments
integer  int32   signed 32 bits
integer  int64   signed 64 bits (a.k.a long)

So @baywet's changes to int32 and int64 are explicitly sanctioned, and the other three are consistent with the current specification text.

@baywet
Contributor Author

baywet commented Aug 14, 2024

So @baywet's changes to int32 and int64 are explicitly sanctioned, and the other three are consistent with the current specification text.

I could revert the changes to uint etc to only include int32 and int64 to stop the bleeding here. But I guess if we did so, I'd have to leave a note. Plus what should we do with sf-integer then?

@ralfhandl
Contributor

I could revert the changes to uint etc

No, int16, int8, and uint8 are consistent extrapolations of int32 and int64 and should follow the same rules: can be used with both types number and integer.

We could deprecate integer with 3.2.0 and remove it with Moonwalk 😄

@baywet
Contributor Author

baywet commented Aug 14, 2024

We could deprecate integer with 3.2.0 and remove it with Moonwalk

In that scenario, should we add a note in the registry for all the formats that mention the integer type?

mikekistler
mikekistler previously approved these changes Aug 14, 2024
Contributor

@mikekistler mikekistler left a comment

Looks good. 👍

I think it is pretty clear that integer is allowed/supported in OpenAPI, so these changes are appropriate. Further, I think we'd need to have compelling reasons to remove it in any future version.

@handrews
Member

@ralfhandl

So @baywet's changes to int32 and int64 are explicitly sanctioned, and the other three are consistent with the current specification text.

This is not about OAS, this is about JSON Schema. JSON Schema's format keyword does not distinguish between integers and non-integer numbers when determining whether a given format applies to a given instance. Putting integer in the format registry like this isn't going to change that, and misleads people to expect functionality that is not present.

@handrews
Member

To try and make this a bit more clear, JSON Schema keywords and format values have a notion of applicability: to which JSON Schema data model types does a given keyword or format apply? If the keyword or format does not apply, then it always produces a true (passing) validation result.

As I noted in a previous comment, the JSON Schema data model does not include integers. The type keyword is not what defines the data model type. An instance value has a data model type regardless of what keywords are present or absent.

If we were to treat integer as a data model type (which it is not), and we were to say that format: uint8 only applies to instances of type integer, then the schema format: uint8 would consider the instance 1.1 to be valid, because it would not be an integer and therefore the format would not apply. That is not what we want. Which is fine because it's not possible anyway.

Consider 1.0. JSON Schema defines it as an integer, because it defines integers mathematically. Some people expect 1.0 to be a non-integer because its textual representation contains a decimal point. But this is not how JSON Schema works. 1.0 is a place where trying to make a distinction between integers and non-integers is problematic because people expect their programming language's conventions for numeric literals to apply, which is, again, not how JSON Schema works.
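
The applicability argument above can be sketched in Python. This is only an illustration under stated assumptions: both function names are hypothetical, and `uint8` is assumed to mean "a whole number from 0 through 255". The first function applies the format to every instance of data model type number; the second models the hypothetical rule in which the format applies only to integers.

```python
def is_number(instance):
    # Data model type "number": any int or float, but not a boolean.
    return isinstance(instance, (int, float)) and not isinstance(instance, bool)


def has_zero_fraction(instance):
    # Mathematical integer check: 1 and 1.0 both qualify, 1.1 does not.
    if isinstance(instance, float):
        return instance.is_integer()
    return True


def uint8_applies_to_numbers(instance):
    """Format applicability determined by the data model type 'number'."""
    if not is_number(instance):
        return True  # format does not apply, so validation passes
    return has_zero_fraction(instance) and 0 <= instance <= 255


def uint8_applies_to_integers_only(instance):
    """Hypothetical rule: treat 'integer' as if it were a data model type."""
    if not (is_number(instance) and has_zero_fraction(instance)):
        return True  # 1.1 is "not an integer", so the format is skipped
    return 0 <= instance <= 255


print(uint8_applies_to_numbers(1.1))        # rejected: the format applies to all numbers
print(uint8_applies_to_integers_only(1.1))  # accepted: the format never applied
```

Under the hypothetical integer-only rule, 1.1 passes `format: uint8` simply because the format is skipped, which is the unwanted outcome described above.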

@baywet
Contributor Author

baywet commented Aug 16, 2024

Thank you for the super detailed description of how things are supposed to work. I had no idea about the data model aspects and their subtleties.

I find it odd that JSON Schema went with this opinionated way of dealing with numbers, which, as far as I understand, differs from JSON and loses information. Writing 1.0 instead of 1 carries additional information: the level of precision, here to the first decimal place.

In that context, it would make sense to me that an API producer would want to state that an API only produces data with a given level of precision, and consequently that a consumer would want to validate the data it receives against the spec.

But I understand this is not how JSON Schema was designed to work, and trying to make it work this way would be difficult because of that.

Now, to Ralf's earlier comment: it would make sense to clean this up moving forward.
I think this is a good candidate for removing the table in 3.2.0. I can update this pull request to remove any reference to the integer type for the time being, and maybe we can even add a note or a link to this issue/pull request for people who want to know why it changed. What does everybody think about this solution?

@ralfhandl ralfhandl self-requested a review August 16, 2024 12:35
@mikekistler
Contributor

I can update this pull request to remove any reference to the integer type for the time being

I don't think that's a good idea. Integer is a type in OAS, and that might have some peculiarities, but there's lots of stuff in the spec that is peculiar.

@ralfhandl
Contributor

I'm not quite sure what we are arguing about here.

My take on the current specification and format registry entries:

  • Spec

    integer as a type is also supported and is defined as a JSON number [...]

  • Format int32

    int32 - signed 32-bit integer
    Base type: number.
    The int32 format represents a signed 32-bit integer, with the range −2,147,483,648 through 2,147,483,647.

Since integer implies number, I can combine format: "int32" with both type: "number" and type: "integer".

An OpenAPI-aware payload validator that supports format: "int32" will produce the same validation result in both cases.
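
A minimal sketch of that claim, assuming a toy validator that understands only the `type` values `number` and `integer` and the `int32` format. The `validate` function is hypothetical and is not any real library's API; it also assumes that `int32` rejects numbers with a non-zero fractional part.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def is_number(instance):
    # Data model type "number": any int or float, but not a boolean.
    return isinstance(instance, (int, float)) and not isinstance(instance, bool)


def has_zero_fraction(instance):
    if isinstance(instance, float):
        return instance.is_integer()
    return True


def validate(schema, instance):
    """Toy validator (hypothetical): supports only `type` and `format: int32`."""
    declared = schema.get("type")
    if declared == "number" and not is_number(instance):
        return False
    if declared == "integer" and not (is_number(instance) and has_zero_fraction(instance)):
        return False
    # The format applies to every number, whatever the `type` keyword says.
    if schema.get("format") == "int32" and is_number(instance):
        if not (has_zero_fraction(instance) and INT32_MIN <= instance <= INT32_MAX):
            return False
    return True


as_number = {"type": "number", "format": "int32"}
as_integer = {"type": "integer", "format": "int32"}
for sample in [1, 1.0, 1.5, 2**31, "1", None]:
    print(repr(sample), validate(as_number, sample), validate(as_integer, sample))
```

With these assumptions the two schemas agree on every sample, because the format check already excludes fractional values that `type: "integer"` would otherwise be needed to reject.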

Do we need to explicitly list integer in the format's base type?

  • No, because number implies that the format can also be used with integer.

So the argument seems to be:

  • Does it harm to list both number and integer as a base type in the registry?

@baywet
Contributor Author

baywet commented Aug 16, 2024

Does it harm to list both number and integer as a base type in the registry?

That was my original thought process as well, and my answer is "no, it doesn't do any harm". But I'm OK if we decide we need to clean it up instead. I just want us to be consistent.

@handrews
Member

@ralfhandl @baywet It's objectively wrong according to the JSON Schema specification to list "integer" as a data model type for the purpose of keyword or format applicability. Applicability of keywords and formats is only determined by the six data model types: number, string, boolean, null, object, and array. Anything else is not in compliance with JSON Schema.
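
To make the six data model types concrete, here is a small sketch that maps values parsed by Python's `json` module onto them. The function name `data_model_type` is hypothetical; the mapping simply follows the list in the comment above and assumes the input came from `json.loads`.

```python
import json


def data_model_type(value):
    """Hypothetical helper: name the data model type of a parsed JSON value."""
    if value is None:
        return "null"
    # Check bool before numbers: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"  # there is no separate "integer" in the data model
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


for text in ["1", "1.0", "1.5", '"1"', "true", "null", "[]", "{}"]:
    print(text, data_model_type(json.loads(text)))
```

Both `1` and `1.0` map to `number`, which is the type that keyword and format applicability is decided on.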

@ralfhandl
Contributor

@handrews You are absolutely right from a pure JSON Schema perspective newer than Draft 04.

OAS 3.0 and 3.1 (probably due to the removal of integer in JSON Schema Draft 05) explicitly add integer back as an allowed type in Section 4.4 Data Types, adding to the difference between the OAS Schema Object and pure JSON Schema.

I've opened #4038 to discuss and fix this.

@mikekistler
Contributor

@handrews I am unclear what the issue is here. Is it:

  • that "integer" is recognized as a valid value for "type" in OAS?
  • that the OAS format registry defines values for "format", their meanings, and which "type" values they may be used with?
  • that the OAS format registry associates some "format" values with the "integer" type value?

Or is it something else entirely?

@handrews
Member

@ralfhandl @mikekistler I have tried to explain again in #4038.

I don't know how to be more clear about this. The set of types that are data model types and the set of values for the type keyword are not the same. The OAS cannot change the set of data model types, and that is the set that JSON Schema uses to determine keyword and format value applicability.

We cannot change that. It is not part of our spec. It is part of JSON Schema. It doesn't matter what we say, it's not going to change the meaning of JSON Schema.

@baywet
Contributor Author

baywet commented Aug 19, 2024

The set of types that are data model types and the set of values for the type keyword are not the same

Why are they different? Is integer the only difference here? In which case, why was it introduced?

@handrews
Member

@baywet it's literally in the text you quoted elsewhere:

As an example, "integer" is a reasonable type for a vocabulary to define as a value for a keyword, but the data model makes no distinction between integers and other numbers.

This explicitly states that it is not part of the data model, it's just a convenience for a specific keyword.

@mikekistler
Contributor

@handrews I am not familiar with the concept of "data model types" in JSON Schema. Can you point to the part of the JSON Schema spec that describes this, so I can educate myself?

@baywet
Contributor Author

baywet commented Aug 19, 2024

@baywet it's literally in the text you quoted elsewhere:

As an example, "integer" is a reasonable type for a vocabulary to define as a value for a keyword, but the data model makes no distinction between integers and other numbers.

This explicitly states that it is not part of the data model, it's just a convenience for a specific keyword.

Allow me to rephrase my question. You probably have the most historical context among us here on why JSON Schema has this integer type.
Assuming this answers my question of "what's the delta between the two sets", I'd like to understand why integer is a thing in JSON Schema to begin with, since it seems to add more confusion than value. It would probably have been easier for everyone involved if the two sets matched.

@baywet
Contributor Author

baywet commented Aug 19, 2024

@handrews I am not familiar with the concept of "data model types" in JSON Schema. Can you point to the part of the JSON Schema spec that describes this, so I can educate myself?

@mikekistler this is what is being referred to here I believe. https://json-schema.org/draft/2020-12/json-schema-core#name-instance-data-model

@ralfhandl
Contributor

Adjusted the other markdown file.

@baywet
Contributor Author

baywet commented Aug 22, 2024

LGTM! (Ralf's latest changes)
Just as a reminder, I do not have merge permissions, so I'll need somebody else to merge for me :)

Contributor

@mikekistler mikekistler left a comment

I think we need a short explanation of the purpose of the Type column, and I think "Instance Type" is a slightly better term.

@ralfhandl
Contributor

I think "Instance Type" is a slightly better term

@mikekistler Then why not use "Instance Type" also in #4045?

@handrews
Member

As noted in #4045, I don't think most people know what an "instance" is, so "JSON Data Type" would be better.

@baywet
Contributor Author

baywet commented Aug 22, 2024

Alright, I reverted to "JSON Data Type", everyone. Final review and merge?

@ralfhandl
Contributor

@baywet Almost there 😎

handrews
handrews previously approved these changes Aug 22, 2024
@ralfhandl
Contributor

@mikekistler Please check, and if you approve, please merge. Thanks!

@baywet
Contributor Author

baywet commented Aug 22, 2024

Thanks, everyone! Now we just need @mikekistler to approve.

@ralfhandl
Contributor

@mikekistler Could you please re-review?

The wording is now in line with #4045

Contributor

@mikekistler mikekistler left a comment

Looks good! 👍

@ralfhandl ralfhandl merged commit 1b22948 into OAI:gh-pages Aug 27, 2024
@baywet
Contributor Author

baywet commented Aug 27, 2024

Thanks! We're ready to merge. Who has the permissions to do so?

@ralfhandl
Contributor

We're ready to merge. Who has the permissions to do so?

Me, already merged.

Thanks for all the work you put into this!

Successfully merging this pull request may close these issues.

Explicitly mention type: "integer" in definition of Schema Object
5 participants