multipleOf and floating point rounding errors #312

cederlys · 2017-04-26T11:47:48Z

Is -15.9 a multiple of 5.3? The current specification of JSON schema is a bit terse:

A numeric instance is only valid if division by this keyword's value results in an integer.

What does this mean? In some programming languages, dividing a floating point number by another floating point number always results in a floating point number. In this case, it would be -3.0, which isn't an integer, so the validation would always fail.

python-jsonschema/jsonschema#185 is a bug report about this issue in a schema validator implementation. The conclusion is that "this is just floating points. Those numbers aren't exactly representable as floats, so you're going to get False, there's nothing jsonschema can do about it, the numbers you get are not in fact multiples of each other."

I think the specification needs to be clearer. Is this supposed to be useful for numbers like 5.3 and -15.9, which often cannot be represented exactly in floating point form? If so, the specification needs to be clear that implementations that use floating point need to deal with rounding errors. In the current state, we get interoperability issues.
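As a rough illustration of the ambiguity, here is one way a validator might read "results in an integer": the quotient has no fractional part, whatever its type. This is a Python sketch under that assumption, not the behavior of any particular library, and `naive_is_multiple` is a hypothetical name.

```python
def naive_is_multiple(instance: float, divisor: float) -> bool:
    """Naive check: divide in binary floating point, then test for a fractional part."""
    quotient = instance / divisor
    # float.is_integer() is True for values such as -3.0, so the float *type*
    # of the quotient is not by itself what makes a check fail.
    return quotient.is_integer()


# Operands that are exactly representable in binary behave as expected.
print(naive_is_multiple(10.0, 2.5))  # True, the quotient is exactly 4.0
print(naive_is_multiple(7.5, 2.0))   # False, the quotient is exactly 3.75

# For -15.9 and 5.3 the answer depends on how the stored approximations
# and the division happen to round, which is the interoperability concern.
print(naive_is_multiple(-15.9, 5.3))
```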

Relequestual · 2017-04-26T11:58:38Z

What is the spec unclear about? How the library decides to do maths is up to it. I don't know if we have any specific tests for floating point values for multipleOf. @Julian, @epoberezkin ?

cederlys · 2017-04-26T12:13:09Z

Since the spec doesn't explicitly say that multipleOf is expected to work even for numbers that cannot be represented exactly as floating point numbers, some implementors just give up and say, in essence, "you can't compare floating point numbers, because of rounding errors". This makes multipleOf an interop nightmare.

I see a few ways of handling this issue:

- declare that multipleOf only works on integers (this would probably cause even more breakage, and is not something I recommend)
- explicitly say that validators must deal with rounding issues, and give a few examples
- explicitly say that the result when using floating point numbers may be unexpected unless the numbers can be represented exactly in whatever the implementation of the validator uses to store floating point numbers (I don't recommend this either)

Julian · 2017-04-26T13:52:15Z

@cederlys what do you mean by "deal with" rounding errors?

Something like your last option is the current state. But it's not JSON Schema specifically that made it, JSON does not mandate that languages parse into arbitrary precision, and many languages don't have easy access to such a thing.

It's true that that makes things less portable, but I'm not sure what motivation JSON Schema would have to be more strict there -- in cases where you control all the pieces, you have a choice on whether to use arbitrary precision, as I mentioned in that ticket, and when you don't, yeah you need to deal with the fact that your schema means different things depending on how someone deals with the resulting JSON.

Relequestual · 2017-04-26T14:04:46Z

Maths is a fundamental issue between some languages. If you have an issue with a specific implementation of JSON Schema, the issue is with the implementation, I feel.

cederlys · 2017-04-26T15:04:20Z

One way to deal with this issue is something like this (in pseudocode):

// Return true if dividend is a multiple of divisor.
// Actually, return true if it is almost a multiple of divisor,
// to account for rounding errors.
bool is_multiple_of(float dividend, float divisor)
{
    if (dividend == 0)
        return true; // avoid division by zero when computing scaled_diff
    float quotient = dividend / divisor;
    float rounded = round_to_nearest_integer(quotient);
    float scaled_diff = abs(dividend - divisor * rounded) / abs(dividend);
    if (scaled_diff < epsilon)
        return true;
    else
        return false;
}

The value of epsilon depends on the floating point implementation. It should be chosen so that rounding errors don't cause false failures, but it should be as small as possible to avoid false positives.

I think it would be helpful if JSON Schema explicitly states if implementations are supposed to go to this trouble, or if using floating point numbers is expected to be non-portable.
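For readers who want to try the idea, the pseudocode above could be sketched in Python roughly as follows. The default `epsilon` of `1e-9` is an arbitrary assumption for illustration, and the relative difference is scaled by the absolute value of the dividend so that negative instances are handled as well.

```python
def is_multiple_of(dividend: float, divisor: float, epsilon: float = 1e-9) -> bool:
    """Return True if dividend is approximately a multiple of divisor."""
    if dividend == 0:
        return True  # zero is a multiple of anything; also avoids dividing by zero below
    rounded = round(dividend / divisor)
    # Relative distance between the dividend and the nearest exact multiple.
    scaled_diff = abs(dividend - divisor * rounded) / abs(dividend)
    return scaled_diff < epsilon


print(is_multiple_of(-15.9, 5.3))   # True
print(is_multiple_of(0.49, 0.01))   # True
print(is_multiple_of(0.495, 0.01))  # False
```

The sketch assumes a positive, finite divisor, which is what multipleOf requires of its value.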

Julian · 2017-04-26T15:22:38Z

That kind of thing can never work -- see the response in the bug ticket you linked, although I was quite terse there unfortunately.

How are you going to distinguish what you call "rounding errors" from the actual literal float that is not the "rounded" one you're talking about?

Are you proposing that JSON Schema mandate some level of imprecision that is different from the float specification's own? If so, can you elaborate on why that'd be a thing that's in JSON Schema's purview to want to do?
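The objection can be made concrete with a small Python sketch. It assumes a relative tolerance of `1e-9` (an arbitrary choice) and uses the hypothetical helper `tolerant_is_multiple`; the point is only that any tolerance accepts some literals that are not multiples as written.

```python
from decimal import Decimal


def tolerant_is_multiple(dividend: float, divisor: float, epsilon: float = 1e-9) -> bool:
    """Tolerance-based check, as proposed earlier in the thread."""
    if dividend == 0:
        return True
    rounded = round(dividend / divisor)
    return abs(dividend - divisor * rounded) / abs(dividend) < epsilon


literal = "1.0000000001"  # as written, this is not a multiple of 1

exact = Decimal(literal) % Decimal("1") == 0          # exact decimal arithmetic
tolerant = tolerant_is_multiple(float(literal), 1.0)  # tolerance on binary floats

print(exact, tolerant)  # False True
```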

cederlys · 2017-04-27T09:14:53Z

I'm not saying that JSON Schema should require implementations to do like that. I'd be just as happy if the spec had a footnote that said something like this:

Implementations of JSON Schema validators may store numbers in any form they like. If they use a binary floating point format, it may not be possible to store an exact representation of numbers such as 0.01. This means that for instance 0.99 may not be a multiple of 0.01. In practice, multipleOf works well for small to mid-size integers, and fractions that can be exactly represented in binary form (such as 0.5 and 0.25), but may produce surprising results for other numbers.
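Whether a given literal is one of the "exactly representable" cases can be checked with a short Python sketch. It assumes the validator stores numbers as IEEE 754 binary64, which is what Python's `float` is on common platforms; `is_exactly_representable` is a hypothetical helper name.

```python
from decimal import Decimal


def is_exactly_representable(literal: str) -> bool:
    """True if the decimal literal survives conversion to a binary float unchanged."""
    # Decimal(float) converts the stored binary value exactly, without rounding,
    # so comparing it with the literal reveals any representation error.
    return Decimal(float(literal)) == Decimal(literal)


for literal in ("0.5", "0.25", "0.01", "0.99"):
    print(literal, is_exactly_representable(literal))
```

Under that assumption, 0.5 and 0.25 come out as exact while 0.01 and 0.99 do not.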

Perhaps this should be mentioned in the JSON specification, but the issue isn't as important there, as the JSON format itself doesn't do any math. It says nothing about how a number should be stored by an application. In the JSON specification, a number is just a sequence of characters that adheres to a particular grammar. But in JSON Schema, validators have to actually do math with the numbers when multipleOf is used. Because of that, I think it is up to JSON Schema to either define, or explicitly leave it undefined, how that math is performed.

I may be wrong, but I have not found anything that requires an implementation to use binary floating point internally. If an implementation were to use floating point operations on decimal numbers it wouldn't have this issue. But that is probably not something that should be required.
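As one illustration of the decimal route, Python's standard `json` module can be asked to parse number literals into `decimal.Decimal` instead of binary floats. This is a sketch of what an implementation could do, not a description of any existing validator; `is_multiple_of` is a hypothetical helper.

```python
import json
from decimal import Decimal


def is_multiple_of(instance: Decimal, divisor: Decimal) -> bool:
    """Exact multipleOf check on decimal values."""
    return instance % divisor == 0


# parse_float=Decimal keeps the digits exactly as written in the JSON text.
schema = json.loads('{"type": "number", "multipleOf": 0.01}', parse_float=Decimal)
instance = json.loads("0.49", parse_float=Decimal)

print(is_multiple_of(instance, schema["multipleOf"]))  # True
```

Note that `Decimal` arithmetic is still bounded by its context precision (28 significant digits by default), so quotients with more digits than that would need a wider context.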

Julian · 2017-04-27T12:22:13Z

Ah, yeah, a note certainly makes sense to me.

Reminding people next to multipleOf that its use with non-integer numbers may not be portable and will often involve floating point error depending on the host language's parsing behavior sounds like a reasonable idea.

The upcoming (sidebar: @handrews is this upcoming or released, I can't tell, the website claims draft 5 is current) draft 6 doesn't appear to have much difference in explaining multipleOf from how I remember it, but it seems reasonable to me to add something like that note going forward if someone can come up with a decent terse wording.

cederlys · 2017-04-28T07:06:50Z

Maybe something like this? I've borrowed heavily from RFC 7159, chapter 6, but tried to adapt it for the current context:

This specification allows implementations to set limits on the range and precision of numbers accepted. Since software that implements IEEE 754-2008 binary64 (double precision) numbers [IEEE754] is generally available and widely used, good interoperability can be achieved by JSON Schemas that expect no more precision or range than these provide. A schema such as {"type": "number", "multipleOf": 0.01} may be problematic, since 0.01 cannot be represented exactly in many binary floating point implementations; in some implementations 0.49 may not be accepted as a multiple of 0.01.

Note that when such software is used, numbers that are integers and are in the range [-(2**53)+1, (2**53)-1] are interoperable in the sense that implementations will agree exactly on whether one integer is a multiple of the other.
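The integer range quoted above can be illustrated with a few lines of Python, assuming `float` is IEEE 754 binary64. Outside that range, distinct integers collapse onto the same float, so a multipleOf check can change its answer.

```python
MAX_SAFE = 2**53 - 1

# Inside the range, conversion to a float is exact.
print(float(MAX_SAFE) == MAX_SAFE)       # True

# Just outside it, two different integers map to the same float.
print(float(2**53 + 1) == float(2**53))  # True

# As an integer, 2**53 + 1 is odd; as a float it looks like a multiple of 2.
print((2**53 + 1) % 2)                   # 1
print(float(2**53 + 1) % 2)              # 0.0
```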

Unless it is already present, the IEEE754 reference must also be added as an informative reference:

[IEEE754]  IEEE, "IEEE Standard for Floating-Point Arithmetic", IEEE
          Standard 754, August 2008,
          <http://grouper.ieee.org/groups/754/>.

(I have not checked if that standard has been updated after its inclusion in RFC 7159.)

awwright · 2017-04-29T21:58:15Z

Since JSON already talks about how to parse its arbitrary-precision numbers as IEEE floats, and since JSON is normatively referenced (making it a part of the spec in a sense), I don't think any additional text is actually warranted.

If implementations want to use IEEE floats, they're very much allowed to, and IEEE already treats how to do number comparisons using an acceptable-margin-of-error technique. Do we need to describe that again?

Also note that the precision of an IEEE float is proportional to its magnitude, so even if multipleOf had to be a float, that would only work up to some (very large, but finite) number.
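The way the spacing between adjacent floats grows with magnitude can be inspected directly. This sketch assumes Python 3.9 or later for `math.ulp` and binary64 floats.

```python
import math

# math.ulp(x) is the gap between x and the next representable float above it.
for x in (1.0, 1e6, 1e16):
    print(x, math.ulp(x))

# At 1e16 the gap is already 2.0, so adding 1 changes nothing.
print(math.ulp(1e16))    # 2.0
print(1e16 + 1 == 1e16)  # True
```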

cederlys · 2017-05-11T18:59:02Z

I don't have access to the IEEE standard. But if IEEE treats how to compare numbers using an acceptable-margin-of-error technique -- does that not imply that a JSON Schema implementation that uses IEEE should consider 0.49 to be a multipleOf 0.01? Is that what you meant, @awwright? And yet, @Julian seems to be of the opposite view: when floating point is used, you should expect unexpected results, and 0.49 may not be a multiple of 0.01.

I think either view is valid. But they cannot both be valid at once. I think the JSON Schema needs to explicitly state what we (as schema writers and users) can expect of a validator.

I found a very good article about comparing floating point numbers: https://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/

If the method suggested in that article is used to compare round(x/y) to x/y, it should produce sensible results.

But is that something that JSON Schema should require of validators?
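One possible reading of "compare round(x/y) to x/y" is a comparison measured in units in the last place (ULPs). The sketch below is written under that assumption and is not necessarily the exact method from the linked article; `is_multiple_within_ulps` and the default of 4 ULPs are hypothetical choices, and `math.ulp` needs Python 3.9 or later.

```python
import math


def is_multiple_within_ulps(x: float, y: float, max_ulps: int = 4) -> bool:
    """Compare x/y with round(x/y), allowing a few units in the last place."""
    quotient = x / y
    if not math.isfinite(quotient):
        return False  # overflowed quotient: no meaningful answer
    nearest = round(quotient)
    # Accept the quotient if it lies within max_ulps float spacings of an integer.
    return abs(quotient - nearest) <= max_ulps * math.ulp(quotient)


print(is_multiple_within_ulps(0.49, 0.01))   # True
print(is_multiple_within_ulps(-15.9, 5.3))   # True
print(is_multiple_within_ulps(0.495, 0.01))  # False
```

Once the quotient exceeds 2**53 every float is an integer, so this check accepts everything at that scale.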

Julian · 2017-05-11T19:02:08Z

I have to read both your comment and Austin's again to make sure I understand the nuance, but to be clear, my position was not "all bets are off", much more "expect the behavior defined by the float spec", so if I've erred in what that is, yeah, that'd be what I was going for. Will read this a bit more carefully in a bit.

cederlys · 2017-05-14T19:35:05Z

I have two issues with "expect the behavior defined by the float spec":

- As far as I know, the IEEE spec does not define a multipleOf operation. That operation is only defined in the JSON Schema specification, and it is not done by operations from the IEEE spec.
- JSON does not require that an implementation use binary floating point. It could use decimal floating point (as COBOL does), it could use rational numbers (as Perl 6 does). See http://blogs.perl.org/users/ovid/2015/02/a-little-thing-to-love-about-perl-6-and-cobol.html
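To show what the rational-number option looks like in practice, here is a sketch using Python's `fractions.Fraction` on the literals as written. It assumes the validator still has access to the original decimal text of each number; `is_multiple_of` is a hypothetical helper.

```python
from fractions import Fraction


def is_multiple_of(instance: str, divisor: str) -> bool:
    """Exact check on decimal literals, using rational arithmetic."""
    # The quotient is an integer exactly when its reduced denominator is 1.
    return (Fraction(instance) / Fraction(divisor)).denominator == 1


print(is_multiple_of("-15.9", "5.3"))   # True
print(is_multiple_of("0.49", "0.01"))   # True
print(is_multiple_of("0.495", "0.01"))  # False
```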

handrews · 2017-10-25T19:55:31Z

I think that any attempt to control the interpretation of numbers beyond what is specified in the JSON RFC (and standards that it references such as IEEE floats) should be done by defining values for format.

A format could be applied to numbers if the desire is simply to convey semantics (use decimal floating point vs use IEEE floating point). If the intention is to preserve some aspect of the numeric representation, because of the data model this is better done by defining a string format that indicates how the string should be interpreted as a number. This is because strings map fairly directly into the data model (particularly for things like basic numeric notation that do not require escaped characters), while numbers intentionally lose representation details during parsing.
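A string-based approach along these lines might be sketched as follows. The lexical pattern and the function name `validate_decimal_string` are hypothetical, invented here for illustration; no draft of JSON Schema defines such a format.

```python
import re
from decimal import Decimal

# Hypothetical lexical form for a string-encoded decimal: optional sign,
# integer part without leading zeros, optional fractional part, no exponent.
DECIMAL_STRING = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?")


def validate_decimal_string(value: str, multiple_of: str) -> bool:
    """Check the string's form first, then do the multiple test in exact decimal."""
    if DECIMAL_STRING.fullmatch(value) is None:
        return False
    return Decimal(value) % Decimal(multiple_of) == 0


print(validate_decimal_string("0.49", "0.01"))    # True
print(validate_decimal_string("0.495", "0.01"))   # False
print(validate_decimal_string("4.9e-1", "0.01"))  # False, exponent form is rejected here
```

Because the value stays a string until the validator interprets it, no digits are lost to the JSON parser along the way.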

See PR #455 (numeric representation and the data model), and issues json-schema-org/json-schema-vocabularies#45 (encoding decimals as strings), #152 (specifying precision), and #116 (format maximum/minimum, also discusses multipleOf for format) for related discussions.

Is there anything to be done for this issue that is not addressed by the other issues and PRs? If there are no comments indicating a course of action here after a couple of weeks I will close this in favor of the other issues.

I do not think that the JSON Schema core specification should mandate specific floating point behavior any more than JSON does.

handrews · 2020-02-28T20:41:36Z

It's been more than two years since I asked if there was anything not covered by the linked issues/prs, so I'm closing this.

Relequestual added the Priority: Medium label Apr 26, 2017

Julian mentioned this issue Aug 17, 2017

multipleOf - IEEE-754 floating point rounding error json-schema-org/JSON-Schema-Test-Suite#193

Closed

faassen mentioned this issue Sep 18, 2022

better support for decimals encoded as strings json-schema-org/json-schema-vocabularies#45

Open

handrews added the Status: Available label Sep 12, 2017

handrews added the validation label Sep 28, 2017

handrews added this to the draft-future milestone Oct 18, 2017

handrews added the format label Oct 25, 2017

awwright self-assigned this Dec 13, 2017

handrews mentioned this issue Dec 20, 2017

fix float64 precision division inaccuracy xeipuuv/gojsonschema#150

Closed

handrews mentioned this issue Mar 8, 2018

Vocabularies and "format" #563

Closed

handrews closed this as completed Feb 28, 2020

pvdbosch mentioned this issue Apr 28, 2021

international money amount and currency belgif/openapi-money#2

Closed

pvdbosch mentioned this issue Aug 4, 2021

representation of decimal numbers belgif/rest-guide#79

Closed

vearutop mentioned this issue Aug 16, 2021

multipleOf not always behaving correctly swaggest/php-json-schema#127

Closed

tdamsma mentioned this issue Nov 8, 2021

More lenient float multipleOf validation python-jsonschema/jsonschema#878

Closed

tatomyr mentioned this issue Jul 7, 2022

Linter problem with multipleOf (no-invalid-schema-examples rule) Redocly/redocly-cli#751

Open

xjamundx mentioned this issue Aug 15, 2023

multipleOf running into floating point issue sagold/json-schema-library#43

Closed

ElectricNroff mentioned this issue Mar 14, 2024

validation fails because of rounding in anyOf/baseScore/baseSeverity CVEProject/cve-services#1204

Closed

corentin-regent mentioned this issue May 8, 2024

Support numeric constraints for Decimal values jcrist/msgspec#683

Open

gregsdennis added this to Proposal: `format` update Jul 17, 2024

gregsdennis moved this to Closed in Proposal: `format` update Jul 17, 2024

gregsdennis moved this from Closed to Merged in Proposal: `format` update Jul 17, 2024
