Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for multipleOf overflow #438

Merged
merged 1 commit into from
Oct 4, 2020

Conversation

Zac-HD
Copy link
Contributor

@Zac-HD Zac-HD commented Oct 3, 2020

The spec says that instances are valid if "dividing by this value results in an integer".

This allows integers to be invalid for multipleOf: 0.5 if float division overflows to infinity (a non-integer); alternatively implementations may choose to implement logic which defines all integers as multiples of 0.5. Either way, however, implementations must not raise an error due to the overflow of a legal value against a legal schema.

Reference: python-jsonschema/jsonschema#743, cc @Julian

The spec says that instances are valid if "dividing by this value results in an integer".

This allows integers to be invalid for `multipleOf: 0.5` if float division overflows to infinity (a non-integer); alternatively implementations may choose to implement logic which defines all integers as multiples of 0.5.  Either way, however, implementations must not raise an error due to the overflow of a legal value against a legal schema.
@Zac-HD Zac-HD force-pushed the overflow-with-multipleOf branch from a2a52b1 to c5ba4ba Compare October 3, 2020 04:11
@Julian
Copy link
Member

Julian commented Oct 4, 2020

Cool. Think these lgtm, thanks!

(To add some extra color for folks -- the added required test seems consistent with the spec as well -- implementations with only IEEE floats will get inf, and so can indeed distinguish the overflow).

@Julian Julian merged commit 96742ba into json-schema-org:master Oct 4, 2020
@Zac-HD Zac-HD deleted the overflow-with-multipleOf branch October 4, 2020 23:26
@karenetheridge
Copy link
Member

implementations may choose to implement logic which defines all integers as multiples of 0.5

I'm curious as to how you might expect to see this in code. Would 0.5 be special-cased somehow?

@Julian
Copy link
Member

Julian commented Oct 16, 2020

Probably more realistically by doing something like what we ended up doing (falling back to non-float division on overflow). Which yeah isn't required, but the non-erroring is.

@Zac-HD
Copy link
Contributor Author

Zac-HD commented Oct 16, 2020

That's the general solution - you could also check whether 1/multipleOf is an integer, and if so drop it from integer schemas.

@karenetheridge
Copy link
Member

karenetheridge commented Oct 22, 2020

This is not a valid test on systems built with longdoubles (where 16 bytes are allocated to floats) -- as the inputs here are not big enough to cause an overflow. You need numbers way way larger to be sure to tickle an overflow (on my system, 'Inf' is the result in that case), and then you'll clash with the optional/bignum.json tests which suggest that implementations should also be able to handle very large numbers, which means that this edge case will be even harder to reproduce.

(breadcrumb for self: an example of a failure on a smoker system is http://www.cpantesters.org/cpan/report/ee0e51be-1384-11eb-a9f9-f9eb91108de4 )

11:11 < ether> BinGOs: can you give me  perl -wle'print 1e308 / 0.123456789'   and perl -wle'print 1e308 / 0.5' also please?
11:12 < BinGOs> 8.10000007371000067e+308 and 2e+308 respectively

@Julian
Copy link
Member

Julian commented Oct 22, 2020

Are you referring to the optional ones or the required one?

The required one passes there even in your example because indeed the result is nondivisible right?

You're saying the optional test conflicts with the other optional test?

@karenetheridge
Copy link
Member

karenetheridge commented Oct 22, 2020

I'm referring to the required one. On a longdouble system, the operation does not overflow -- the quotient is 8.10000007371000067e+308 which is an integer, so the evaluation passes even though the test says it should fail.

@Julian
Copy link
Member

Julian commented Oct 22, 2020

Ah sorry I can't read... ok think I follow.

Probably we can then pick a value that doesn't produce an integer with that much precision either instead of the one we have, and then that should suit both possibilities?

@ssilverman
Copy link
Member

In this section: https://json-schema.org/draft/2019-09/json-schema-validation.html#rfc.section.4.2
it states, "numeric instances processed by JSON Schema can be arbitrarily large and/or have an arbitrarily long decimal part, regardless of the ability of the underlying programming language to deal with such data."

To me, this means that 1e308 is never a multiple of 0.123456789, despite the fact that there's loss of precision given some language's implementation.

Might it be useful to add a paragraph in some future draft dealing with "implementations that truncate", describing expected behaviour and limitations? Currently, the above spec reference doesn't mention any MUST or SHOULD words when talking about this.

@ssilverman
Copy link
Member

ssilverman commented Oct 22, 2020

One of the reasons the Unicode-based regex tests were moved to optional/ was because .NET doesn't support them (I disagreed with that). I'm not in favour of moving things to optional/ based on the fact that implementations don't support things.

Having said that, floating-point handling is probably even more fundamental than regexes, but still, for arbitrary-precision-capable implementations, these tests are still valid. I don't agree with optional or missing tests just to avoid specific implementation details. For arbitrary-precision-capable implementations, these tests shouldn't just be present, they should be non-optional.

Further to that, why isn't having a 1e308//0.123456789 test a good thing? If we're only testing values that don't overflow, how will perfectly valid arbitrary-precision implementations that support ranges beyond the overflow range of most byte-limited implementations get tested?

These are perfectly valid, spec-conforming, non-optional tests. Unless, is arbitrary-precision support optional? If not, these tests need to be non-optional.

@karenetheridge
Copy link
Member

Probably we can then pick a value that doesn't produce an integer with that much precision either instead of the one we have, and then that should suit both possibilities?

Possibly, but I don't have the time right now to compile a new interpreter with longdouble enabled, and then search for a suitable pair of numbers that do not divide evenly.. and then test them on systems with other float sizes to see that they divide as expected also.

@karenetheridge
Copy link
Member

Further to that, why isn't having a 1e308//0.123456789 test a good thing?

Testing how a JSON Schema implementation handles overflow is fine (it should return an invalid result, rather than exploding or returning true). The problem is creating a test case that is guaranteed to produce an overflow.

@Julian
Copy link
Member

Julian commented Oct 22, 2020

Can you maybe open an issue to start and we can figure out whatever the right solution should be? (Or propose one if you have one -- ideally though we should end up with something "universal" that won't be forgotten as a required test, but if we can't think of how to do so we can leave that as an open issue) -- but yeah open an issue for the test being unpassable if you have longer floats?

@Julian
Copy link
Member

Julian commented Oct 22, 2020

We may even be able to bribe Zac-HD :D -- maybe hypothesis could find us a pair of numbers that return non-integers for both 8 and 16 byte floats or something (and yeah sorry I'm still mid-$WORK so possibly my brain isn't thinking straight still on an obvious solution so feel free to suggest one if anyone has one)

@ssilverman
Copy link
Member

The problem is creating a test case that is guaranteed to produce an overflow.

What's wrong with 1e308//0.123456789? Maybe I'm missing something...

@Julian
Copy link
Member

Julian commented Oct 22, 2020

Unless, is arbitrary-precision support optional?

Arbitrary precision is optional yeah. You are allowed to use IEEE floats or doubles (and presumably also this -- 16 byte floats)

@Zac-HD

This comment has been minimized.

@Zac-HD
Copy link
Contributor Author

Zac-HD commented Oct 24, 2020

The problem: this test passes (i.e. overflows to non-integer value infinity) at 64bit precision, and passes (i.e. gives a non-integer-valued finite result) at arbitrary precision. I also get np.longdouble("1e308") / np.longdouble("0.123456789") == np.inf, but on my system I'm pretty sure that's only 80-bit precision.

So the problem appears to be that rounding specifically in 128-bit floats leads to an integer output from this test case.

I think we can solve this just by making the quotient e.g. 1e308 / 0.12345678901234567890123456789 which works just fine at 64bit float and arbitrary precision. @karenetheridge, can you check this with 128-bit floats for me and I'll open a follow-up PR?

@karenetheridge
Copy link
Member

karenetheridge commented Oct 27, 2020

That calculation doesn't overflow, but it still looks like an even multipleOf (the quotient is an integer):

$ perl -wle'my $quotient = 1e308 / 0.12345678901234567890123456789; print $quotient; print $quotient == int($quotient) ? "true" : "false"'
8.10000007290000066e+308
true

$ perl -V
Summary of my perl5 (revision 5 version 32 subversion 0) configuration:
   
  Platform:
    osname=darwin
    osvers=17.7.0
    archname=darwin-ld-2level
    uname='darwin aquavit 17.7.0 darwin kernel version 17.7.0: mon aug 31 22:11:23 pdt 2020; root:xnu-4570.71.82.6~1release_x86_64 x86_64 '
    config_args='-de -Dprefix=/Users/ether/perl5/perlbrew/perls/32.0_longdouble -Dman1dir=none -Dman3dir=none -Duselongdouble -Aeval:scriptdir=/Users/ether/perl5/perlbrew/perls/32.0_longdouble/bin'
    hint=recommended
    useposix=true
    d_sigaction=define
    useithreads=undef
    usemultiplicity=undef
    use64bitint=define
    use64bitall=define
    uselongdouble=define
    usemymalloc=n
    default_inc_excludes_dot=define
    bincompat5005=undef
  Compiler:
    cc='cc'
    ccflags ='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.13 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -I/opt/local/include -DPERL_USE_SAFE_PUTENV'
    optimize='-O3'
    cppflags='-fno-common -DPERL_DARWIN -mmacosx-version-min=10.13 -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -I/opt/local/include'
    ccversion=''
    gccversion='4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)'
    gccosandvers=''
    intsize=4
    longsize=8
    ptrsize=8
    doublesize=8
    byteorder=12345678
    doublekind=3
    d_longlong=define
    longlongsize=8
    d_longdbl=define
    longdblsize=16
    longdblkind=3
    ivtype='long'
    ivsize=8
    nvtype='long double'
    nvsize=16
    Off_t='off_t'
    lseeksize=8
    alignbytes=16
    prototype=define
  Linker and Libraries:
    ld='cc'
    ldflags =' -mmacosx-version-min=10.13 -fstack-protector-strong -L/usr/local/lib -L/opt/local/lib'
    libpth=/usr/local/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/10.0.0/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /usr/lib /opt/local/lib
    libs=-lpthread -lgdbm -ldbm -ldl -lm -lutil -lc
    perllibs=-lpthread -ldl -lm -lutil -lc
    libc=
    so=dylib
    useshrplib=false
    libperl=libperl.a
    gnulibc_version=''
  Dynamic Linking:
    dlsrc=dl_dlopen.xs
    dlext=bundle
    d_dlsymun=undef
    ccdlflags=' '
    cccdlflags=' '
    lddlflags=' -mmacosx-version-min=10.13 -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib -fstack-protector-strong'


Characteristics of this binary (from libperl): 
  Compile-time options:
    HAS_TIMES
    PERLIO_LAYERS
    PERL_COPY_ON_WRITE
    PERL_DONT_CREATE_GVSV
    PERL_MALLOC_WRAP
    PERL_OP_PARENT
    PERL_PRESERVE_IVUV
    PERL_USE_SAFE_PUTENV
    USE_64_BIT_ALL
    USE_64_BIT_INT
    USE_LARGE_FILES
    USE_LOCALE
    USE_LOCALE_COLLATE
    USE_LOCALE_CTYPE
    USE_LOCALE_NUMERIC
    USE_LOCALE_TIME
    USE_LONG_DOUBLE
    USE_PERLIO
    USE_PERL_ATOF
  Built under darwin
  Compiled at Oct 27 2020 09:16:39
  %ENV:
    PERLBREW_HOME="/Users/ether/.perlbrew"
    PERLBREW_MANPATH="/Users/ether/perl5/perlbrew/perls/32.0_longdouble/man"
    PERLBREW_PATH="/Users/ether/perl5/perlbrew/bin:/Users/ether/perl5/perlbrew/perls/32.0_longdouble/bin"
    PERLBREW_PERL="32.0_longdouble"
    PERLBREW_ROOT="/Users/ether/perl5/perlbrew"
    PERLBREW_SHELLRC_VERSION="0.89"
    PERLBREW_VERSION="0.89"
    PERLDOC="-oman"
    PERLDOC_PAGER="less -sicMr"
    PERL_AUTOINSTALL_PREFER_CPAN="1"
    PERL_INSTALL_QUIET="1"
    PERL_USE_UNSAFE_INC="0"
  @INC:
    /Users/ether/perl5/perlbrew/perls/32.0_longdouble/lib/site_perl/5.32.0/darwin-ld-2level
    /Users/ether/perl5/perlbrew/perls/32.0_longdouble/lib/site_perl/5.32.0
    /Users/ether/perl5/perlbrew/perls/32.0_longdouble/lib/5.32.0/darwin-ld-2level
    /Users/ether/perl5/perlbrew/perls/32.0_longdouble/lib/5.32.0

@karenetheridge
Copy link
Member

karenetheridge commented Oct 29, 2020

You can see the same problem with a simpler example:

$ perl -wle'my $quotient = 1e308 / 3; print $quotient; print $quotient == int($quotient) ? "true" : "false"'
3.33333333333333333e+307
true

The divisor is so large that even though the quotient is not an integer, it appears as if it an integer because of how floats are expressed internally. I do not see any way of correctly checking the integer status here (and the same situation will occur with smaller divisors on architectures with smaller floats).

e.g. on a 64-bit float implementation:

$ perl -wle'my $quotient = 1e10 / 3; print $quotient; print $quotient == int($quotient) ? "true" : "false"'
3333333333.33333
false

$ perl -wle'my $quotient = 1e100 / 3; print $quotient; print $quotient == int($quotient) ? "true" : "false"'
3.33333333333333e+99
true

edit, 2021-12-30: I was wrong about most of what I said here :) see #534 for a more current discussion.

@karenetheridge
Copy link
Member

This is not a valid test on systems built with longdoubles (where 16 bytes are allocated to floats) -- as the inputs here are not big enough to cause an overflow.

The test being referred to here is multipleOf.json - "invalid instance should not raise error when float division = inf" - "always invalid, but naive implementations may raise an overflow error". It also fails when 12 bytes are allocated to floats.

http://www.cpantesters.org/cpan/report/d1a4dd18-ca7c-11eb-932e-240466f710c6
http://www.cpantesters.org/cpan/report/2464b1aa-ca7c-11eb-8dc8-abf665f710c6
(on these test systems, doublekind=3, longdblkind=3)

davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this pull request Oct 21, 2021
Fixes overflow error exposed by: json-schema-org/JSON-Schema-Test-Suite#438

```
FloatDomainError: Infinity
    /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:367:in `floor'
    /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:367:in `validate_numeric'
    /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:378:in `validate_number'
    /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:277:in `validate_type'
    /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:198:in `validate_instance'
```
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this pull request Apr 1, 2022
Floats aren't accurate enough here.

Fixes: #100

Also addresses overflow exposed by: json-schema-org/JSON-Schema-Test-Suite#438

```
FloatDomainError: Infinity
    /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:367:in `floor'
    /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:367:in `validate_numeric'
    /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:378:in `validate_number'
    /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:277:in `validate_type'
    /Users/dharsha/repos/json_schemer/lib/json_schemer/schema/base.rb:198:in `validate_instance'
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants