Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible rounding issue causing multipleOf to fail for a valid value. #1039

Closed
mayafox opened this issue Jan 26, 2023 · 20 comments
Closed

Possible rounding issue causing multipleOf to fail for a valid value. #1039

mayafox opened this issue Jan 26, 2023 · 20 comments

Comments

@mayafox
Copy link

mayafox commented Jan 26, 2023

Encountered the following error when fixing the schema validation errors for a new device I was adding into netbox/devicetype-library

Failed validating 'multipleOf' in schema['properties']['weight']['items']['properties']['value']:
{'minimum': 0, 'multipleOf': 0.01, 'type': 'number'}

On instance['weight'][0]['value']:
10.11

Did a bit of poking around, changing the value to 1011 <<< integer, and not a float, caused the check to pass. This seems to potentially be an issue in jsonschema,

https://github.com/python-jsonschema/jsonschema/blob/main/jsonschema/_validators.py line 181

If I am reading the code right, jsonschema is running into a rounding error when running this validation.

db = 0.01
instance = 10.11
isinstance(db, float)
True
quotient = instance / db
quotient
1010.9999999999999
int(quotient)
1010
int(round(quotient, 2))
1011

Any Thoughts, additional information I should collect?

@mayafox
Copy link
Author

mayafox commented Jan 26, 2023

possible fix

def multipleOf(validator, dB, instance, schema):
    if not validator.is_type(instance, "number"):
        return

    if isinstance(dB, float):
         precision = len(str(dB).split('.')[1])
        quotient = instance / dB
        try:
            failed = int(round(quotient, precision)) != quotient
        except OverflowError:
            # When `instance` is large and `dB` is less than one,
            # quotient can overflow to infinity; and then casting to int
            # raises an error.
            #
            # In this case we fall back to Fraction logic, which is
            # exact and cannot overflow.  The performance is also
            # acceptable: we try the fast all-float option first, and
            # we know that fraction(dB) can have at most a few hundred
            # digits in each part.  The worst-case slowdown is therefore
            # for already-slow enormous integers or Decimals.
            failed = (Fraction(instance) / Fraction(dB)).denominator != 1
    else:
        failed = instance % dB

    if failed:
        yield ValidationError(f"{instance!r} is not a multiple of {dB}")

@mayafox
Copy link
Author

mayafox commented Jan 27, 2023

Issue was resolved by using a different validation method, closing

@mayafox mayafox closed this as completed Jan 27, 2023
@cwegener
Copy link

@mayafox I don't think that the issue is resolved by avoiding the bug. Would you be able to re-open the issue?

@mayafox mayafox reopened this Jan 30, 2023
@cwegener
Copy link

It seems that the distinction between validation of floating point numbers and validation of arbitrary decimal precision is causing frequent confusion for users of jsonschema (see large number of issues returned when searching for mulitpleOf in this GH repo.

  • I wonder if there is a better way to improve the developer experience specific to this issue?

  • Is there documentation about the topic that would help developers who are new to this topic?

  • How exactly did you resolve the issue yourself @mayafox ? (Did you change the definition of you JSON schema?)

@mayafox
Copy link
Author

mayafox commented Jan 30, 2023

@cwegener I hit this issue while contributing to netbox-community/devicetype-library#1067 and the schema validation failing. Devs for that repo changed the validation method to multipleOfPrecision, and addressed the issue.

@cwegener
Copy link

@cwegener I hit this issue while contributing to netbox-community/devicetype-library#1067 and the schema validation failing. Devs for that repo changed the validation method to multipleOfPrecision, and addressed the issue.

Thanks for the additional info. I don't actually think that the netbox-community project even resolved the issue.
They also just seem to avoid the issue by (unknowingly) disabling the data validation of the field in question.

The apparent "fix" is not a fix at all. netbox-community/devicetype-library#1068

I've left a comment in your PR thread in the netbox-community org netbox-community/devicetype-library#1067 (comment)

@mayafox
Copy link
Author

mayafox commented Jan 30, 2023

@cwegener Ok, what's the plan of action then? I thought about the code snippet I proposed, and it's nowhere near the quality of what I would accept in my day job, but is potentially a starting point.

@Julian
Copy link
Member

Julian commented Jan 30, 2023

Hi there -- please have a look at some of the older issues you can find by searching multipleOf -- essentially automatically rounding isn't the right thing to do for the same reason Python doesn't try to guess what you mean by doing 0.3 / 0.1 either -- that's just the way floats work. If you want to though you can use decimal.Decimal for fixed precision operations.

@Julian Julian closed this as completed Jan 30, 2023
@danner26
Copy link

danner26 commented Jan 30, 2023

Thank you for the heads up @cwegener do you have any recommendations on how to resolve this properly then? In my searches I have not found a well defined solution to this use case, but I did find a lot of other people suffering from the same confusion

@Julian can you describe how to use this? I was checking here, and I do not see decimal defined. I could be missing it though. I had noticed someone else mention decimal.Decimal in a stackoverflow artcle but could not find an implementation example. When I change the validation type to decimal.Decimal I get:

venv/lib/python3.10/site-packages/jsonschema/validators.py:318: in is_type
    return self.TYPE_CHECKER.is_type(instance, type)
venv/lib/python3.10/site-packages/jsonschema/_types.py:119: in is_type
    raise UndefinedTypeCheck(type) from None
E   jsonschema.exceptions.UndefinedTypeCheck: Type 'decimal.Decimal' is unknown to this type checker

During handling of the above exception, another exception occurred:
tests/definitions_test.py:83: in test_definitions
    Draft4Validator(schema, resolver=resolver).validate(definition)
venv/lib/python3.10/site-packages/jsonschema/validators.py:313: in validate
    for error in self.iter_errors(*args, **kwargs):
venv/lib/python3.10/site-packages/jsonschema/validators.py:288: in iter_errors
    for error in errors:
venv/lib/python3.10/site-packages/jsonschema/_validators.py:332: in properties
    yield from validator.descend(
venv/lib/python3.10/site-packages/jsonschema/validators.py:305: in descend
    for error in self.evolve(schema=schema).iter_errors(instance):
venv/lib/python3.10/site-packages/jsonschema/validators.py:288: in iter_errors
    for error in errors:
venv/lib/python3.10/site-packages/jsonschema/_legacy_validators.py:107: in items_draft3_draft4
    yield from validator.descend(item, items, path=index)
venv/lib/python3.10/site-packages/jsonschema/validators.py:305: in descend
    for error in self.evolve(schema=schema).iter_errors(instance):
venv/lib/python3.10/site-packages/jsonschema/validators.py:288: in iter_errors
    for error in errors:
venv/lib/python3.10/site-packages/jsonschema/_validators.py:298: in ref
    yield from validator.descend(instance, resolved)
venv/lib/python3.10/site-packages/jsonschema/validators.py:305: in descend
    for error in self.evolve(schema=schema).iter_errors(instance):
venv/lib/python3.10/site-packages/jsonschema/validators.py:288: in iter_errors
    for error in errors:
venv/lib/python3.10/site-packages/jsonschema/_validators.py:332: in properties
    yield from validator.descend(
venv/lib/python3.10/site-packages/jsonschema/validators.py:305: in descend
    for error in self.evolve(schema=schema).iter_errors(instance):
venv/lib/python3.10/site-packages/jsonschema/validators.py:288: in iter_errors
    for error in errors:
venv/lib/python3.10/site-packages/jsonschema/_validators.py:321: in type
    if not any(validator.is_type(instance, type) for type in types):
venv/lib/python3.10/site-packages/jsonschema/_validators.py:321: in <genexpr>
    if not any(validator.is_type(instance, type) for type in types):
venv/lib/python3.10/site-packages/jsonschema/validators.py:320: in is_type
    raise exceptions.UnknownType(type, instance, self.schema)
E   jsonschema.exceptions.UnknownType: Unknown type 'decimal.Decimal' for validator with schema:
E       {'type': 'decimal.Decimal'}
E   
E   While checking instance:
E       10.11

@Julian
Copy link
Member

Julian commented Jan 30, 2023

Do the docs here happen to help? The example there is for number, but the example is essentially the same for changing what number means instead -- if not happy to improve them (or even better to type out an example and hope for a PR :)!)

@Julian
Copy link
Member

Julian commented Jan 30, 2023

Or I guess it sounds like you want info on how to do this with the json module, in which case you want the parse_float option to json.loads e.g. json.loads(..., parse_float=decimal.Decimal).

@danner26
Copy link

@Julian yes, we are using the json module in our tests along with jsonschema. I havent worked with jsonschema before this, so thank you for the assistance

I seem to still be having the same issues here. I changed the json.loads to json.loads(schema_file.read(), parse_float=decimal.Decimal)
I have also updated the definition to:

"value": {
                    "type": "number",
                    "minimum": 0,
                    "multipleOf": 0.01
                },

Using a value of 10.11 I am still getting:

10.11 is not a multiple of 0.01

Failed validating 'multipleOf' in schema['properties']['weight']['items']['properties']['value']:
    {'minimum': 0, 'multipleOf': 0.01, 'type': 'number'}

On instance['weight'][0]['value']:
    10.11

During handling of the above exception, another exception occurred:
device-types/3Com/4200G-12_Port.yml failed validation: 10.11 is not a multiple of 0.01

Failed validating 'multipleOf' in schema['properties']['weight']['items']['properties']['value']:
    {'minimum': 0, 'multipleOf': 0.01, 'type': 'number'}

On instance['weight'][0]['value']:
    10.11

Do I need to also create the custom validator as well? I am sorry if this is a simple issue that I am overlooking, I appreciate the input.

@Julian
Copy link
Member

Julian commented Jan 30, 2023

That should be it -- the exception there might make it seem you may not have saved perhaps? Or not have loaded both floats as decimals?

But e.g. here's what should be a fuller example:

import decimal
import json
schema = json.loads("""{"type": "number", "minimum": 0, "multipleOf": 0.01}""", parse_float=decimal.Decimal)
good = json.loads("10.11", parse_float=decimal.Decimal)
bad = json.loads("10.115", parse_float=decimal.Decimal)
jsonschema.validate(good, schema)  # fine

vs

jsonschema.validate(bad, schema)

->

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/julian/Development/jsonschema/jsonschema/validators.py", line 1121, in validate
    raise error
jsonschema.exceptions.ValidationError: Decimal('10.115') is not a multiple of 0.01

Failed validating 'multipleOf' in schema:
    {'minimum': 0, 'multipleOf': Decimal('0.01'), 'type': 'number'}

On instance:
    Decimal('10.115')

Decimal('10.115') is not a multiple of 0.01

Failed validating 'multipleOf' in schema:
    {'minimum': 0, 'multipleOf': Decimal('0.01'), 'type': 'number'}

On instance:
    Decimal('10.115')

@danner26
Copy link

Still getting the same thing, I will have to dig into this further. Thanks for all of your help so far

@danner26
Copy link

danner26 commented Jan 30, 2023

So @Julian when checking my RefResolver schema (resolver = RefResolver(f'file://{os.getcwd()}/schema/devicetype.json', schema)), the part in question is 'weight': {'type': 'array', 'items': {'$ref': 'components.json#/definitions/weight'} which evaluated to:

"weight": {
            "type": "object",
            "properties": {
                "value": {
                    "type": "number",
                    "minimum": 0,
                    "multipleOf": 0.01
                },
                "unit": {
                    "type": "string",
                    "enum": [
                        "kg",
                        "g",
                        "lb",
                        "oz"
                    ]
                }
            },
            "required": ["value", "unit"]
        }

Then it fails at Draft4Validator(schema, resolver=resolver).validate(definition) where the definition for the part in question evaluates to 'weight': [{'value': 10.11, 'unit': 'kg'}]

Any ideas what I might have wrong here? Is it because we are using Draft4Validator?

@danner26
Copy link

I am also seeing these errors now:

tests/definitions_test.py:84: in test_definitions
    Draft4Validator(schema, resolver=resolver).validate(definition)
/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/site-packages/jsonschema/validators.py:313: in validate
    for error in self.iter_errors(*args, **kwargs):
/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/site-packages/jsonschema/validators.py:288: in iter_errors
    for error in errors:
/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/site-packages/jsonschema/_validators.py:332: in properties
    yield from validator.descend(
/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/site-packages/jsonschema/validators.py:305: in descend
    for error in self.evolve(schema=schema).iter_errors(instance):
/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/site-packages/jsonschema/validators.py:288: in iter_errors
    for error in errors:
/opt/hostedtoolcache/Python/3.8.16/x64/lib/python3.8/site-packages/jsonschema/_validators.py:202: in multipleOf
    failed = instance % dB
E   TypeError: unsupported operand type(s) for %: 'float' and 'decimal.Decimal'

Which seems to lead back to

failed = instance % dB

Does this warrant a new issue?

@Julian
Copy link
Member

Julian commented Jan 31, 2023

That error means you've now properly loaded your schema with decimals, but not your instance (the thing you're validating). As I mentioned, you need to make sure both of them are decimals -- I'm not sure where you're getting your instance from though so I can't point out the error, but you need to load your instance JSON with the same parse_float option as I showed in the example.

@jsenecal
Copy link

That error means you've now properly loaded your schema with decimals, but not your instance (the thing you're validating). As I mentioned, you need to make sure both of them are decimals -- I'm not sure where you're getting your instance from though so I can't point out the error, but you need to load your instance JSON with the same parse_float option as I showed in the example.

@Julian So, turns out that loading the schema with parse_float doesn't help when loading references to other files. A custom handler needs to be defined. Please take a look at https://github.com/netbox-community/devicetype-library/pull/1073/files for my take on fixing this issue for us.

Maybe you could shine some light if I overengineered something there or maybe improve the docs based on our troubles :D

@Julian
Copy link
Member

Julian commented Jan 31, 2023

Ah I see, sorry, I had it reversed then, your instance was fine but not your schema -- I didn't realize you were saying you were loading the schema from behind a $ref -- what you have there looks more or less fine, you can do it slightly more easily by using store rather than handlers -- which basically is just a dictionary with schemas -- so something like RefResolver(..., store={"file://whatever/path/schema.json": json.loads(Path("whatever/path/schema.json").read_text(), parse_float=decimal.Decimal)) which just allows you to do the loading yourself (and pass that extra arg), but otherwise is essentially equivalent to what you have for your case.

And definitely the docs could use more examples here but sometime soon (maybe in the next month or so) there'll be an even easier way to do this once support for the new referencing library lands which essentially will supercede RefResolver, so lots will get easier then!

Feedback definitely still appreciated though!

@danner26
Copy link

Thanks for all of your help @Julian

danner26 added a commit to netbox-community/devicetype-library that referenced this issue Feb 2, 2023
* Removed mgmt_only: false, removed extra newlines

* Testing a fix for the multipleOf issue using info in python-jsonschema/jsonschema#1039

* Revert "Testing a fix for the multipleOf issue using info in python-jsonschema/jsonschema#1039"

This reverts commit ea7edcc.

* removing device-types/Dell/Unity_XT_DAE.yaml - there are multiple different models of these disk array enclosures. with the possible addition of storage management we need the actual part/models rather than a generalized item

* Renamed Dell PowerSwitches to have the proper name
Added data sheet/spec sheet links when available

* Added data sheet links to comment

* Update NX3240 to match proper name
Standardized slug

* Standardized names and slugs
Added spec sheets where available
mjpereira602 pushed a commit to mjpereira602/devicetype-library that referenced this issue Feb 22, 2023
* Removed mgmt_only: false, removed extra newlines

* Testing a fix for the multipleOf issue using info in python-jsonschema/jsonschema#1039

* Revert "Testing a fix for the multipleOf issue using info in python-jsonschema/jsonschema#1039"

This reverts commit ea7edcc.

* removing device-types/Dell/Unity_XT_DAE.yaml - there are multiple different models of these disk array enclosures. with the possible addition of storage management we need the actual part/models rather than a generalized item

* Renamed Dell PowerSwitches to have the proper name
Added data sheet/spec sheet links when available

* Added data sheet links to comment

* Update NX3240 to match proper name
Standardized slug

* Standardized names and slugs
Added spec sheets where available
mjpereira602 pushed a commit to mjpereira602/devicetype-library that referenced this issue Mar 1, 2023
* Removed mgmt_only: false, removed extra newlines

* Testing a fix for the multipleOf issue using info in python-jsonschema/jsonschema#1039

* Revert "Testing a fix for the multipleOf issue using info in python-jsonschema/jsonschema#1039"

This reverts commit ea7edcc.

* removing device-types/Dell/Unity_XT_DAE.yaml - there are multiple different models of these disk array enclosures. with the possible addition of storage management we need the actual part/models rather than a generalized item

* Renamed Dell PowerSwitches to have the proper name
Added data sheet/spec sheet links when available

* Added data sheet links to comment

* Update NX3240 to match proper name
Standardized slug

* Standardized names and slugs
Added spec sheets where available
etherwrangler pushed a commit to etherwrangler/devicetype-library that referenced this issue Jul 16, 2023
* Removed mgmt_only: false, removed extra newlines

* Testing a fix for the multipleOf issue using info in python-jsonschema/jsonschema#1039

* Revert "Testing a fix for the multipleOf issue using info in python-jsonschema/jsonschema#1039"

This reverts commit ea7edcc.

* removing device-types/Dell/Unity_XT_DAE.yaml - there are multiple different models of these disk array enclosures. with the possible addition of storage management we need the actual part/models rather than a generalized item

* Renamed Dell PowerSwitches to have the proper name
Added data sheet/spec sheet links when available

* Added data sheet links to comment

* Update NX3240 to match proper name
Standardized slug

* Standardized names and slugs
Added spec sheets where available
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants