-
-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature request: Add more information to JsonSchemaException #37
Comments
Good idea! Probably when I will be doing #36, I will do this one with it. :-) |
I've been noodling on how to do this, as i've had to do some extensive customization on other jsonschema libraries to get human readable (aka end user) translation. i think some of this is application specific, but definitely some additional information in the exception would be helpful. Unfortunately for true customization towards human readable errors (ie the app specific kind), the only way i can see in the existing setup to get there is to catch the exception and use schema knowledge in conjunction with stack walking to get to something. ie typically i use absolute path (array) from the top of the document to the item that error'd to get to the most user usable context. There's some consideration here of how the library validates that also causes concern, because it effectively does a for loop with try/except over methods, we end up losing the specific error matching error, for a more generic one from the enclosing block, i'm not sure how to resolve that. |
thought about it a bit more, two different facilities i think are needed, one is adding a context object to the downward call tree, or some mechanism of capturing the hierarchy for leaf callers to use when raising an exception. the other is at a given stack/level evaluation capturing the current for loop/if/else chain errors in a transient var thats also captured in a fall off the end of the list/chain exception. for best guess on error (app independent) its probably a view to the deepest chained exception, and for app specific this carries enough context in the exception to avoid the stack walking/foraging in an except handler. Curious for feedback on this or other possible mechanisms. fwiw, here's an example i do with another lib to enrich with semantics for end user, its fairly tightly coupled, but still indicative of determining best human message with additional schema knowledge across a set of leaf node errors. https://github.com/cloud-custodian/cloud-custodian/blob/master/c7n/schema.py#L93 |
I don't follow your thoughts much. Anyway, my idea is to simply add some context to the exception so you can do something reasonable with it. Something like: try:
validate(data)
except JsonSchemaException as exc:
assert exc.rule == "maximum"
assert exc.rule_definition == 24
assert exc.path == "property.subproperty[2]"
# or maybe ["property", "subproperty", "2"], maybe both
assert exc.value == 42
assert exc.message == "property.subproperty[2] must be smaller than 24" |
i think adding in the additional path information would be useful, but i don't think it really addresses some of the use cases. at the moment fast jsonschema when it hits an anyOf, allOf, or oneOf will effectively drop leaf exceptions and generate a new one at the container schema element. i'm generally working with large schemas (1.2mb) and small documents (1-2k), the schema has a lot of elements under am anyOf block, getting a useful error message to a human requires walking/evaluating the leaf error messages (by schema tree leaf depth) against the schema to give something for a human to consume that they can correct their data. so coming up with a simple example https://gist.github.com/kapilt/0fe1116df1cec10921d23559de5cae7d the default error raised by fastjsonschema on trunk against this sample data is simply
the additional attributes that were added to the exception, unfortunately aren't useful in presenting to the user why specifically there was a failure because it lacks the context of any the previously caught exceptions. (Pdb) dir(e)
['__cause__', '__class__', '__context__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__suppress_context__', '__traceback__', '__weakref__', 'args', 'definition', 'message', 'name', 'path', 'rule', 'rule_definition', 'value', 'with_traceback']
(Pdb) e.rule
'anyOf'
(Pdb) e.path
['data', 'policies', '0']
(Pdb) e.message
'data.policies[0] must be valid by one of anyOf definition'
(Pdb) e.value
{'name': 'team-tag-ebs-snapshot-audit', 'resource': 'ebs-snapshot', 'description': 'Cloud Custodian EBS Snapshot Team Tag Audit\n', 'comments': 'Copy EBS volume tag with key "Team" to EBS snapshots\n', 'filters': [{'or': [{'tag:Team': 'empty'}, {'tag:Team': 'absent'}]}], 'actions': [{'type': 'copy-related-tag', 'resource': 'ebs', 'skip_missing': True, 'key': 'VolumeId', 'tags': 'Team'}]} while the default jsonschema error output is effectively similiar, the exception has enough context information from the leaf node exceptions that we're able to produce a relevant human semantic error message of
|
@kapilt What about properties |
rule and rule_definition are helpful for leaf schema elements to capture why. for container schema elements, their not particularly helpful. ie in the example code above, looking at rule definition simply provides a list of all underlying schema container elements without providing any context of which one failed to validate. of course of all of them did, but understanding which one of the child elements got closest to matching is key to presenting something useful to the user.
rule_definition is actually slightly problematic for other reasons on large container elements as the schema gets duplicated several times via the exception blocks. for our schema (1.2mb) the generated code size when from 4mb to 8mb with this change. I think it would be better to store the schema once in the generated code and use an expression (jmespath/jsonpath) pointer to the data structure, similar to what we do with the data being validated (ie e.path), else we have significant data duplication in the generated code. I've explored modifying the generator on container schema elements (oneOf, anyOf, allOf) to collect leaf exceptions and propagate them in container element exceptions, which allows for a depth search to the most meaningful exception node. The project I'm working on is open source (GitHub.com/cloud-custodian/cloud-custodian) and this is prelim exploration as we're waiting till we can drop 2.7 support at end of year, till we can switch to fastjsonschema, but coming up with a way to provide good error messages to users is a key consideration. $ git diff
diff --git a/fastjsonschema/draft04.py b/fastjsonschema/draft04.py
index 824a870..f7dc50b 100644
--- a/fastjsonschema/draft04.py
+++ b/fastjsonschema/draft04.py
@@ -143,16 +143,17 @@ class CodeGeneratorDraft04(CodeGenerator):
Valid values for this definition are 3, 4, 5, 10, 11, ... but not 8 for example.
"""
self.l('{variable}_any_of_count = 0')
+ self.l('{variable}_leaf_exceptions = []')
for definition_item in self._definition['anyOf']:
# When we know it's passing (at least once), we do not need to do another expensive try-except.
with self.l('if not {variable}_any_of_count:', optimize=False):
with self.l('try:'):
self.generate_func_code_block(definition_item, self._variable, self._variable_name, clear_variables=True)
self.l('{variable}_any_of_count += 1')
- self.l('except JsonSchemaException: pass')
+ self.l('except JsonSchemaException as e: {variable}_leaf_exceptions.append(e)')
with self.l('if not {variable}_any_of_count:', optimize=False):
- self.exc('{name} must be valid by one of anyOf definition', rule='anyOf')
+ self.exc('{name} must be valid by one of anyOf definition', rule='anyOf', exceptions='%s_leaf_exceptions' % self._variable)
def generate_one_of(self):
"""
diff --git a/fastjsonschema/exceptions.py b/fastjsonschema/exceptions.py
index b555356..200a4cc 100644
--- a/fastjsonschema/exceptions.py
+++ b/fastjsonschema/exceptions.py
@@ -20,13 +20,14 @@ class JsonSchemaException(ValueError):
Added all extra properties.
"""
- def __init__(self, message, value=None, name=None, definition=None, rule=None):
+ def __init__(self, message, value=None, name=None, definition=None, rule=None, exceptions=()):
super().__init__(message)
self.message = message
self.value = value
self.name = name
self.definition = definition
self.rule = rule
+ self.exceptions = exceptions
@property
def path(self):
diff --git a/fastjsonschema/generator.py b/fastjsonschema/generator.py
index c5916c5..f47d526 100644
--- a/fastjsonschema/generator.py
+++ b/fastjsonschema/generator.py
@@ -239,11 +239,11 @@ class CodeGenerator:
"""
return str(string).replace('"', '\\"')
- def exc(self, msg, *args, rule=None):
+ def exc(self, msg, *args, rule=None, exceptions=None):
"""
"""
- msg = 'raise JsonSchemaException("'+msg+'", value={variable}, name="{name}", definition={definition}, rule={rule})'
- self.l(msg, *args, definition=repr(self._definition), rule=repr(rule))
+ msg = 'raise JsonSchemaException("'+msg+'", value={variable}, name="{name}", definition={definition}, rule={rule}, exceptions={exceptions})'
+ self.l(msg, *args, definition=repr(self._definition), rule=repr(rule), exceptions=exceptions)
def create_variable_with_length(self):
""" |
Thanks for the tip. Something have to be definitely done, there is another issue #72. Unfortunately I cannot guarantee I will make it done till end of the year. |
It would be useful (I would say even necessary) to put more information in JsonSchemaException. Message generated by the library is rarely fit to be displayed to the end user, and there is also a matter or i18n. It should be possible to identify the part of the document that was deemed invalid, corresponding part of the schema and the name of validation rule which raised the exception.
The text was updated successfully, but these errors were encountered: