Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes the currently failing json-schema-test-suite tests #370

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions json/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -155,6 +155,7 @@ for more information.
### PostgreSQL ###

* [postgres-json-schema](https://github.com/gavinwahl/postgres-json-schema)
* [is_jsonb_valid](https://github.com/furstenheim/is_jsonb_valid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna just merge this immediately (the squash of the upstream changes)


If you use it as well, please fork and send a pull request adding yourself to
the list :).
Expand Down
77 changes: 76 additions & 1 deletion json/tests/draft4/optional/format.json
Original file line number Diff line number Diff line change
Expand Up @@ -25,15 +25,80 @@
"schema": {"format": "uri"},
"tests": [
{
"description": "a valid URI",
"description": "a valid URL with anchor tag",
"data": "http://foo.bar/?baz=qux#quux",
"valid": true
},
{
"description": "a valid URL with anchor tag and parantheses",
"data": "http://foo.com/blah_(wikipedia)_blah#cite-1",
"valid": true
},
{
"description": "a valid URL with URL-encoded stuff",
"data": "http://foo.bar/?q=Test%20URL-encoded%20stuff",
"valid": true
},
{
"description": "a valid puny-coded URL ",
"data": "http://xn--nw2a.xn--j6w193g/",
"valid": true
},
{
"description": "a valid URL with many special characters",
"data": "http://-.~_!$&'()*+,;=:%40:80%2f::::::@example.com",
"valid": true
},
{
"description": "a valid URL based on IPv4",
"data": "http://223.255.255.254",
"valid": true
},
{
"description": "a valid URL with ftp scheme",
"data": "ftp://ftp.is.co.za/rfc/rfc1808.txt",
"valid": true
},
{
"description": "a valid URL for a simple text file",
"data": "http://www.ietf.org/rfc/rfc2396.txt",
"valid": true
},
{
"description": "a valid URL ",
"data": "ldap://[2001:db8::7]/c=GB?objectClass?one",
"valid": true
},
{
"description": "a valid mailto URI",
"data": "mailto:[email protected]",
"valid": true
},
{
"description": "a valid newsgroup URI",
"data": "news:comp.infosystems.www.servers.unix",
"valid": true
},
{
"description": "a valid tel URI",
"data": "tel:+1-816-555-1212",
"valid": true
},
{
"description": "a valid URN",
"data": "urn:oasis:names:specification:docbook:dtd:xml:4.1.2",
"valid": true
},
{
"description": "an invalid protocol-relative URI Reference",
"data": "//foo.bar/?baz=qux#quux",
"valid": false
},
{
"description": "an invalid relative URI Reference",
"data": "/abc",
"valid": false
},
{
"description": "an invalid URI",
"data": "\\\\WINDOWS\\fileshare",
Expand All @@ -43,6 +108,16 @@
"description": "an invalid URI though valid URI reference",
"data": "abc",
"valid": false
},
{
"description": "an invalid URI with spaces",
"data": "http:// shouldfail.com",
"valid": false
},
{
"description": "an invalid URI with spaces and missing scheme",
"data": ":// should fail",
"valid": false
}
]
},
Expand Down
21 changes: 21 additions & 0 deletions json/tests/draft6/const.json
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,27 @@
}
]
},
{
"description": "const with array",
"schema": {"const": [{ "foo": "bar" }]},
"tests": [
{
"description": "same array is valid",
"data": [{"foo": "bar"}],
"valid": true
},
{
"description": "another array item is invalid",
"data": [2],
"valid": false
},
{
"description": "array with additional items is invalid",
"data": [1, 2, 3],
"valid": false
}
]
},
{
"description": "const with null",
"schema": {"const": null},
Expand Down
72 changes: 71 additions & 1 deletion json/tests/draft6/optional/format.json
Original file line number Diff line number Diff line change
Expand Up @@ -25,10 +25,70 @@
"schema": {"format": "uri"},
"tests": [
{
"description": "a valid URI",
"description": "a valid URL with anchor tag",
"data": "http://foo.bar/?baz=qux#quux",
"valid": true
},
{
"description": "a valid URL with anchor tag and parantheses",
"data": "http://foo.com/blah_(wikipedia)_blah#cite-1",
"valid": true
},
{
"description": "a valid URL with URL-encoded stuff",
"data": "http://foo.bar/?q=Test%20URL-encoded%20stuff",
"valid": true
},
{
"description": "a valid puny-coded URL ",
"data": "http://xn--nw2a.xn--j6w193g/",
"valid": true
},
{
"description": "a valid URL with many special characters",
"data": "http://-.~_!$&'()*+,;=:%40:80%2f::::::@example.com",
"valid": true
},
{
"description": "a valid URL based on IPv4",
"data": "http://223.255.255.254",
"valid": true
},
{
"description": "a valid URL with ftp scheme",
"data": "ftp://ftp.is.co.za/rfc/rfc1808.txt",
"valid": true
},
{
"description": "a valid URL for a simple text file",
"data": "http://www.ietf.org/rfc/rfc2396.txt",
"valid": true
},
{
"description": "a valid URL ",
"data": "ldap://[2001:db8::7]/c=GB?objectClass?one",
"valid": true
},
{
"description": "a valid mailto URI",
"data": "mailto:[email protected]",
"valid": true
},
{
"description": "a valid newsgroup URI",
"data": "news:comp.infosystems.www.servers.unix",
"valid": true
},
{
"description": "a valid tel URI",
"data": "tel:+1-816-555-1212",
"valid": true
},
{
"description": "a valid URN",
"data": "urn:oasis:names:specification:docbook:dtd:xml:4.1.2",
"valid": true
},
{
"description": "an invalid protocol-relative URI Reference",
"data": "//foo.bar/?baz=qux#quux",
Expand All @@ -48,6 +108,16 @@
"description": "an invalid URI though valid URI reference",
"data": "abc",
"valid": false
},
{
"description": "an invalid URI with spaces",
"data": "http:// shouldfail.com",
"valid": false
},
{
"description": "an invalid URI with spaces and missing scheme",
"data": ":// should fail",
"valid": false
}
]
},
Expand Down
9 changes: 9 additions & 0 deletions jsonschema/_validators.py
Original file line number Diff line number Diff line change
Expand Up @@ -391,6 +391,15 @@ def type(validator, types, instance, schema):
yield ValidationError(_utils.types_msg(instance, types))


def type_draft6(validator, types, instance, schema):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too I think draft 6 taught us we need to widen the type interface. I put some notes in #364, maybe have a look there if you don't mind?

# Draft 6 changes integer type to include floats with no fractional part
if isinstance(instance, float) and "integer" in types and instance.is_integer():
return

for e in type(validator, types, instance, schema):
yield e


def properties(validator, properties, instance, schema):
if not validator.is_type(instance, "object"):
return
Expand Down
33 changes: 19 additions & 14 deletions jsonschema/validators.py
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@ def _validates(cls):
return _validates


def create(meta_schema, validators=(), version=None, default_types=None):
def create(meta_schema, validators=(), version=None, default_types=None, scope_key="id"):
Copy link
Member

@Julian Julian Nov 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this makes me a bit uneasy (which is why I didn't do it immediately myself, I was trying to convince myself what the right thing was...).

I think to be safe (future-proof) we need to make the interface a bit wider -- i.e., rather than just specifying the name of a key to use for scoping, we should provide something like a function here that the validator will use to detect scope (and yeah to make the current implementations just be lookups). I'm worried about the potential for some more complicated process, and once we already need an argument to do it, might as well have it be a generic one. Lemme know if that makes sense.

We also need to have direct tests for this (i.e. separate from any specific draft support).

Copy link
Contributor Author

@bsmithers bsmithers Nov 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lemme know if that makes sense.

You mean specifying a validator for id or $id to handle the scope change?

I did consider that, however I wasn't sure the corresponding architectural change would be nice and wondered if a simpler solution is better at least until such time as there are future differences. But I see your point.

We also need to have direct tests for this (i.e. separate from any specific draft support).

Agreed on that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be slightly narrower than a validator -- basically it is a function to replace the calls just to someschema.get("id") -- so, scope_of(schema), say.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And really, yeah this is tightly related to #346 unfortunately, which is to say, I think the even better thing is to come up with the right interface for that, provide the RefResolver class you want to use to this function, and have a function on that interface that is "scope of". I just can't decide whether it's obvious enough how that looks quite yet, and how much work it is to change it :/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provide the RefResolver class you want to use to this function

That was kinda where my initial thought was with this too. Unless I'm misunderstood, my concern is that a user may specify a RefResolver instance when calling .validate() -- this would mean a user is then specifying the draft number twice, i.e. Draft4Validator and Draft4RefResovler, which seemed unpleasant? I'm not up to speed enough to determine if that's a narrow enough issue though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right so I think the idea there would be to stash what is passed in here as the resolver callable onto the validator interface -- i.e., the newer way to construct resolvers would become DraftSomethingValidator(resolver=DraftSomethingValidator.Resolver(...))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright well I think I've got a bit of a better picture of the direction you'd like this to go in. I'll try and take a look at the related issues and then see if I can work towards something. Of course I shan't be offended if you're actively working on these things and prefer effort elsewhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideas I have, time, much less so :(, especially given that the ideas are often only partially baked and need refinement or experimentation to see whether they fully fit -- so definitely definitely would appreciate the help!

I think the one thing that seems most clear at the moment is the stuff I mentioned about type, so I might recommend that one as the first self-contained area to poke at -- widen the type interface, get that merged, and then come back here and wiggle this onto the new one.

But there are many many paths forward and yeah I super appreciate any help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've taken a look at the type stuff [1] - it's WIP and not yet ready for a PR, but if you have the time to give some feedback before I take it further it'd be really appreciated.

In particular:

  • Rather than the single is_type injected function discussed in WIP: Support namedtuple validation #365, I've taken the approach of a function per type. I figure this is more composable for extensions
  • I presume a need to maintain backwards compatibility with the old default_types and per-instance types properties. I believe this does so (tests using this pass) - but ideas appreciated
  • I feel this works well for 'simple' types, for example handling the change in integer in draft6 and e.g. a hypothetical scenario where strings encoding ints are valid under integer [2]. However, some difficulty remains for something like Support namedtuple #364 to extend object to allow namedtuple - a user would have to provide custom validators for all object-based checks, e.g. in my test case [3]. Wether or not this is an acceptable level of work for a custom type or we should look to facilitate this I'm not sure.

[1] draft6...bsmithers:types
[2] https://github.com/bsmithers/jsonschema/blob/types/jsonschema/tests/test_types.py#L56
[3] https://github.com/bsmithers/jsonschema/blob/types/jsonschema/tests/test_types.py#L95

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So! I think that looks like a great start -- if I can inject one idea there (which is only 80% baked I think, so lemme know if I go off a rail) -- I think the right thing here is to take a cue from what's happening on the ref resolver and format checker sides -- to introduce a TypeChecker class, one that you associate an instance of with a validator class in create and extend.

The one major mistake I'd love to correct would be to not have mutable APIs for that -- so, to talk about the specific namedtuple case, the API I'd expect would look roughly like:

def object_or_namedtuple(value):
    return Draft4Validator.type_checker.is_type("object") or getattr(value, "_replace", None) is not None

say, and then Draft4Validator.type_checker.redefine(object=object_or_namedtuple) which should return a new instance (which you'd use via Draft4Validator(type_checker=that) where the appropriate redefinition has happened (attr.evolve + a pyrsistent.pmap of checkers are the pieces we'd use to make that happen).

You can then also have type_checker.redefine(**types) to convert from the old format, and I think the above does seem like a reasonable amount of work to do the namedtuple case.

How does that sound to you (and as usual apologies for spitting information out slowly here after you've already gotten started :), as usual with these things what seems right only starts to come out and evolve after progress has begun).

if default_types is None:
default_types = {
u"array": list, u"boolean": bool, u"integer": int_types,
Expand All @@ -68,6 +68,7 @@ class Validator(object):
VALIDATORS = dict(validators)
META_SCHEMA = dict(meta_schema)
DEFAULT_TYPES = dict(default_types)
SCOPE_KEY = scope_key

def __init__(
self, schema, types=(), resolver=None, format_checker=None,
Expand All @@ -76,12 +77,13 @@ def __init__(
self._types.update(types)

if resolver is None:
_schema = schema
if schema is True:
resolver = RefResolver.from_schema({})
_schema = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is that now you will mis-report schemas.

One major gap in draft 6 support is writing some tests for the new schema coersion behavior (of bools) -- specifically, this will misreport what the schema was in ValidationError.schema.

elif schema is False:
resolver = RefResolver.from_schema({"not": {}})
else:
resolver = RefResolver.from_schema(schema)
_schema = {"not": {}}

resolver = RefResolver(_schema.get(scope_key, u""), _schema)

self.resolver = resolver
self.format_checker = format_checker
Expand All @@ -101,7 +103,8 @@ def iter_errors(self, instance, _schema=None):
elif _schema is False:
_schema = {"not": {}}

scope = _schema.get(u"$id")
scope = _schema.get(scope_key)

if scope:
self.resolver.push_scope(scope)
try:
Expand Down Expand Up @@ -149,11 +152,8 @@ def is_type(self, instance, type):
raise UnknownType(type, instance, self.schema)
pytypes = self._types[type]

# FIXME: draft < 6
if isinstance(instance, float) and type == "integer":
return instance.is_integer()
# bool inherits from int, so ensure bools aren't reported as ints
elif isinstance(instance, bool):
if isinstance(instance, bool):
pytypes = _utils.flatten(pytypes)
is_number = any(
issubclass(pytype, numbers.Number) for pytype in pytypes
Expand Down Expand Up @@ -181,6 +181,7 @@ def extend(validator, validators, version=None):
validators=all_validators,
version=version,
default_types=validator.DEFAULT_TYPES,
scope_key=validator.SCOPE_KEY
)


Expand Down Expand Up @@ -279,10 +280,11 @@ def extend(validator, validators, version=None):
u"properties": _validators.properties,
u"propertyNames": _validators.propertyNames,
u"required": _validators.required,
u"type": _validators.type,
u"type": _validators.type_draft6,
u"uniqueItems": _validators.uniqueItems,
},
version="draft6",
scope_key="$id",
)


Expand Down Expand Up @@ -356,7 +358,7 @@ def __init__(
self._remote_cache = remote_cache

@classmethod
def from_schema(cls, schema, *args, **kwargs):
def from_schema(cls, schema, scope_key="id", *args, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this also makes me uneasy -- up until now, ref resolvers were completely separate from drafts (because the behavior was the same across them).

I don't love this idea -- now a caller needs to know implementation details of the draft they're using.

I'd much rather encapsulate that, even if that means that we now need to have separate ref resolvers per draft (and we'd also need to preserve exactly the old interface for a class called just RefResolver). Annoying, but I think the whole change is annoying :/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this also makes me uneasy -- up until now, ref resolvers were completely separate from drafts (because the behavior was the same across them).

The problem is that they're not, because that from_schema method needs to understand id. My initial thought was actually to drop that method completely, but of course that's extremely unpleasant for users in the wild.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right but what I mean is -- as far as RefResolver is concerned, when presented with any schema, there is an algorithm you can use to unambiguously know this (but now this introduces the possibility of error).

Even now, you have this algorithm:

  • Look for a $schema key
  • If you find one, you (RefResolver) know what the correct scope key is
  • If not, you use the "package" default (which I'm not too happy with how we're managing currently :/, but it's a thing)

You can take that algorithm, and layer it on top of an "explicit" one -- say a Draft3Draft4RefResolver and a Draft6RefResolver alongside a resolver_for_schema that calls the right thing as above.

(I know I'm being terse still, lemme know if that's unclear)

"""
Construct a resolver from a JSON schema object.

Expand All @@ -366,13 +368,17 @@ def from_schema(cls, schema, *args, **kwargs):

the referring schema

scope_key:

the attribute used to adjust the scope

Returns:

:class:`RefResolver`

"""

return cls(schema.get(u"$id", u""), schema, *args, **kwargs)
return cls(schema.get(scope_key, u""), schema, *args, **kwargs)

def push_scope(self, scope):
self._scopes_stack.append(
Expand Down Expand Up @@ -458,7 +464,6 @@ def resolve_fragment(self, document, fragment):
a URI fragment to resolve within it

"""

fragment = fragment.lstrip(u"/")
parts = unquote(fragment).split(u"/") if fragment else []

Expand Down