Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should unknown keywords be collected as annotations? #698

Closed
handrews opened this issue Dec 18, 2018 · 11 comments · Fixed by #987
Closed

Should unknown keywords be collected as annotations? #698

handrews opened this issue Dec 18, 2018 · 11 comments · Fixed by #987

Comments

@handrews
Copy link
Contributor

Assuming you support annotations (which is related to what output format(s) you support, now that I think of it), should unknown keywords be collected as annotations instead of ignored?

I think we will see more annotations (we've had several requested that don't clearly meet the significance criteria to be added to the standard), and it's an easy default behavior to add.

I don't feel strongly about this, just wondering if this is a sensible default option.

@gregsdennis
Copy link
Member

If this is the direction you want to go, then why have annotation keywords at all?

@pboettch
Copy link

pboettch commented Dec 19, 2018

(removed for clearity)I just saw that there is #687 ... My bad.

@handrews
Copy link
Contributor Author

@gregsdennis I don't understand the objection. Annotations are keywords with a certain behavior, specifically a "safe" behavior that lets an application decide what to do with them.

The specific annotation keywords include rules for how to combine multiple occurrences attached to the same instance location:

  • A location is readOnly if any subschema includes "readOnly": true
  • Same for writeOnly
  • links has extensive complex processing through its sub-keywords
  • default, title, description, and examples do not have any special behavior, and mostly exist (like $defs and $comment) to reserve a standard set of keywords for common behavior

So, if we say to treat unknown keywords as annotations, where the values are all collected by the default mechanism (keep a set of values for each instance location), then all the standard really does with default, title, description, and examples is set their syntax and semantics, on top of that default behavior.

I'm not pushing strongly for this, but it does seem nice that if I throw in a "foo" keyword, at least I'll see it in the annotation output which will often be all I need. Example: I may use an opaque keyword to indicate that a string (URI or otherwise) in an API is intended to be opaque, and therefore clients cannot rely on its internal structure.

I could make a vocabulary including that keyword, and if I want interoperability, or to make opaque automatically get logically ORed like readOnly, then I MUST do so. But if I don't care about interoperability and don't mind doing the OR-ing myself out of the returned output structure, then having a default behavior is easy.

If no one else particularly wants this I will drop it or punt it out to the future.

@gregsdennis
Copy link
Member

So the benefit to having annotation keywords is that it constrains the acceptable values for them (e.g. objects aren't allowed in title), not that it produces an annotation.

That's fair. Just wanted the clarity.

@handrews
Copy link
Contributor Author

That is not how I look at it at all, but I guess the outcome is the same? I really don't know what to say here.

@plankthom
Copy link

I would say that it is an optional feature of a parser/validator implementation that should be configurable by the user.
If supported, a user can configure upfront any additional keywords that need to be collected as annotation, maybe with a reducer strategy (remove duplicates, ordering, first-by-priority, boolean aggregation, uniqueness per location/schema fragment/schema path) that is supported by (or plugged into) the parser/validator.

The parser/validator may by default collect all unknown keywords, but if it does, the user must be allowed to disable it.
The parser/validator would probably reuse the infrastructure for generic support for vocabularies (e.g. as a runtime implicit vocabulary, with reducer mapping)

@handrews
Copy link
Contributor Author

@gregsdennis @plankthom I think it sounds like we should see what feedback we get on vocabularies before we try to specify default behavior. I'm punting this out of draft-08.

@handrews handrews modified the milestones: draft-08, draft-future Dec 31, 2018
@shusson
Copy link

shusson commented Feb 1, 2019

I can see generic annotations being useful for applications that need to generate different outputs. I'm imagining using the annotations as part of the generation logic, e.g this property should not be generated for a particular output.

@handrews
Copy link
Contributor Author

@shusson yes, annotations are particularly useful for generative use cases. And we expect a lot of extension vocabulary work to be done for generative cases, as support for that is currently weak.

So setting a default behavior of treating things as annotations would make such vocabularies work without needing a custom plugin.

Granted, generative tooling will read the schema without an instance so, but there are also uses of annotations with instances (e.g. the way most people end up using format since validation support is so unreliable).

@Relequestual
Copy link
Member

I think we can RECOMMEND that unknown keywords are collected as annotations, and applications SHOULD provide a means to turn it on or off if they do collect annotations for unknown keywords.

@handrews
Copy link
Contributor Author

Yeah, this makes sense. I'd like to RECOMMEND that it be on by default, but perhaps allow that an implementation MAY decline to collect a very large value? I think the main concern is that an unknown applicator being treated as an annotation could cause the output to balloon to problematic sizes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

Successfully merging a pull request may close this issue.

6 participants