SchemaComparator #157
…property checking
Let's start with the simplest thing that works, and then we can refine :) I think for a start a very simplistic matching of schemas listed in
 *
 * Determining compatibility (or incompatibility) of arbitrary schemas with certainty is non-trivial, or outright
 * impossible in general. For this reason, this method works in a "best effort" manner, assuming that the schemas
 * match one of the typical schema patterns generated by frameworks like `tapir`. In more complex situations,
libraries! ;)
pardon
apispec-model/src/main/scala/sttp/apispec/validation/SchemaComparator.scala
def noSchema: Nothing =
  throw new NoSuchElementException(s"could not resolve schema reference ${s.$ref.get}")

normalize(named.getOrElse(name, noSchema), named)
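For readers following the diff, here is a minimal sketch of what resolving local references against a map of named schemas could look like. `MiniSchema`, the `#/components/schemas/` prefix handling, and the cycle guard are assumptions made for illustration, not the actual code from this PR.

```scala
object RefResolutionSketch {
  // Hypothetical, heavily simplified stand-in for the real Schema class.
  final case class MiniSchema(ref: Option[String] = None, typ: Option[String] = None)

  // Assumed shape of a local reference; other reference forms are left untouched.
  private val Prefix = "#/components/schemas/"

  // Follows local references until a non-reference schema is reached.
  // `seen` guards against reference cycles such as A -> B -> A.
  @annotation.tailrec
  def resolve(
      s: MiniSchema,
      named: Map[String, MiniSchema],
      seen: Set[String] = Set.empty
  ): MiniSchema =
    s.ref match {
      case Some(r) if r.startsWith(Prefix) =>
        val name = r.stripPrefix(Prefix)
        if (seen(name)) throw new IllegalStateException(s"cyclic schema reference $r")
        val target = named.getOrElse(
          name,
          throw new NoSuchElementException(s"could not resolve schema reference $r")
        )
        resolve(target, named, seen + name)
      case _ => s // not a local reference: return the schema as it is
    }
}
```

In this sketch a missing name fails fast with `NoSuchElementException`, mirroring the `noSchema` helper in the snippet above.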
if we dereference all local schemas, then in an endpoint, if a schema is being referenced by many other schemas, a single change in that schema will cause a lot of incompatibility errors? unless we de-duplicate, e.g. by using a set, and they get combined?
There's a cache that prevents wasting time on comparing the same pairs of schemas multiple times, but:
- cached issues will be duplicated in the final list (or tree, actually) of issues, so for better presentation we would need some additional deduplication mechanism - it could be done with some form of "references" to errors similar to schema references, or alternatively - implemented purely in the presentation layer that displays issues to the user
- the cache is currently not reusable between toplevel schema comparisons - it would need to be extracted to some higher layer for full OpenApi comparisons
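To make the caching idea concrete, here is a rough sketch of a comparator that memoizes results per (writer, reader) pair. `CachingComparator`, `Issue`, and the `compareUncached` function are hypothetical names. The real comparator is recursive and also has to cope with cyclic references, which this sketch does not attempt.

```scala
import scala.collection.mutable

object ComparisonCacheSketch {
  // Hypothetical flat issue type, used only for this illustration.
  final case class Issue(description: String)

  // Wraps an arbitrary comparison function and remembers its results,
  // so the same pair of schemas is only compared once per instance.
  final class CachingComparator[S](compareUncached: (S, S) => List[Issue]) {
    private val cache = mutable.Map.empty[(S, S), List[Issue]]

    def compare(writer: S, reader: S): List[Issue] =
      cache.getOrElseUpdate((writer, reader), compareUncached(writer, reader))
  }
}
```

Sharing one such instance across all top-level comparisons would be one way to make the cache reusable for a full OpenAPI comparison. The cached issues would still show up once per place they are reported, so presentation-level deduplication remains a separate question.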
yeah, for full OpenAPI comparisons (which is our goal), we will have to compare schemas for each endpoint. But won't simply adding the issues to a set solve the problem?
The issues are not a flat list - they form a tree-like structure that allows you to "track the path" to incompatibilities within schemas (see the SubschemaCompatibilityIssue hierarchy).
ah I must have missed that ... checking again ;)
Anyway, there are several different ways we could deal with this duplication. This largely depends on how exactly we want the incompatibilities to be presented to the user. So far, I have only implemented a simple human-readable description for every issue.
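As one possible illustration of presentation-layer deduplication, the sketch below flattens a tree of issues into (path, description) pairs and then collapses repeated descriptions. The `Leaf` and `InSubschema` types are invented for this example and only loosely mirror the idea of nested issues; they are not the classes from this PR.

```scala
object IssueTreeSketch {
  // Hypothetical issue tree: leaves describe a problem, nodes record where it was found.
  sealed trait Issue
  final case class Leaf(description: String) extends Issue
  final case class InSubschema(location: String, children: List[Issue]) extends Issue

  // Flattens a tree into (path to the issue, description) pairs.
  def flatten(issue: Issue, path: List[String] = Nil): List[(List[String], String)] =
    issue match {
      case Leaf(d)                    => List((path, d))
      case InSubschema(loc, children) => children.flatMap(flatten(_, path :+ loc))
    }

  // Collapses identical descriptions reported at different paths, keeping first-seen order.
  def distinctDescriptions(issues: List[Issue]): List[String] =
    issues.flatMap(flatten(_)).map(_._2).distinct
}
```

Putting whole trees into a set would not merge the same underlying problem reported under two different parents, because the enclosing nodes differ. That is why the sketch deduplicates only after flattening.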
Yeah, that's a good start. We'll probably want to just return the list of the issues initially.
$comment = None,
title = None,
description = None,
default = None,
don't defaults affect comparisons? if a writer has a schema with default = 5, and a reader with default = 7, the deserialisation might be different?
According to the JSON Schema spec, the default keyword is an annotation, which is "for documentation and user interface display purposes".
Changing the default value cannot make any previously valid data invalid according to a new schema, so technically it's not an incompatibility. Whether this can be an incompatibility on a higher, semantic level is debatable. I can imagine situations where it is and where it isn't.
For example, an integer field in a request may have had a default value of 0, but then the server decided to make this field nullable, with a default value of null, so that it can distinguish between passing 0 explicitly and not passing anything. The server can do this in a fully compatible way. It depends on its implementation.
In general, compatibility on a "semantic" level can be broken in many ways just by changing the implementation of the server, even without changing the format at all. I assumed that here we focus purely on syntactic incompatibilities.
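A small sketch of the syntactic argument, using a made-up `IntSchema` type (not part of the library): validation looks only at assertion keywords, so two schemas that differ solely in `default` accept exactly the same values and become equal once annotations are stripped.

```scala
object DefaultAnnotationSketch {
  // Hypothetical integer schema with two assertions and one annotation.
  final case class IntSchema(
      minimum: Option[Int] = None,
      maximum: Option[Int] = None,
      default: Option[Int] = None
  ) {
    // Validation consults only the assertions, never the `default` annotation.
    def accepts(v: Int): Boolean =
      minimum.forall(v >= _) && maximum.forall(v <= _)

    // Drops annotations before comparison, in the spirit of the `default = None` copy above.
    def withoutAnnotations: IntSchema = copy(default = None)
  }
}
```

With a writer schema using `default = Some(5)` and a reader using `default = Some(7)`, `accepts` agrees for every input, and the two schemas compare equal after `withoutAnnotations`.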
Ok, sounds reasonable, thanks for the explanation :)
Nice work - well beyond what I imagined originally, but very thorough :).
Required for softwaremill/tapir#3645
`SchemaComparator` compares two schemas for compatibility and returns a list of `SchemaCompatibilityIssue`s.

Currently, the comparator only understands a fixed set of common schema patterns, likely to be generated by `tapir` from common Scala types:

- `type` which is not `object` or `array`, plus simple assertions (e.g. `minimum`, `maximum`, etc.)
- `array` with `items`, plus simple array assertions
- `array` with `prefixItems`, plus simple array assertions
- `object` with `additionalProperties`, plus simple object assertions
- `object` with `properties`, possibly `required` and `dependentRequired`
- `oneOf`/`anyOf` with simple local references and `discriminator`
- `oneOf`/`anyOf`, no discriminator

The following JSON Schema keywords are currently not understood at all:

- `$schema`, `$vocabulary`, `$id`, `$anchor`, `$dynamicAnchor`, `$dynamicRef`, `$defs` (maybe some of them can be ignored like annotations?)
- `allOf`, `not` (except in "nothing" schema)
- `if`, `then`, `else`
- `contains`, `maxContains`, `minContains`, `unevaluatedItems`
- `patternProperties`, `unevaluatedProperties`, `dependentSchemas`

If schemas don't fall into any of the above categories, or contain any of the mentioned unsupported keywords, they are considered opaque and compared only for plain equality (with annotations stripped). If they are not equal, a generic "fallback" error is returned that indicates `SchemaComparator`'s inability to determine compatibility between schemas (they may or may not be compatible).
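The fallback rule could be sketched roughly as follows. Representing a schema as a `Map` from keyword to serialized value, and the `Result` cases, are assumptions made for this illustration. The sketch checks only for unsupported keywords and ignores the "nothing" schema exception for `not`, whereas the real comparator also has to recognize the supported patterns.

```scala
object FallbackSketch {
  // Hypothetical raw representation: keyword name -> serialized value.
  type RawSchema = Map[String, String]

  // Keywords listed in the description as not understood.
  val Unsupported: Set[String] = Set(
    "$schema", "$vocabulary", "$id", "$anchor", "$dynamicAnchor", "$dynamicRef", "$defs",
    "allOf", "not", "if", "then", "else",
    "contains", "maxContains", "minContains", "unevaluatedItems",
    "patternProperties", "unevaluatedProperties", "dependentSchemas"
  )

  // Annotation keywords that are stripped before the equality check.
  val Annotations: Set[String] = Set("$comment", "title", "description", "default")

  sealed trait Result
  case object Compatible extends Result   // equal once annotations are stripped
  case object Undetermined extends Result // generic "fallback" issue: may or may not be compatible
  case object NeedsStructuralComparison extends Result // left to the pattern-specific logic

  def isOpaque(s: RawSchema): Boolean = s.keySet.exists(Unsupported)

  def compare(writer: RawSchema, reader: RawSchema): Result =
    if (isOpaque(writer) || isOpaque(reader)) {
      if ((writer -- Annotations) == (reader -- Annotations)) Compatible else Undetermined
    } else NeedsStructuralComparison
}
```

The `Undetermined` case deliberately does not claim incompatibility. It only signals that this simplified check cannot decide.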