Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Full inlined schema support #59

Closed
opbokel opened this issue Mar 30, 2020 · 18 comments · Fixed by #68
Closed

Full inlined schema support #59

opbokel opened this issue Mar 30, 2020 · 18 comments · Fixed by #68
Assignees
Labels
enhancement New feature or request

Comments

@opbokel
Copy link

opbokel commented Mar 30, 2020

Hello, first would like to say that you did an amazing job with this library.

I am using your library at Fedex, and we have a use case that may seem not very usual, but nonetheless a interesting functionality to support, and apparently simple.

We need the generated schema to be fully inlined, this means, never user a $ref.

The reason is that we add dynamic rules on top of the static rules generated for the class, making really hard to do "allOf" compositions when a reference have another reference inside. A example would be the address class that each country have different postal code pattern. So, sender and receiver fields might have different validation rules for the same class.

Also the front end "walks" the schema, retrieving enum values and required fields to setup the UI.

We generate all the json schema by hand, but we want to use your library and just add the dynamic rules by hand.

I will fork your library and start working on it. Please let me know if you have any suggestion of how to accomplish it, and if you are ok with me opening a PR soon.

Kind Regards,
Octavio.

@CarstenWickner
Copy link
Member

CarstenWickner commented Mar 30, 2020

Hi Octavio,

Thanks for your kind words.

Sounds like a normal use-case to me that others would have too. So, I'd appreciate a PR.
My suggested approach would be to:

  • add a new Option (making it clear in the JavaDoc that this is only viable if there is not a single circular reference),
  • change the SchemaGenerator.buildDefinitionsAndResolveReferences() method that is being invoked at the very end of the schema generation. There is a single line where you'd want to consider the new Option then:
boolean referenceInline = (references.isEmpty() || (!createDefinitionsForAll && (references.size() + nullableReferences.size()) < 2))
                    && !mainSchemaKey.equals(definitionKey);
  • the only tricky bit I see would be to add a safe-guard that recognizes when there is a circular reference and does add a minimum number of definitions then anyway (instead of running into a stack overflow).

Apart from that, it'd be interesting to hear what kind of "dynamic rules" you have and whether it would make sense to cater for those in some standardized way here as well.


Just a side-note: I also encountered the challenge of resolving those $references in one of my own frontend projects and managed to handle it at least for the simple kind of references produced here. Maybe you could have a look there as well to get some inspiration, if you would want to consider keeping the schema small by using $references after all: CarstenWickner/react-jsonschema-inspector

@CarstenWickner
Copy link
Member

Hi Octavio,

I would suspect the stack overflow to occur at the moment where the schema is serialized (e.g. by calling JsonNode.toString()), not in the schema generator itself.
That was the main motivation behind adding an entry to the definitions as soon as there is more than one mentioning of the same type (seemed simpler than checking for circular references).

Without knowing a bit more of the surrounding code, it's hard to judge what an appropriate way of doing it via this library is, but I guess a starting point would be something like this:

SchemaGeneratorConfigBuilder configBuilder = new SchemaGeneratorConfigBuilder(objectMapper, SchemaVersion.DRAFT_2019_09, OptionPreset.PLAIN_JSON);
configBuilder.forFields()
    .withCustomDefinitionProvider((field, context) -> {
            if (!field.getType().isInstanceOf(Address.class)) {
                return null;
            }
            ObjectNode addressSchema = context.createDefinition(field.getType());
            // make your custom changes here
            
            return new CustomPropertyDefinition(addressSchema);
    });

This would ensure that the definition for every Address field will be inline and you can make individual adjustments on each one.

@CarstenWickner CarstenWickner added the enhancement New feature or request label Mar 30, 2020
@opbokel
Copy link
Author

opbokel commented Mar 30, 2020

PS: This is the previous message with "pseudo code". I had to delete the previous message, even without any sensitive information, had Fedex code.


I had not thought about the stack overflow problem. I will keep in mind. I will also investigate if there is a alternative way to solve the issue.
Bellow one simplified used case:

We have the json schema modeled as Java classes. From this model we build a standard definition for address (full manual process), and build dynamic validations on top, including enums with possible values and post code pattern:

private ObjectProperty buildAddressSchema(CountryDetails countryDetails, Optional<ImmutableSet<Country>> countries) {
    var schema = DefaultSchemas.get(Address.class);
    addPostalCode(schema, countryDetails);
    addStateEnum(schema, countryDetails)
    return schema;
}

private ObjectProperty addPostalCode(ObjectProperty schema, CountryDetails countryDetails) {
    schema = schema.addProperty("postalCode", PostalCodeFormatMapper.getProperty(countryDetails.getPostalCodeFormats()));
    if (!countryDetails.requiredPostalCode()) {
        schema = schema.makeNotRequired("postalCode");
    }
    return schema;
}

PS: We have many more cases, this is one of them.

@CarstenWickner
Copy link
Member

CarstenWickner commented Mar 30, 2020

Fair enough. 😉

My previous suggestion still stands. Using context.createDefinition() (or context.createStandardDefinition()) within a configBuilderPart.withCustomDefinitionProvider() would allow you to ensure that the respective schema is always inline and give you the freedom to add some more validation attributes on top of the automatically generated default schema.

@opbokel
Copy link
Author

opbokel commented Mar 30, 2020

Hello,

I failed to see while first looking at the library that you could actually use the context to generate the class definition while providing a custom definition. At first glance I thought that I would have to provide the full class definition by hand.

I will properly investigate the possibility using a custom provider, and get back to you as soon as possible, properly closing the issue if no changes need to be made. If it works, I will also post a pseudo code of how to accomplish it in our use case.

Thank you!

@opbokel
Copy link
Author

opbokel commented Mar 31, 2020

I am posting here just to reach you, this is not directly related to the issue.

I saw that this functionality was committed on version 4.8.0, but I cannot update. Basically gradle or sbt cannot find the parent pom. The parent pom is not publish anywhere in the repo path:
https://repo.maven.apache.org/maven2/com/github/victools/

Execution failed for task ':compileJava'.

Could not resolve all files for configuration ':compileClasspath'.
Could not resolve com.github.victools:jsonschema-generator:4.8.0.
Required by:
project :
> Could not resolve com.github.victools:jsonschema-generator:4.8.0.
> Could not parse POM https://repo.maven.apache.org/maven2/com/github/victools/jsonschema-generator/4.8.0/jsonschema-generator-4.8.0.pom
> Could not find com.github.victools:jsonschema-generator-parent:4.8.0.

PS: Downloading the project, running mvn install, and configuring gradle to use also the mavenLocal() worked. A folder jsonschema-generator-parent with the parent pom is created in the same level as the the other projects, and the parent pom is found.

@CarstenWickner
Copy link
Member

Thanks for letting me know. I‘ll publish the parent pom later tonight then.

@CarstenWickner
Copy link
Member

Released v4.8.1 now, including parent pom.xml and tiny bug fix in the swagger module.

@opbokel
Copy link
Author

opbokel commented Apr 2, 2020

Hello,

If I use a customDefinitionProvider for the fields, with CustomDefinition.AttributeInclusion.YES, annotated fields have precedence over my dynamic definition. For example, if provide a custom definition to the name field like {"minLength" = 5}:

class Contact{
@jsonschema(minLength = 3, maxLength = 20)
String name;
}

The final result will be {"minLength" = 3, "maxLength" = 20}.

With CustomDefinition.AttributeInclusion.NO it will be {"minLength" = 5}

What would be ideal is to get {"minLength" = 5, "maxLength" = 20}. (Dynamic defined fields having precedence, but not fully overriding the annotations, just in the defined attributes)

Is there any way to accomplish this?

@CarstenWickner
Copy link
Member

CarstenWickner commented Apr 2, 2020

Hi @opbokel,

The custom definitions are mostly meant as a last resort if you need to make structural changes that are otherwise not supported.

By design, you have multiple ways of achieving the same thing and this is certainly possible.
Generally, I'd suggest to use the configuration with the smallest applicable scope:

  1. Prefer to use specific configurations like .withStringMinLengthResolver().
  2. If the particular attribute you're after does not have such a dedicated configuration option, fall-back on .withInstanceAttributeOverride() or .withTypeAttributeOverride() if you want to have the last say in what ends up in the schema.
  3. As a last resort, you can use custom definitions. Worst case, you'd have to use AttributeInclusion.NO and in there add the "maxLength": 20 yourself.

In your given example, options 1 or 2 should allow you to achieve what you're after without going down the "custom definition rabbit hole". 😉


That being said, I can also look into preserving already existing attributes there. I believe that's how the per-type custom definitions are working already. This seems to be an inconsistency in the new custom property definitions.

@opbokel
Copy link
Author

opbokel commented Apr 2, 2020

Hello @CarstenWickner

I am probably using you library in a way you never thought of.

The idea that I am trying to accomplish is to have full control over any field, because the rules are complex, and derived many times from asynchronous calls to external systems.

Let's say you have the classes

class Address {
    String postcode
    ...
}

class Shipment {
    Address sender
    Address receiver
    ...
}

I want to be able to auto derivate the json schema, and add different rules for the same Object.

If sender and receiver are from different countries, they will have different post code formats, for example. Let's say the request to get the schema is something like GET shipment/schema?senderCountry=BR&receiverCountry=NL

I was able to accomplish this combining withInstanceAttributeOverride (to allow my attributes to have precedence, but still get attributes from the annotations), and withCustomDefinitionProvider, to actually force inline definitions and fully control my relative position.

With this you can define something like a Map (".sender.postcode" -> {pattern:"BR-Pattern"}, ".receiver.postcode" -> {pattern:"NL-Pattern"}), with absolute control of who gets overridden. Of course, this means that I have to define one generator each time because the dynamic rules change.

The pseudo code is something like bellow:

    public JsonNode generate(final Type mainTargetType, final ImmutableMap<String, SchemaProperty> dynamicPropertyMap, Type... typeParameters) {
        SchemaGeneratorConfigBuilder configBuilder = baseSchemaBuilder();
        final LinkedList<String> pathStack = new LinkedList<>();
        configBuilder.forFields().withInstanceAttributeOverride((node, field) -> {
            String fullPath = pathStack.stream().reduce("", (s1, s2) -> s1 + "." + s2) + "." + field.getName();
            Optional.ofNullable(dynamicPropertyMap.get(fullPath)).ifPresent(property ->
                    node.setAll((ObjectNode) objectMapper.valueToTree(property))
            );
        });
        configBuilder.forFields().withCustomDefinitionProvider((field, context) -> {
            pathStack.add(field.getName());
            ObjectNode definition = context.createDefinition(field.getType());
            pathStack.removeLast();
            return new CustomPropertyDefinition(definition);
        });
        SchemaGenerator generator = new SchemaGenerator(configBuilder.build());
        return generator.generateSchema(mainTargetType, typeParameters);
    }

If this is the kind of functionality you would like to see in your library, I would be glad to contribute.
Thanks, Octavio

@CarstenWickner
Copy link
Member

Hi @opbokel,

I see where you're coming from. Definitely an interesting (and probably not unique) use-case.
Your (pseudo) solution looks kinda hacky though. This relies a lot on the order of certain internal calls, which is dangerous since my tests usually only care about the result in the end but I might break this kind of solution in future releases – at least I cannot guarantee anything.

The challenge I see with this kind of "path" being maintained is that I'm not sure it would work in all use-cases – especially in combination with other "real" custom property definitions because those are effectively disabled by your (pseudo) solution.
Also, the re-usage of type definitions would mean that the collected path is only for the very first time a type is being encountered and never again afterward (unless everything is inline).

Btw. I did fix the bug in regards to attributes from custom definitions being overridden now (#64) – not yet released though.

@CarstenWickner
Copy link
Member

CarstenWickner commented Apr 2, 2020

Hi @opbokel,

as an additional note: if you want to ensure that everything is inline (as per the original title of this issue), you could add the following CustomDefinitionProviderV2:

CustomDefinitionProviderV2 inlineAllTheThings = new CustomDefinitionProviderV2() {
    public CustomDefinition provideCustomSchemaDefinition(ResolvedType javaType, SchemaGenerationContext context) {
        return new CustomDefinition(context.createStandardDefinition(javaType, this), CustomDefinition.INLINE_DEFINITION);
    }
};
configBuilder.forTypesInGeneral().withCustomDefinitionProvider(inlineAllTheThings);

Please beware: As mentioned before, this will result in an endless loop if you have only a single circular reference anywhere.

@CarstenWickner
Copy link
Member

Fyi, the fix regarding attributes from custom definitions has now been released in v4.9.0 – just in case that makes your implementation simpler. 😉

@opbokel
Copy link
Author

opbokel commented Apr 3, 2020

Hello @CarstenWickner

You are amazing, in 4.9.0 my first solution works again with only with one function, and only defined schema properties are overwritten (pseudocode):

       final LinkedList<String> pathStack = new LinkedList<>();
       configBuilder.forFields().withCustomDefinitionProvider((field, context) -> {
           pathStack.add(field.getName());
           String fullPath = pathStack.stream().reduce("", (s1, s2) -> s1 + "." + s2);
           ObjectNode definition = Optional.ofNullable(dynamicPropertyMap.get(fullPath)).map(property ->
                   (ObjectNode) objectMapper.valueToTree(property)
           ).orElseGet(() -> context.createDefinition(field.getType()));
           pathStack.removeLast();
           return new CustomPropertyDefinition(definition);
       });
   }

The thing is, I need to be able to override on a field level, so I need (at least I think), to make it in a field level.

I also understand the challenge of maintaining a path. This solution of mine only works because every definition pass through here, and I currently only modifying "leafs", everything is inlined, we have no circular dependencies and context.createDefinition is behaving consistently, with recursion like calls, so my path stack is behaving correctly in all tests.

If you have one single reference ($ref), everything would be broken. But setting the behaviour of a single property of a instance would be also meaningless if you have references.

It is a hard problem to solve and to nicely integrate into your code base, but I will keep thinking and be glad to help implementing / discussing a solution.

PS: Since tests are written for this solution, we will probably know if a update breaks the solution. It looks like a hacky solution, and indeed it is, but I thought and looked of several solutions and could not find any less hacky considering the crazy requirements we have with the json schema.

@CarstenWickner
Copy link
Member

Hi @opbokel,

Thank you for your praise. Much appreciated.

I'm introducing a new Option.INLINE_ALL_SCHEMAS to achieve what you originally requested here. The implementation uses your approach to detect circular references – thanks for that!

public class InlineSchemaModule implements Module, CustomDefinitionProviderV2 {

    private final Deque<ResolvedType> declaringTypes = new LinkedList<>();

    @Override
    public void applyToConfigBuilder(SchemaGeneratorConfigBuilder builder) {
        builder.forTypesInGeneral().withCustomDefinitionProvider(this);
    }

    @Override
    public CustomDefinition provideCustomSchemaDefinition(ResolvedType javaType, SchemaGenerationContext context) {
        if (this.declaringTypes.contains(javaType)) {
            throw new IllegalArgumentException("Option.INLINE_ALL_SCHEMAS cannot be fulfilled due to a circular reference to "
                    + context.getTypeContext().getFullTypeDescription(javaType));
        }
        if (context.getTypeContext().isContainerType(javaType)) {
            // container types are being in-lined by default and only handle container-item-scope if the container itself is not a custom definition
            return null;
        }
        this.declaringTypes.addLast(javaType);
        ObjectNode definition = context.createStandardDefinition(javaType, this);
        this.declaringTypes.removeLast();
        return new CustomDefinition(definition, true);
    }
}

As for the path, I'm afraid that would only work reliably in those special circumstances where everything is in-lined (like in your use-case). Therefore, I'd refrain from adding that as a standard feature for the time being and rather expect that to be done via configuration – which you have proven to be possible.
After all, that's the whole point of allowing such flexible ways of configuration: enabling even "crazy" requirements to be met without requiring complex logic inside the library itself.

Thank you for the constructive exchange and the insight into what the library is being used for – always nice to see it being used with these Open Source projects. 😄

I'll close this issue here accordingly, as the original request will be met in the next 4.10.0 release.
Feel free to open new issues if there's more from your side.

@opbokel
Copy link
Author

opbokel commented Apr 6, 2020

Thanks. I am looking forward to 4.10.0!

@CarstenWickner
Copy link
Member

New Option was released in v4.10.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Development

Successfully merging a pull request may close this issue.

2 participants