Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address issues from #5866 - improve built-in resolving #5877

Merged
merged 1 commit into from
Apr 23, 2024

Conversation

shawkins
Copy link
Contributor

@shawkins shawkins commented Apr 10, 2024

Description

Fixes #5866

Addresses issues from #5866

Relies upon jackson to resolve java. / javax. classes.

Does not yet change the default serialization for time stuff, nor does it address being able to pass in the objectmapper.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change
  • Chore (non-breaking change which doesn't affect codebase;
    test, version modification, documentation, etc.)

Checklist

  • Code contributed by me aligns with current project license: Apache 2.0
  • I Added CHANGELOG entry regarding this change
  • I have implemented unit tests to cover my changes
  • I have added/updated the javadocs and other documentation accordingly
  • No new bugs, code smells, etc. in SonarCloud report
  • I tested my code in Kubernetes
  • I tested my code in OpenShift

@shawkins shawkins force-pushed the iss5866 branch 3 times, most recently from b2fdf7c to c1580cf Compare April 10, 2024 20:17
@shawkins shawkins changed the title task: improve the cycle error message Address issues from #5866 - improve built-in resolving and provide a better cycle stacktrace Apr 10, 2024
@andreaTP
Copy link
Member

Thanks a lot for looking into this @shawkins , I'm not sure about the cost/benefits tradeoff in adding a Jackson dependency just for logging, would it be possible to avoid it somehow?

Alternatively, if it doesn't add much to the complexity of the code, can we consider having it as an optional dependency?

Last, but not least, I'm interested in the output from the ZonedDateTime on Java 21 if you have already tried it out.

@shawkins shawkins force-pushed the iss5866 branch 2 times, most recently from 27e79b2 to 6b620ae Compare April 11, 2024 01:17
@shawkins
Copy link
Contributor Author

Thanks a lot for looking into this @shawkins , I'm not sure about the cost/benefits tradeoff in adding a Jackson dependency just for logging, would it be possible to avoid it somehow?

Jackson is not being used for logging, it's being used to resolve the type for built-in stuff like ZonedDateTime. The changes here allow use to leverage jackson's schema for all of the custom serialiers for java and javax classes. This makes the COMMON_MAPPINGS for things like long and double redundant, but it's fine to leave that logic as is.

Alternatively, if it doesn't add much to the complexity of the code, can we consider having it as an optional dependency?

That is possible, but we'll then hit the cyclic issue for ZonedDateTime and still have incorrect mappings for a lot of things.

Last, but not least, I'm interested in the output from the ZonedDateTime on Java 21 if you have already tried it out.

With these changes it's mapped to number.

@manusa manusa requested a review from andreaTP April 11, 2024 04:49
@manusa manusa added this to the 6.12.0 milestone Apr 11, 2024 — with automated-tasks
Copy link
Member

@manusa manusa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thx!

Comment on lines +150 to +153
ObjectMapper mapper = new ObjectMapper();
// initialize with client defaults
new KubernetesSerialization(mapper, false);
GENERATOR = new JsonSchemaGenerator(mapper);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels dangerous, especially when combined with Quarkus. Not that I can think of a better solution to make this an instance variable and be able to pass in its dependencies, but mentioning it just in case.

Does a KubernetesSerialization.generateSchema method make sense (considering the KubernetesSerialization class is already quite coupled with Jackson)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels dangerous, especially when combined with Quarkus. Not that I can think of a better solution to make this an instance variable and be able to pass in its dependencies, but mentioning it just in case.

cc @metacosm do you want to pass something in from CRDGeneration? With the way things are currently writen it will take some bigger changes as the JsonSchema instances are themselves static. My assumption here is that we just want some canonical default set of behavior.

Does a KubernetesSerialization.generateSchema method make sense (considering the KubernetesSerialization class is already quite coupled with Jackson)?

We had been resisting directly exposing jackson specific apis.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andreaTP before I take this further, can I get some assurance that you will actually consider this approach?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not going to oppose to this effort, at the same time I'm not sure about it:

  • adds an external dependency
  • the type resolution logic is currently fully coming from sundrio, this approach effectively duplicates the source of truth for specific types
  • it binds the type resolution to the serialization format/configuration which is effectively a change from the current behavior

Those details are going to make bug resolution in this codebase more tricky, but happy to be proven wrong.
Possible next steps to be evaluated:

  • we resolve all the types(not only the "complex") using Jackson(would it be possible to remove the dependency on sundrio in this case?)
  • we bake and harmonize this type resolution logic into a fully recursive algorithm(maybe in the sundrio codebase?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adds an external dependency

An additional dependency from a family of dependencies on which we already rely in other parts of the fabric8 client.

the type resolution logic is currently fully coming from sundrio, this approach effectively duplicates the source of truth for specific types

It's unfortunate that sundrio is coming into the picture here at all. The choice to run this as an annotation processor is forcing that decision. Sundrio is a poor source of truth for this task and adds significant complexity and redundant logic with jackson instead of just using reflection and jackson schema. My vote would still be to eventually push the crd generator to something that runs after compilation and eliminate sundrio entirely from this.

it binds the type resolution to the serialization format/configuration which is effectively a change from the current behavior

First you should acknowlege the current behavior is lacking or wrong - it is producing a schema that is invalid in many sitiatutions, not just these built-in types that have additional serdes handling in jackson, but in many other circumstances where it doesn't understand what jackson will actually do.

Second that serialization format/configuration is the more well understood construct / contract for a developer. I can create a bean, run Serialization.asJson and see the output and would then expect the crd generator to create an appropriate schema to represent that. Instead I have to be very aware of limitations or issues with the crd generator and deeply cross check that the output will work, and if not go over a separate set of docs about how to make it work.

Those details are going to make bug resolution in this codebase more tricky, but happy to be proven wrong.

I don't think so. The jackson schema usage here is very limited and to a set of types that we're likely to already handle incorrectly.

we resolve all the types(not only the "complex") using Jackson(would it be possible to remove the dependency on sundrio in this case?)

Only if we move away from this being run as an annotation processor - which I'm fully in favor of.

we bake and harmonize this type resolution logic into a fully recursive algorithm(maybe in the sundrio codebase?

I would not involve sundrio at all. We should be able to do crd generation based upon jackson schema and reflection alone. At worst defincencies in jackson schema (I identified several when we were looking at the default issue) should be need to be addressed in jackson - but that would improve jackson schema for everyone, and we can of course workaround it locally with reflection as needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's the likely direction for the crd generator in general

We don't have a deal on this just yet.

document a workaround for a subtle problem that affects maybe as many as 100 classes

How did you get this number?

with a non-null value against a live kubernetes server

This is surprising, the reported bug happens at generation time, what am I missing?

Copy link
Contributor Author

@shawkins shawkins Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have a deal on this just yet.

Which is what I fear in general - you say that you are open to removing sundrio, but in practice I doubt it.

How did you get this number?

It's just a guess - see the comment about looking at the class hierarchy of StdSerializer.

This is surprising, the reported bug happens at generation time, what am I missing?

The bug happens at generation time - but when will you be aware of it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's just discuss this in sync, since this (A PR review line comment) isn't probably the best place to do so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but in practice I doubt it

Just to make it clear, don't doubt 🙂 , I do believe that Jackson + Reflection can improve over the current situation.

The bug happens at generation time - but when will you be aware of it?

At generation time, the current status on this bug is that the crd-generator throws a StackOverflow Exception.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At generation time, the current status on this bug is that the crd-generator throws a StackOverflow Exception.

Sorry if this is not clear, but this fix addresses more than the stackoverflow - see the associated test class. All of those types are currently wrong.

Copy link
Member

@andreaTP andreaTP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

date-time fields should resolve to string and not numeric.

I'm a bit uncomfortable with the type resolution done using Jackson, do we have any alternative?

@shawkins
Copy link
Contributor Author

shawkins commented Apr 12, 2024

date-time fields should resolve to string and not numeric.

That was my first guess too, but ZonedDateTime is actually a numeric by default. If you want it to be a string, then you do need to use some additional mechanism:

  • Change our ObjectMapper to use mapper.disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS);
  • SchemaFrom
  • the format specification from the other pr, but you'd also have to change the type to String.
  • Jackson also supports the JsonFormat annotation for this purpose - but obviously the schema generator is not looking for that - there's now a pr for this

I'm a bit uncomfortable with the type resolution done using Jackson, do we have any alternative?

Why are you uncomfortable? It's the clearest indication of what will actually happen at runtime when you serialize. In this case if we don't think that numeric handling is a sensible default - then we should disable SerializationFeature.WRITE_DATES_AS_TIMESTAMPS.

I'd like to avoid if at all possible introducing more duplicate annotations with Jackson - since ultimately users will still need to be responsible for those as well.

Try the JacksonTypeTest without this change - all of those fields will resolve to an object type will a variety of properties. Then look at the class heirarchy for StdSerializer - there's quite a bit of type handling that Jackson does that the crd generator is not aware of. I could also see we were missing handling for Short - we're also missing Byte, byte, short, and byte[] (the primitives are not covered by this pr).

The only alternative I can think of would be to restrict to only a blessed set of known types - and if you use anything other than that default to an exception, rather than emitting what could be erroneous schema information.

@manusa manusa modified the milestones: 6.12.0, 6.13.0 Apr 12, 2024
@shawkins shawkins requested a review from andreaTP April 12, 2024 10:55
@shawkins
Copy link
Contributor Author

@manusa separated off the error message as a separate pr, I'll rebase this and likely remove this test once you have included the test case you were referring to.

The decision seems to be that a backwards compatibility flag is not necessary, but I'm fine with adding one.

Copy link
Member

@andreaTP andreaTP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, LGTM.

The decision seems to be that a backwards compatibility flag is not necessary, but I'm fine with adding one.

Either way fine by me.

@shawkins shawkins changed the title Address issues from #5866 - improve built-in resolving and provide a better cycle stacktrace Address issues from #5866 - improve built-in resolving Apr 21, 2024
@shawkins
Copy link
Contributor Author

@manusa @andreaTP this should be up-to-date now. The description reflects it's not yet changing the default time mappings, and it's still just a static objectmapper - both can be addressed in follow-on issues.

Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
59.7% Coverage on New Code (required ≥ 80%)

See analysis details on SonarCloud

Copy link
Member

@manusa manusa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thx!

@manusa manusa merged commit b8eeca4 into fabric8io:main Apr 23, 2024
19 of 20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CRD generation fails when compiled using java version 19 and above
4 participants