Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: add a v2 crd generator based on jackson #5949

Merged
merged 17 commits into from
May 3, 2024

Conversation

shawkins
Copy link
Contributor

@shawkins shawkins commented Apr 24, 2024

Description

closes: #5948

also removes v1beta1 from the new code

This is not yet ready to merge. For the tests to pass we need to:

  • confirm the JsonFormat shape behavior from CRD-Generator v7 - Orga #5942 (comment) - for now the types were changed to string
  • The required list was previously unordered, now it is. @andreaTP are there any objections to making that the new expected behavior?
  • The printer column type is taken directly from the property - but in the test it's expecting a string instead of an array type for one of the columns. What is the expectation here - is string required, or was that just a limitation of the previous logic? cc @andreaTP - EDIT: updated to use string for array / object and emit a warning.
  • It now better understands the anygetter / setter logic, so more properties are marked as preserve unknown causing another test to fail. Thinking about this more, this one is not clear cut. This is coming by the way of core model objects, which we include additionalProperties on as a way of letting them be foward compatible - it's not as a way to indicate what the schema actually should be. It could be using the custom PreseveUnknown annotation is the way to go here, rather than detecting the situation ourselves. cc @andreaTP

Recap of some other possible follow-on issues:

Approval test changes came in after the fork was created, those would need to be added again here, or in a subsequent pr.

Deprecate:

  • Using CustomResource instance methods e.g. getSingular to create the crd.
  • fabric8's Required, Default, Pattern? Required can be handled by JsonProperty.required - also has handling in the Validation logic (see below). Default can come from JsonProperty.defaultValue, and Pattern can come from JsonFormat.pattern (not currently built-in to the Jackson schema logic).

Other possible work:

  • Inferring the default value from creating a bean instance and calling the accessor
  • the use of ValidationSchemaFactoryWrapper to discover javax validations (or targeting jakarta), instead of (or in addition to) our Nullable, Min, Max attributes
  • the way the CustomResourceHandler combines versions seems like it leaves open the possibility that the top level metadata may not align across all annotations, that probably should be checked.
  • Add a description to the PrinterColumn annotation to override the JsonProperty description.

I'll add a changelog entry as well once we're ready.

cc @matteriben @baloo42

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change
  • Chore (non-breaking change which doesn't affect codebase;
    test, version modification, documentation, etc.)

Checklist

  • Code contributed by me aligns with current project license: Apache 2.0
  • I Added CHANGELOG entry regarding this change
  • I have implemented unit tests to cover my changes
  • I have added/updated the javadocs and other documentation accordingly
  • No new bugs, code smells, etc. in SonarCloud report
  • I tested my code in Kubernetes
  • I tested my code in OpenShift

closes: fabric8io#5948

also removes v1beta1 from the new code
@shawkins
Copy link
Contributor Author

I went with crdv2 as the package name - but that can be changed of course.

@baloo42
Copy link
Contributor

baloo42 commented Apr 24, 2024

The required list was previously unordered, now it is.

I think we should handle this as a bug for the v1 generator and also fix it. If not, the new approval tests (#5943) would also fail sometimes. The outcome should be always deterministic.

@shawkins
Copy link
Contributor Author

shawkins commented Apr 24, 2024

@baloo42 I've updated the test for v2 to keep this pr moving forward.

@matteriben based upon #5942 (reply in thread) it sounds like we're moving forward with leaving the type as string for those shape tests, which puts us in a similar situation as sorting the required list - the v1 implementation should be updated to match. We'll track making these changes with #5951.

That just leaves two differences (that we currently know of) to account for.

@andreaTP
Copy link
Member

The required list was previously unordered, now it is.

This is a feature, great!

was that just a limitation of the previous logic?

IIRC that's it, in any case printer columns are just for human consumption, we can solve any issue at a later stage without a significant risk of breaking user code badly.

It could be using the custom PreseveUnknown annotation is the way to go here, rather than detecting the situation ourselves.

Looking at how ppl are using things I would not be surprised if we take the opinionated decision to add x-kubernetes-preserve-unknown everywhere by default ...
That said, I would not forcefully try to follow the current detection logic and just be sure we document the behavior appropriately.

Currently on the phone and I'll be out until Mon, but let me know if any additional input from me is needed to progress here and I'll try to answer asap.

also removing another reference to CustomResource
@shawkins
Copy link
Contributor Author

IIRC that's it, in any case printer columns are just for human consumption, we can solve any issue at a later stage without a significant risk of breaking user code badly.

Ok, to keep this pr moving forward I'll update the results here to expect the actual type.

That said, I would not forcefully try to follow the current detection logic and just be sure we document the behavior appropriately.

The logic here is a little broader - so I'm wondering if it needs to be more limited. The concern is that the object model's intent with using additional properties is to increase compatibility - not that the real schema of ObjectMeta etc. actually calls for preserve unknown. I'm thinking that we'll either need to just look for the PreserveUnknownFields annotation or let it be user configurable in some other way to elect whether anyGetter / anySetter means preserve unknown fields.

@shawkins shawkins marked this pull request as ready for review April 26, 2024 12:36
@shawkins
Copy link
Contributor Author

shawkins commented Apr 26, 2024

Marking as ready for review as all current tests pass. Let me know what needs addressed before this gets committed. This also wires up parallel execution (should likely now default to true) #5952 and exposes two new options for the crd generator:

  • the ability to supply an ObjectMapper. This should be the same instance that is passed to the KubernetesSerialization. If we want to add more of backdoor for exposing the ObjectMapper from an instance of KubernetesSerialization, then we can pass the KubernetesSerialization instead.
  • the ability to control whether we implicitly treat anyGetter / anySetter as preserve unknown fields - we will probably want a better story here eventually.

Recap of some other changes:

  • it works at the property level, rather than the field level, so some crds may have fields removed - but those weren't actually being used. This is also the reason test code made fields public or used the lombok Data annotation - Jackson doesn't consider a private field without accessors serializable.
  • more support for format and pattern based upon what is already supported by jacksonschema. There is still more that could be done here around their validation extension logic
  • RawExtension and HasMetadata interfaces will be mapped to embedded resources
  • the logic resolves the jackson schema for nested classes only once, then we'll reuse that schema from the ResolvingContext
  • the abstraction to support multiple CRD versions has been left in, but could be removed if we want to revisit it later when needed
  • also removed the decorators as we can do the same thing more directly with builders.

@metacosm this should be good now to test out with quarkus

cc @matteriben @baloo42 @manusa @andreaTP

* This class encapsulates the common behavior between different CRD generation logic. The
* intent is that each CRD spec version is implemented as a sub-class of this one.
*/
public abstract class AbstractCustomResourceHandler {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason why this is a class instead of an interface?

It's my understanding that we can have a interface CustomResourceHandler and then an implementation V1CustomResourceHandler or CustomResourceHandlerV1, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason why this is a class instead of an interface?

Not really. It used to hold common state and have a common method for dealing with printer columns. My intent was to add back the method for printer columns, but didn't bother yet.

It's my understanding that we can have a interface CustomResourceHandler and then an implementation V1CustomResourceHandler or CustomResourceHandlerV1, right?

Correct, it seemed fine to leave in the abstraction layer, used here and for AbstractJsonSchema, but it's not really needed at the moment.

Comment on lines 63 to 64
// TODO: why not rely on the standard fabric8 yaml mapping
public static final ObjectMapper YAML_MAPPER = JsonMapper.builder(new YAMLFactory()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, why not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment was lost from @metacosm about how this format seemed different. Haven't looked at it more deeply.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC, the format of the default mapper was different from observed CRDs in the wild so I tweaked it to make it closer. Not sure if it's a big deal anymore (or even if it ever was).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two main changes I see are the standard generator includes the explicit start ---, and it quotes string values. The ordering changes aren't really needed as they should be handled by the construction of the crd object.

I agree that we don't need explicit start, the question is does it matter if there's additional quoting? It shouldn't but we can account for one or both of these differences with an KubernetesSerialization.asYaml method that takes some style options - we are just creating the Dump / DumpSettings on every call anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at it more, I don't think we should add yaml options. They really aren't needed for this case either. We could just remove this yaml mapper - the only practical issue is that string comparison of the existing crd output yaml won't work.

Copy link
Collaborator

@metacosm metacosm May 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can confirm that the start marker is causing issues, in particular with OLM validation, at least. The marker is what's causing: https://github.com/quarkiverse/quarkus-operator-sdk/actions/runs/8903104778/job/24450250852#step:23:10

I haven't tried applying the CRDs to a cluster, yet, though.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applying the CRD to the cluster seems to work OK, as expected.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking @metacosm - the explicit start has been removed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stripping the marker solved the issue as expected. Thank you!
For reference, here's the successful QOSDK build with the changes from this PR: https://github.com/quarkiverse/quarkus-operator-sdk/actions/runs/8903462629/job/24459014817

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thank you for testing. The only other thing that could change is supplying the ObjectMapper / KuberenetesSerialization so that we can pick up any changes the user may have made to the mapper configuration

cc @manusa

Copy link
Contributor

@baloo42 baloo42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested it with a few internal projects and looks very good so far.

Beside of the printer columns and the ordering of the required properties list I haven't found any differences to v1.

@shawkins shawkins force-pushed the crd-generator-v7-development branch from f1b5490 to 1ea658d Compare April 29, 2024 18:02
@baloo42
Copy link
Contributor

baloo42 commented Apr 29, 2024

More general feedback:

x-kubernetes-embedded-resource

RawExtension and HasMetadata interfaces will be mapped to embedded resources

👍 Nice

Printer Columns

The printer column type is taken directly from the property - but in the test it's expecting a string instead of an array type for one of the columns. What is the expectation here - is string required, or was that just a limitation of the previous logic?

Kubernetes allows for printer columns only the following types:
boolean,date,integer,number,string

I would suggest to silently filter printer colums on other types. At the moment it produces an invalid CRD.

"Arrays" are only supported if we generate something like the following, which would be an own story:

JSONPath: ".spec.hardware.devices.dynamicDirectPathIODevices[*].deviceID"
type: string

(the target property must by one of the above types)

See also upcoming "FieldSelectors" feature:
https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#field-selectors
https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#type

Preserve Unknown Fields

I like the current behaviour of v2 regarding this topic. It behaves as I would expect it.

if we take the opinionated decision to add x-kubernetes-preserve-unknown everywhere by default ...

In my opinion using x-kubernetes-preserve-unknown everywhere could be "dangerous" because Kubernetes behaves differently. Removed fields in a CR won't get cleaned up (see control-pruning)

@shawkins
Copy link
Contributor Author

I would suggest to silently filter printer colums on other types. At the moment it produces an invalid CRD.

I'd be more in favor of producing a hard error or emitting a warning.

@shawkins shawkins force-pushed the crd-generator-v7-development branch from 1ea658d to abe9a2a Compare April 29, 2024 19:34
also using string instead of object or array types
@shawkins
Copy link
Contributor Author

I would suggest to silently filter printer colums on other types. At the moment it produces an invalid CRD.

I'd be more in favor of producing a hard error or emitting a warning.

Added the warning and extracted common printer column handling.

@shawkins shawkins requested a review from metacosm April 30, 2024 16:41
@shawkins shawkins force-pushed the crd-generator-v7-development branch 2 times, most recently from 6c2ce80 to 8ee3f06 Compare April 30, 2024 19:19
@shawkins shawkins force-pushed the crd-generator-v7-development branch from 8ee3f06 to 7f14fbb Compare April 30, 2024 19:29
metacosm added a commit to quarkiverse/quarkus-operator-sdk that referenced this pull request Apr 30, 2024
@manusa
Copy link
Member

manusa commented May 2, 2024

#5949 (comment)

Is this ready to be merged then?

@shawkins
Copy link
Contributor Author

shawkins commented May 2, 2024

Is this ready to be merged then?

I think so. We can certianly refine things from here. There's quite a few follow-on tasks that have already been logged and more that are mentioned in the PR description.

Copy link

sonarqubecloud bot commented May 2, 2024

Quality Gate Failed Quality Gate failed

Failed conditions
4.5% Duplication on New Code (required ≤ 3%)

See analysis details on SonarCloud

@metacosm
Copy link
Collaborator

metacosm commented May 2, 2024

Yes, this can be merged.

@manusa manusa added this to the 6.13.0 milestone May 3, 2024 — with automated-tasks
Copy link
Member

@manusa manusa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thx!

@manusa manusa merged commit 6369322 into fabric8io:main May 3, 2024
19 of 20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[EPIC] CRDGenerator v2
5 participants