Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split manifest generation to produce both v1/v1beta1 CRDs #4489

Merged
merged 28 commits into from
Jun 15, 2021

Conversation

pebrc
Copy link
Collaborator

@pebrc pebrc commented May 12, 2021

This PR moves all CRDs to apiVersion: v1 but keeps v1beta1 around for older Kubernetes versions that we still want to support (all the way back to 1.12 and OCP 3.11)

Upgrades from v1beta1 to v1 will be a bit tricky which is why I have marked this PR as breaking:

  • all-in-one.yaml users will have to replace the CRDs and not use kubectl apply because a meaningful merge of v1beta1 to v1 is not possible. There is however another gotcha with the webhook for which a replace is an illegal operation as it would remove an already populated clusterIP: failed to replace object: Service "elastic-operator-webhook" is invalid: spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset. A workaround is to just reapply the manifest. I therefore added a separate manifest with just the CRDs as a download.
  • Helm: a similar situation: users have to first upgrade the CRDs with the --force option to replace with v1 they should be able to follow this with a regular helm upgrade of the main chart.
  • OLM: in my testing it seems that OLM does the "right thing" and does correctly upgrade from v1beta1 to v1

Docs preview

https://cloud-on-k8s_4489.docs-preview.app.elstc.co/diff

Notes on the implementation:

  • this PR relies on controller-tools 0.6.0-beta which brings an important bugfix for embedded k8s runtime objects metadata
  • introduced global values in the Helm chart to allow access from the dependent CRD chart, this is a potential breaking change but the old values were marked as internal before as well so I guess it is reasonable to expect users to not rely on them
  • The size of the CRDs with schemas and multiple versions is significant (~ 1.6M) We can discuss whether it would make sense to drop the descriptions. So far I was able to apply the manifests with the description as the individual CRDs are still below the max limit.

Test scenarios:

  • OLM: build custom catalog image and upgrade to latest from a previously released version
  • Helm: v1beta1 to v1: install current chart and upgrade from this PR
  • Helm: verify future v1 to v1 upgrade: manually install v1 manifests and adopt them into a 1.6.0 release and try a regular helm upgrade without force
  • YAML: install operator 1.6.0 and upgrade to this PR
  • YAML: run our release upgrade test through multiple operator upgrades and finally upgrade to this PR

@botelastic botelastic bot added the triage label May 12, 2021
@pebrc pebrc force-pushed the api-version-split branch from 307f231 to af8cc1b Compare May 12, 2021 09:56
@botelastic botelastic bot removed the triage label May 12, 2021
@pebrc pebrc force-pushed the api-version-split branch 3 times, most recently from fde9062 to 1a493c5 Compare May 14, 2021 08:23
@pebrc
Copy link
Collaborator Author

pebrc commented May 14, 2021

run full pr build

@pebrc pebrc force-pushed the api-version-split branch from c148988 to 3f3654a Compare May 14, 2021 11:51
@pebrc pebrc marked this pull request as ready for review May 20, 2021 10:49
@pebrc
Copy link
Collaborator Author

pebrc commented May 20, 2021

Marking this ready for review with the caveat that we have a dependency on #4491

@pebrc pebrc force-pushed the api-version-split branch from e3bf85d to 189ee86 Compare May 25, 2021 11:09
@pebrc pebrc force-pushed the api-version-split branch from 0209439 to e777b1e Compare May 28, 2021 09:22
Copy link
Contributor

@barkbay barkbay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Impressive work 👍
I only left a few nitpicks, I need some additional time to process and test the PR.

config/crds/v1/patches/elasticsearch-patches.yaml Outdated Show resolved Hide resolved
Makefile Show resolved Hide resolved
config/crds/v1/patches/ent-patches.yaml Outdated Show resolved Hide resolved
config/crds/v1/patches/kustomization.yaml Outdated Show resolved Hide resolved
docs/operating-eck/upgrading-eck.asciidoc Outdated Show resolved Hide resolved
docs/operating-eck/upgrading-eck.asciidoc Outdated Show resolved Hide resolved
docs/operating-eck/upgrading-eck.asciidoc Outdated Show resolved Hide resolved
Copy link
Contributor

@barkbay barkbay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: there is a reference to the all in one here.

@pebrc
Copy link
Collaborator Author

pebrc commented Jun 10, 2021

nit: there is a reference to the all in one here.

Good catch, there is another one in the operatorhub tooling. Will fix!

Copy link
Contributor

@barkbay barkbay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ! 🚀

docs/advanced-topics/service-meshes.asciidoc Outdated Show resolved Hide resolved
hack/upgrade-test-harness/README.md Show resolved Hide resolved
hack/operatorhub/main.go Outdated Show resolved Hide resolved
hack/operatorhub/main.go Outdated Show resolved Hide resolved
@pebrc
Copy link
Collaborator Author

pebrc commented Jun 14, 2021

run full pr build

Copy link
Contributor

@thbkrkr thbkrkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work 👏
I left some minor and cosmetic changes.

Makefile Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
Makefile Outdated Show resolved Hide resolved
test/e2e/cmd/run/run.go Show resolved Hide resolved
docs/operating-eck/installing-eck.asciidoc Outdated Show resolved Hide resolved
docs/operating-eck/upgrading-eck.asciidoc Show resolved Hide resolved
docs/operating-eck/upgrading-eck.asciidoc Outdated Show resolved Hide resolved
docs/operating-eck/upgrading-eck.asciidoc Show resolved Hide resolved
@@ -25,3 +25,6 @@
path: /spec/validation/openAPIV3Schema/properties/spec/properties/deployment/properties/strategy/properties/rollingUpdate/properties/maxUnavailable/x-kubernetes-int-or-string
- op: remove
path: /spec/validation/openAPIV3Schema/properties/spec/properties/daemonSet/properties/updateStrategy/properties/rollingUpdate/properties/maxUnavailable/x-kubernetes-int-or-string
# we need to generate x-kubernetes-preserve-unknown-fields for v1 CRDs but they break v1beta so we have to remove them again here
- op: remove
path: /spec/validation/openAPIV3Schema/properties/spec/properties/config/x-kubernetes-preserve-unknown-fields
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normal that we don't have the same count of x-kubernetes-preserve-unknown-fields and // +kubebuilder:pruning:PreserveUnknownFields?

Agent and Beat do not get:

- op: remove
  path: /spec/validation/openAPIV3Schema/properties/spec/properties/daemonSet/properties/podTemplate/x-kubernetes-preserve-unknown-fields
- op: remove
  path: /spec/validation/openAPIV3Schema/properties/spec/properties/deployment/properties/podTemplate/x-kubernetes-preserve-unknown-fields

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's because we remove the whole podTemplate section for the v1beta1 version on the top of the file.

Copy link
Contributor

@thbkrkr thbkrkr Jun 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course.
That's not important but just for my understanding, we could do the same for the following patches where we also remove the whole podTemplate?

v1beta1/patches/apm-kibana-patches.yaml:  path: /spec/validation/openAPIV3Schema/properties/spec/properties/podTemplate/properties
v1beta1/patches/apm-kibana-patches.yaml:  path: /spec/validation/openAPIV3Schema/properties/spec/properties/podTemplate/x-kubernetes-preserve-unknown-fields

v1beta1/patches/elasticsearch-patches.yaml:  path: /spec/validation/openAPIV3Schema/properties/spec/properties/nodeSets/items/properties/podTemplate/properties
v1beta1/patches/elasticsearch-patches.yaml:  path: /spec/validation/openAPIV3Schema/properties/spec/properties/nodeSets/items/properties/podTemplate/x-kubernetes-preserve-unknown-fields

v1beta1/patches/ems-patches.yaml:  path: /spec/validation/openAPIV3Schema/properties/spec/properties/podTemplate/properties
v1beta1/patches/ems-patches.yaml:  path: /spec/validation/openAPIV3Schema/properties/spec/properties/podTemplate/x-kubernetes-preserve-unknown-fields

v1beta1/patches/ent-patches.yaml:  path: /spec/validation/openAPIV3Schema/properties/spec/properties/podTemplate/properties
v1beta1/patches/ent-patches.yaml:  path: /spec/validation/openAPIV3Schema/properties/spec/properties/podTemplate/x-kubernetes-preserve-unknown-fields

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way we remove the podTemplate differs a little bit between Agent and Beats and the other CRDs. Agent and Beats remove the whole podTemplate tree, while the other CRDs only remove the properties from the podTemplate section. This makes the other tweaking necessary. This is nothing this PR introduced but has been in place since the introduction of Beats and Agent. I kept it that way so that the legacy CRDs stay as close to the versions we currently ship as possible.

Copy link
Contributor

@alaudazzi alaudazzi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few minor suggestions for the text, otherwise LGTM.

docs/operating-eck/upgrading-eck.asciidoc Outdated Show resolved Hide resolved
docs/operating-eck/webhook.asciidoc Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants