Revamp controller manager integration #14

Closed
faisal-memon opened this issue Oct 3, 2023 · 12 comments

@faisal-memon
Collaborator

The current controller manager integration has a number of issues:

  • The CRD and CRs are deployed together. This is problematic for CI/CD workflows.
  • Other projects ship the CRD as a separate chart, installed separately.
  • Helm won't automatically upgrade if the controller-manager CRD changes. This is possibly related to the controller manager CRD not being versioned when it changes.

Original issue: spiffe/helm-charts#427

@edwbuck
Collaborator

edwbuck commented Oct 3, 2023

Ideally, the CRD should be versioned such that both versions of the CRD can be deployed at the same time.

This permits old CRs to linger in the system while they are being migrated to the new CRD format, if such upgrade support is needed.

A plan like this will require coordination with the SPIRE Controller Manager in the SPIRE project: the Controller Manager would check for all supported CRD versions, shut down if they are not present, and, when they are present, listen on all of them.
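
For illustration, a minimal sketch of a CRD serving two versions at once; the group, kind, and the v1beta1 version name are assumptions for the example, not the actual chart manifest:

# Illustrative only: a CustomResourceDefinition can serve multiple versions
# simultaneously; exactly one version is marked as the storage version.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: clusterspiffeids.spire.spiffe.io
spec:
  group: spire.spiffe.io
  names:
    kind: ClusterSPIFFEID
    plural: clusterspiffeids
  scope: Cluster
  versions:
    - name: v1alpha1       # older version, still served for lingering CRs
      served: true
      storage: false
      schema:
        openAPIV3Schema:
          type: object
          x-kubernetes-preserve-unknown-fields: true
    - name: v1beta1        # hypothetical newer version
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          x-kubernetes-preserve-unknown-fields: true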

The primary reason the CRD is not updated when new versions of it ship is likely that its Object Identifier (spec.versions / metadata.name / metadata.namespace) does not change when the CRD's spec changes. When Helm checks whether the CRD was previously deployed, it finds a CRD with the same Object Identifier present, so it determines that an upgrade is not necessary. In our scenario the old and new versions both have the same Object Identifier, but their spec sections differ. I do not believe Helm performs a deep comparison between the two items, as the spec section quickly wraps the custom object (CR) definitions, which is where the relevant changes we want to apply are located.

@kfox1111
Collaborator

kfox1111 commented Oct 3, 2023

When using Helm 3 with the crds directory, Helm only checks whether each object's metadata.name in that directory already exists in the cluster and, if not, applies that CRD. Very basic logic, so any updates to the objects won't be applied at all. This was by design. Details at https://helm.sh/docs/chart_best_practices/custom_resource_definitions/#some-caveats-and-explanations

@kfox1111
Collaborator

kfox1111 commented Oct 4, 2023

The easiest solution would be to move the CRDs out to their own chart, named something like spire-crds, and put them in its templates directory. Everyone would install/upgrade the CRDs chart first, then the main spire one (or use the raw CRD manifests instead of the CRDs chart).

If we wanted it to still work as just one chart to install/upgrade, there are a couple of options, but both require Helm hooks to function, so we'd likely still need the option of deploying two charts for users who can't run Helm hooks.

The two main variants would be:

  1. Helm pre hooks for installing/upgrading the CRDs. There is a lot of data in the CRDs, so it would be a bit difficult to mash them in via a hook. OPA Gatekeeper builds a hook like this and has a special image with the CRDs baked into it along with kubectl. A drawback to that approach is that, as an end user, you can't easily see what CRDs it is going to upload. I really like to helm diff and see what an upgrade is going to do.

  2. Helm post hook for installing/updating the ClusterSPIFFEID. We only have one of them, it's a small object, and it very rarely changes. It could also simplify the other Helm hooks, as there are multiple that try to work around the ClusterSPIFFEID not being usable until the spire-server is functional. With a job, an init container could wait-for-it on the spire-server coming up before kubectl applying the ClusterSPIFFEID, removing the need to mess with webhook configurations at all. This might require a spire-clusterspiffeid chart instead of a spire-crds chart for those not wanting to run Helm hooks. A rough sketch of such a hook job follows this list.
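
A minimal sketch of what that post-hook job might look like; the service account, ConfigMap, images, service name, and port below are all assumptions for illustration, not the actual chart's manifests:

# Hypothetical Helm post-install/post-upgrade hook Job (illustrative only).
apiVersion: batch/v1
kind: Job
metadata:
  name: apply-clusterspiffeid
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      serviceAccountName: clusterspiffeid-applier  # assumed SA with RBAC to apply the CR
      restartPolicy: OnFailure
      initContainers:
        # Wait until the spire-server service answers before applying the CR.
        - name: wait-for-spire-server
          image: busybox:1.36
          command: ["sh", "-c", "until nc -z spire-server 8081; do sleep 2; done"]
      containers:
        - name: apply
          image: bitnami/kubectl:1.28
          command: ["kubectl", "apply", "-f", "/manifests/clusterspiffeid.yaml"]
          volumeMounts:
            - name: manifests
              mountPath: /manifests
      volumes:
        - name: manifests
          configMap:
            name: clusterspiffeid-manifest  # assumed ConfigMap holding the rendered CR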

@marcofranssen
Collaborator

marcofranssen commented Oct 4, 2023

3 possible options:

  1. Keep the chart as is; install the CRDs via kubectl apply and install the chart with installCrds: false.
  2. Move the CRDs to a separate Helm chart; install both Helm charts.
  3. Add a dependency on the CRDs chart, gated by an installCRDs value.
    This is an extension of 2. that allows for optional management via Helm hooks to install the CRDs first (see the Chart.yaml sketch after this list).
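
Roughly, option 3's dependency could look like this in the spire chart's Chart.yaml; the chart name, version, and repository URL are placeholders:

# Hypothetical Chart.yaml excerpt for option 3; names and versions are placeholders.
dependencies:
  - name: spire-crds
    version: "0.1.0"
    repository: "https://spiffe.github.io/helm-charts"
    condition: installCRDs    # only pull in the CRDs chart when installCRDs is true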

We can approach this in 3 PRs.

  1. Document the upgrade to the new version in README.md and the release notes.
  2. Move the CRDs to a separate chart and adjust documentation/release notes.
  3. Add the dependency to our spire chart with the optional flag to manage the CRDs.

ArgoCD examples:

Option 1 - Prometheus

{{- if .Values.kubePrometheusStack.enabled -}}
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kube-prometheus-stack-crds
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "2"
    argocd.argoproj.io/manifest-generate-paths: .
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  destination:
    server: {{ .Values.kubePrometheusStack.destination.server }}
    namespace: {{ .Values.kubePrometheusStack.destination.namespace }}
  project: platform-monitoring
  source:
    repoURL: https://github.com/prometheus-community/helm-charts
    targetRevision: kube-prometheus-stack-{{ .Values.kubePrometheusStack.source.targetRevision }}
    path: charts/kube-prometheus-stack/crds/
    directory:
      recurse: true
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
      - Replace=true
    automated:
      prune: true
      selfHeal: true
{{- end -}}
---
{{- if .Values.kubePrometheusStack.enabled -}}
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: kube-prometheus-stack
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "3"
    argocd.argoproj.io/manifest-generate-paths: /charts/monitoring
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  destination:
    server: {{ .Values.kubePrometheusStack.destination.server }}
    namespace: {{ .Values.kubePrometheusStack.destination.namespace }}
  project: platform-monitoring
  source:
    repoURL: https://prometheus-community.github.io/helm-charts
    targetRevision: {{ .Values.kubePrometheusStack.source.targetRevision }}
    helm:
      skipCrds: true
      values: {{ .Files.Get "config/kube-prometheus-stack-values.yaml" | toYaml | indent 6 }}
      parameters: []
    chart: kube-prometheus-stack
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
    automated:
      prune: true
      selfHeal: true
  ignoreDifferences:
    - group: admissionregistration.k8s.io
      kind: MutatingWebhookConfiguration
      name: kps-admission
      jqPathExpressions:
        - .webhooks[] | select(.name == "prometheusrulemutate.monitoring.coreos.com") | .failurePolicy
    - group: admissionregistration.k8s.io
      kind: ValidatingWebhookConfiguration
      name: kps-admission
      jqPathExpressions:
        - .webhooks[] | select(.name == "prometheusrulemutate.monitoring.coreos.com") | .failurePolicy
{{- end -}}

Option 1 - Kyverno

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

namespace: kyverno

resources:
  - https://github.com/kyverno/kyverno/releases/download/v1.9.4/kyverno.io_admissionreports.yaml
  - https://github.com/kyverno/kyverno/releases/download/v1.9.4/kyverno.io_backgroundscanreports.yaml
  - https://github.com/kyverno/kyverno/releases/download/v1.9.4/kyverno.io_cleanuppolicies.yaml
  - https://github.com/kyverno/kyverno/releases/download/v1.9.4/kyverno.io_clusteradmissionreports.yaml
  - https://github.com/kyverno/kyverno/releases/download/v1.9.4/kyverno.io_clusterbackgroundscanreports.yaml
  - https://github.com/kyverno/kyverno/releases/download/v1.9.4/kyverno.io_clustercleanuppolicies.yaml
  - https://github.com/kyverno/kyverno/releases/download/v1.9.4/kyverno.io_clusterpolicies.yaml
  - https://github.com/kyverno/kyverno/releases/download/v1.9.4/kyverno.io_generaterequests.yaml
  - https://github.com/kyverno/kyverno/releases/download/v1.9.4/kyverno.io_policies.yaml
  - https://github.com/kyverno/kyverno/releases/download/v1.9.4/kyverno.io_policyexceptions.yaml
  - https://github.com/kyverno/kyverno/releases/download/v1.9.4/kyverno.io_updaterequests.yaml
  - https://github.com/kyverno/kyverno/releases/download/v1.9.4/wgpolicyk8s.io_clusterpolicyreports.yaml
  - https://github.com/kyverno/kyverno/releases/download/v1.9.4/wgpolicyk8s.io_policyreports.yaml

The Kyverno chart would be handled very similarly in ArgoCD, except you don't recurse the files in a crds folder but just install the Helm chart as is.

All of the above examples use an App of Apps pattern in our ArgoCD repo:

  • Prometheus
    • prometheus-crds (sync-wave 1)
    • prometheus-chart (sync-wave 2)
  • Trivy (same approach as Prometheus)
    • trivy-crds (sync-wave 1)
    • trivy-chart (sync-wave 2)
  • Kyverno
    • kyverno-crds (sync-wave 1)
      Kyverno does not make the CRDs available as plain text in the repo, so we can't use the recurse approach from the other examples; instead we download the CRDs from the release assets using Kustomize.
    • kyverno-chart (sync-wave 2)

In essence, all the charts mentioned here are just using option 1, which gives us full control over the CRDs without requesting any custom approaches from the chart maintainers. My preferred choice would therefore be option 1, so we don't overcomplicate things with all kinds of edge cases to handle.
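
Translated to this repo, an Application for the CRDs might look roughly like this; the repoURL, revision, path, and destination are assumptions for illustration, not the actual repo layout:

# Hypothetical spire-crds Application following the same option 1 pattern.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: spire-crds
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "1"
spec:
  destination:
    server: https://kubernetes.default.svc
    namespace: spire
  project: default
  source:
    repoURL: https://github.com/spiffe/helm-charts  # assumed location of the chart sources
    targetRevision: main
    path: charts/spire/crds/
    directory:
      recurse: true
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
      - Replace=true        # replace instead of apply, so CRD updates go through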

@kfox1111
Collaborator

kfox1111 commented Oct 4, 2023

@marcofranssen With your option 1, we'd require users to kubectl apply the CRDs before upgrading the Helm chart?

@kfox1111
Collaborator

kfox1111 commented Oct 4, 2023

@marcofranssen what do you think of my option 2?

@faisal-memon
Collaborator Author

Reading through the discussion here brings up another issue for me: the actual ClusterSPIFFEID. Should that be the separate chart? That's what we interact with more. The CRD itself seems to me to be more in the realm of the statefulset, configmaps, etc.: general plumbing we can hide.

Is it possible to keep the CRD in the same chart as a template so it gets updated with everything else, and then split the ClusterSPIFFEID definition off into a separate chart? Then we can version that and add new fields as necessary.
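
For reference, the object in question is small; a minimal ClusterSPIFFEID looks roughly like this (the name and template string here are illustrative):

# Illustrative minimal ClusterSPIFFEID custom resource.
apiVersion: spire.spiffe.io/v1alpha1
kind: ClusterSPIFFEID
metadata:
  name: default
spec:
  # Assigns each workload a SPIFFE ID based on its namespace and service account.
  spiffeIDTemplate: "spiffe://{{ .TrustDomain }}/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}"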

@kfox1111
Collaborator

kfox1111 commented Oct 5, 2023

Reading through the discussion here brings up another issue for me: the actual ClusterSPIFFEID. Should that be the separate chart? That's what we interact with more. The CRD itself seems to me to be more in the realm of the statefulset, configmaps, etc.: general plumbing we can hide.

Is it possible to keep the CRD in the same chart as a template so it gets updated with everything else, and then split the ClusterSPIFFEID definition off into a separate chart? Then we can version that and add new fields as necessary.

Yeah. That could be done.

@faisal-memon
Collaborator Author

@marcofranssen What do you think of my proposal?

@marcofranssen
Collaborator

@marcofranssen What do you think of my proposal?

It sounds like an interesting approach to try, although I don't fully understand what you are describing there. Maybe you can give us a voice-over.

@marcofranssen
Collaborator

Reading through the discussion here brings up another issue for me: the actual ClusterSPIFFEID. Should that be the separate chart? That's what we interact with more. The CRD itself seems to me to be more in the realm of the statefulset, configmaps, etc.: general plumbing we can hide.

Is it possible to keep the CRD in the same chart as a template so it gets updated with everything else, and then split the ClusterSPIFFEID definition off into a separate chart? Then we can version that and add new fields as necessary.

What would be the idea of versioning this? In the end we are trying to resolve a single apply-time upgrade path, x -> y or y -> z, meaning that after the chart upgrade has run, any previous versions don't matter anymore. This feels like overcomplicating what we are trying to achieve, or maybe I'm not following completely.

@kfox1111
Collaborator

Fixed in #8
