Perpetual diff on Deployment and crash on apply for OCI chart with manifest experiment enabled #1402

Roberdvs · 2024-07-12T10:42:55Z

Terraform, Provider, Kubernetes and Helm Versions

Terraform version: 1.5.3
Provider version: >= 2.13.0
Kubernetes version: EKS 1.29

Affected Resource(s)

helm_release

Terraform Configuration Files

provider "helm" {
  experiments {
    manifest = true
  }
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      command     = "aws"
      args        = ["eks", "get-token", "--cluster-name", var.cluster_name, "--region", var.aws_region, "--role-arn", var.aws_role_arn]
    }
  }
}

resource "helm_release" "release" {

  name       = "kubecost"
  chart      = "cost-analyzer"
  repository = "oci://public.ecr.aws/kubecost"
  version          = "2.3.2"
  namespace        = "kubecost"
  create_namespace = true
}

Debug Output

NOTE: In addition to Terraform debugging, please set HELM_DEBUG=1 to enable debugging info from helm.

╷
│ Error: Provider produced inconsistent final plan
│ 
│ When expanding the plan for module.kubecost.helm_release.release to include
│ new values learned so far during apply, provider
│ "registry.terraform.io/hashicorp/helm" produced an invalid new value for
│ .manifest: was
...
OMITTED, too long to paste here
...
│ 
│ This is a bug in the provider, which should be reported in the provider's
│ own issue tracker.

Panic Output

Steps to Reproduce

I have seen this with Kubecost OCI chart, but might be reproducible with others

Enable manifest diff experiment in the provider
Run terraform apply

If the resource already exists, it shows this perpetual diff on every plan

and then crashes on apply with the error above

References

This got released on 2.13 and is probably related:

Fix manifest diffs for OCI charts #1326

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
If you are interested in working on this issue or have submitted a pull request, please leave a comment

The text was updated successfully, but these errors were encountered:

linuxdaemon · 2024-07-24T10:50:48Z

EDIT: Never mind, this label is part of my specific chart's template, it is just using .Release.Revision and helm computes the templates with the revision incremented for diffs.

~~I can confirm this is the case on 2.14.0, terraform 1.9.2, though I'm seeing a diff on the helm-revision label. Definitely seems to be affecting OCI charts, specifically my resource definition was:~~

resource "helm_release" "volsync" {
  create_namespace = true
  atomic           = true
  cleanup_on_fail  = true

  chart     = "oci://tccr.io/truecharts/volsync"
  name      = "volsync"
  namespace = "volsync"

  version = "2.2.0"

  values = [
    yamlencode({
      metrics = {
        main = {
          enabled = false
        }
      }
    })
  ]

  depends_on = [helm_release.snapshot_controller]
}

Looking through the terraform state, it seems that on OCI charts the state with manifests stores the helm-revision label for each resource, but non-OCI charts seem not to store it, which I assume leads to the state desync.

Actually, I just tested this with the helm CLI and it seems that the issue stems from there, in a dry-run upgrade of the OCI chart, resources all have the helm-revision label (incremented from the live value by one), but the non-OCI chart does not show the helm-revision label at all. So this may actually be a core helm issue

wondersd · 2024-08-17T01:11:50Z

Ive noticed this too in other truechart helm charts. They all seem to use this pattern of embedding the .Release.Revision as a pod label, making none of them compatible with manifest. I've also come across this type of pattern in other charts as well, seems to be a not terribly uncommon practice or perhaps one growing in popularity.

the kubernetes_manifest provider can have similar problems when it comes to things like labels being injected onto resources post creation and has a work around with computed fields

Perhaps the same tact could be taken here where a yq style path can be used to denote fields that will always be different so they can be ignored for purposes of computing differences.

resource "helm_release" "this" {
  ...
  computed_fields = [
    "kubecost/deployment.apps/apps/v1/kubecost-cost-analyzer.spec.template.metadata.labels.helm-rollout-restarter"
  ]
}

Roberdvs added the bug label Jul 12, 2024

github-actions bot assigned appilon Jul 12, 2024

appilon removed their assignment Jul 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Perpetual diff on Deployment and crash on apply for OCI chart with manifest experiment enabled #1402

Perpetual diff on Deployment and crash on apply for OCI chart with manifest experiment enabled #1402

Roberdvs commented Jul 12, 2024 •

edited

Loading

linuxdaemon commented Jul 24, 2024 •

edited

Loading

wondersd commented Aug 17, 2024

Perpetual diff on Deployment and crash on apply for OCI chart with manifest experiment enabled #1402

Perpetual diff on Deployment and crash on apply for OCI chart with manifest experiment enabled #1402

Comments

Roberdvs commented Jul 12, 2024 • edited Loading

Terraform, Provider, Kubernetes and Helm Versions

Affected Resource(s)

Terraform Configuration Files

Debug Output

Panic Output

Steps to Reproduce

References

Community Note

linuxdaemon commented Jul 24, 2024 • edited Loading

wondersd commented Aug 17, 2024

Roberdvs commented Jul 12, 2024 •

edited

Loading

linuxdaemon commented Jul 24, 2024 •

edited

Loading