Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Perpetual diff on Deployment and crash on apply for OCI chart with manifest experiment enabled #1402

Open
Roberdvs opened this issue Jul 12, 2024 · 2 comments
Labels

Comments

@Roberdvs
Copy link

Roberdvs commented Jul 12, 2024

Terraform, Provider, Kubernetes and Helm Versions

Terraform version: 1.5.3
Provider version: >= 2.13.0
Kubernetes version: EKS 1.29

Affected Resource(s)

  • helm_release

Terraform Configuration Files

provider "helm" {
  experiments {
    manifest = true
  }
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority[0].data)
    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      command     = "aws"
      args        = ["eks", "get-token", "--cluster-name", var.cluster_name, "--region", var.aws_region, "--role-arn", var.aws_role_arn]
    }
  }
}

resource "helm_release" "release" {

  name       = "kubecost"
  chart      = "cost-analyzer"
  repository = "oci://public.ecr.aws/kubecost"
  version          = "2.3.2"
  namespace        = "kubecost"
  create_namespace = true
}

Debug Output

NOTE: In addition to Terraform debugging, please set HELM_DEBUG=1 to enable debugging info from helm.

╷
│ Error: Provider produced inconsistent final plan
│ 
│ When expanding the plan for module.kubecost.helm_release.release to include
│ new values learned so far during apply, provider
│ "registry.terraform.io/hashicorp/helm" produced an invalid new value for
│ .manifest: was
...
OMITTED, too long to paste here
...
│ 
│ This is a bug in the provider, which should be reported in the provider's
│ own issue tracker.

Panic Output

Steps to Reproduce

I have seen this with Kubecost OCI chart, but might be reproducible with others

  1. Enable manifest diff experiment in the provider
  2. Run terraform apply

If the resource already exists, it shows this perpetual diff on every plan

image

and then crashes on apply with the error above

References

This got released on 2.13 and is probably related:

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@Roberdvs Roberdvs added the bug label Jul 12, 2024
@appilon appilon removed their assignment Jul 12, 2024
@linuxdaemon
Copy link

linuxdaemon commented Jul 24, 2024

EDIT: Never mind, this label is part of my specific chart's template, it is just using .Release.Revision and helm computes the templates with the revision incremented for diffs.

I can confirm this is the case on 2.14.0, terraform 1.9.2, though I'm seeing a diff on the helm-revision label. Definitely seems to be affecting OCI charts, specifically my resource definition was:

resource "helm_release" "volsync" {
  create_namespace = true
  atomic           = true
  cleanup_on_fail  = true

  chart     = "oci://tccr.io/truecharts/volsync"
  name      = "volsync"
  namespace = "volsync"

  version = "2.2.0"

  values = [
    yamlencode({
      metrics = {
        main = {
          enabled = false
        }
      }
    })
  ]

  depends_on = [helm_release.snapshot_controller]
}

Looking through the terraform state, it seems that on OCI charts the state with manifests stores the helm-revision label for each resource, but non-OCI charts seem not to store it, which I assume leads to the state desync.

Actually, I just tested this with the helm CLI and it seems that the issue stems from there, in a dry-run upgrade of the OCI chart, resources all have the helm-revision label (incremented from the live value by one), but the non-OCI chart does not show the helm-revision label at all. So this may actually be a core helm issue

@wondersd
Copy link

Ive noticed this too in other truechart helm charts. They all seem to use this pattern of embedding the .Release.Revision as a pod label, making none of them compatible with manifest. I've also come across this type of pattern in other charts as well, seems to be a not terribly uncommon practice or perhaps one growing in popularity.

the kubernetes_manifest provider can have similar problems when it comes to things like labels being injected onto resources post creation and has a work around with computed fields

Perhaps the same tact could be taken here where a yq style path can be used to denote fields that will always be different so they can be ignored for purposes of computing differences.

resource "helm_release" "this" {
  ...
  computed_fields = [
    "kubecost/deployment.apps/apps/v1/kubecost-cost-analyzer.spec.template.metadata.labels.helm-rollout-restarter"
  ]
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants