Terraform state saved on errored runs #881

Closed
laupow opened this issue Jun 8, 2022 · 3 comments

laupow commented Jun 8, 2022

I have observed the Terraform Helm provider incorrectly saving state on apply runs that fail due to authorization issues.

When the Kubernetes API server rejects requests because the auth token has expired, the Terraform run fails. This is fine and expected; it usually happens when more than 15 minutes pass between plan and apply.

However, on these failed runs the Terraform state still gets updated as if there had been no failure. That means I can't simply reapply to fix the issue; I have to revert the changes, apply, then replay the changes to work around the invalid state.
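For illustration, a heavier-handed recovery than replaying the changes (a sketch only, not something from the runs above) would be to force replacement of the release so the cluster converges with the configuration even though the state claims it is already up to date:

# Hypothetical manual recovery, run against the same workspace.
# -replace forces Terraform to destroy and recreate the release
# despite the state reporting it as up to date.
terraform apply -replace="helm_release.metrics_server"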

Terraform, Provider, Kubernetes and Helm Versions

Terraform version: 1.1.9
Provider version: 2.5.1
Kubernetes version: 1.21.12-eks-a64ea69

Affected Resource(s)

  • helm_release

Terraform Configuration Files

a snippet:

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
    token                  = data.aws_eks_cluster_auth.cluster.token
  }
}

resource "kubernetes_namespace" "metrics_server" {
  metadata {
    name = "metrics-server"
  }
}

resource "helm_release" "metrics_server" {
  name       = "metrics-server"
  chart      = "metrics-server"
  repository = "https://kubernetes-sigs.github.io/metrics-server/"
  version    = "^3.8.2"
  namespace  = kubernetes_namespace.metrics_server.metadata[0].name

  values = [yamlencode({
    podLabels = {
      "company.com/customer-impact" = "high"
      "company.com/environment"     = var.environment
      "company.com/team"            = "team"
    }
  })]
}
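For completeness, a possible mitigation for the expired-token failure itself (a sketch only; this is not the configuration above, and it assumes the AWS CLI is available on the Terraform Cloud worker) is to have the provider fetch a fresh token at apply time through an exec plugin instead of passing a data-source token that can expire between plan and apply:

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)

    # Fetch a short-lived token when the provider actually connects,
    # rather than reusing the token captured during plan.
    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      command     = "aws"
      args        = ["eks", "get-token", "--cluster-name", data.aws_eks_cluster.cluster.name]
    }
  }
}

That would not change the behaviour reported here (state still being saved on an errored apply), but it avoids the credential expiry that triggers it.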

Debug Output

NOTE: In addition to Terraform debugging, please set HELM_DEBUG=1 to enable debugging info from helm.

The job ran in Terraform Cloud. Some output logs:

Terraform v1.1.9
on linux_amd64
Initializing plugins and modules...
helm_release.metrics_server: Modifying... [id=metrics-server]

╷
│ Error: Kubernetes cluster unreachable: the server has asked for the client to provide credentials
│ 
│   with helm_release.metrics_server,
│   on metrics-server.tf line 7, in resource "helm_release" "metrics_server":
│    7: resource "helm_release" "metrics_server" {
│ 
╵

Panic Output

None.

Steps to Reproduce

  1. In Terraform Cloud, do a plan run
  2. Wait 15 minutes for the temporary EKS token to expire
  3. Click Apply
  4. Calls to the Kubernetes API fail as Unauthorized since the token expired
  5. Helm provider fails: "Kubernetes cluster unreachable: the server has asked for the client to provide credentials"
  6. But the Terraform state still gets updated as if there was no failure
  7. Rerunning the Terraform plan reports "No changes", even though the API server rejected the requests and the apply errored (see the sketch after these steps)

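One way to confirm the stale state from steps 6–7 (a sketch; commands run locally against the same workspace):

# Shows the attributes Terraform believes were applied,
# even though the upgrade never reached the cluster.
terraform state show helm_release.metrics_server

# Reports "No changes" despite the failed apply.
terraform plan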
Expected Behavior

No state changes should have persisted

Actual Behavior

State changes persisted on the failed run

Important Factoids

This has happened to me multiple times in the past.

References

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
laupow added the bug label Jun 8, 2022
laupow (Author) commented Jun 8, 2022

Duplicate of #828, it seems.

BBBmau (Contributor) commented Jun 14, 2022

Thank you for bringing this issue up! We reviewed a recent PR (#857) which fixes the issue and have merged it into main.

BBBmau closed this as completed Jun 14, 2022
github-actions bot commented

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Jul 15, 2022