If an instance group managed by a GKE node pool is manually removed, terraform plan fails to refresh #4314
Is there a workaround available for this? I've looked at my terraform state as well and cannot find any lingering reference to the removed instance group.
The IGM URL that it's looking up comes directly from the cluster or node pool response from the GCP API, so it's surprising that the groups are still there if that's removed. Do either of you have debug logs (env var TF_LOG=DEBUG) that you could share from a run that had this error? Those will have the API requests/responses sent to/from GCP, and should have more data on where those URLs are coming from.
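For reference, TF_LOG and TF_LOG_PATH are Terraform's standard environment variables for capturing debug output; a minimal sketch (the log file name is arbitrary, and the run is guarded so it's a no-op where terraform isn't installed):

```shell
# Write debug logs, including provider API requests/responses, to a file.
export TF_LOG=DEBUG
export TF_LOG_PATH=./terraform-debug.log

# Guard: only run if the terraform CLI is available.
if command -v terraform >/dev/null 2>&1; then
  terraform plan
fi
```

Remember to redact project IDs and other sensitive details before sharing the resulting log.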
@jacobstr This seems to be a different issue.
Agreed, that one seems to be regarding node taints. If it's still causing issues then I'd recommend filing a new issue with the bug template fully filled out (config, plan output, debug logs), keeping in mind that any fixes we make on our side (if it's determined to be a bug in the provider) will likely land in the 3.X series of the provider. For this one specifically, I'm going to put the waiting-response label back on until we're able to see debug logs from a repro.
Deleted my earlier comment - indeed a different issue.
So - I was able to reproduce this by modifying the node locations on an existing cluster. The output below is fairly butchered from the original due to attempts at obscuring/redacting details; I hope I've left the substantive points intact.
I'm all 👂.
We're seeing similar behavior on a cluster where we enabled Batch on GKE, which dynamically creates and destroys instance groups. That's a lot more concerning, since it implies that we can't use Terraform to manage the node pools of a cluster with Batch enabled.
Thanks everyone for your patience; coming back to this now. One way I could fix this is to just not include any instance group URL that 404s in the refreshed state. If I did that, does that sound like something that would be sufficient?
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!
Community Note
Terraform Version
Affected Resource(s)
google_container_node_pool
Terraform Configuration Files
Basic GKE cluster, such as described here: https://www.terraform.io/docs/providers/google/r/container_cluster.html
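A minimal configuration of the kind linked above might look like the following sketch (the resource names, cluster name, and location are placeholders, not taken from the issue):

```hcl
resource "google_container_cluster" "primary" {
  name     = "example-cluster"
  location = "europe-west1-b"

  # Manage node pools as separate resources rather than via the default pool.
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "primary" {
  name       = "example-pool"
  cluster    = google_container_cluster.primary.name
  location   = google_container_cluster.primary.location
  node_count = 1
}
```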
Expected Behavior
If an instance group managed by a node pool is removed, the node pool should be replaced or healed in some other way.
Actual Behavior
Running terraform plan displays

Error: Error reading instance group manager returned as an instance group URL: "googleapi: Error 404: The resource 'projects/$proj/zones/europe-west1-b/instanceGroupManagers/gke-$cluster-default-af15db45-grp' was not found, notFound"

and exits with a non-zero code.

Steps to Reproduce
Manually remove an instance group managed by the node pool, then run terraform plan.
The error does not go away until the node pool is removed manually.
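One possible workaround sketch, assuming the node pool is tracked under a placeholder state address (the address is an assumption, not from the issue): drop the pool from Terraform state with `terraform state rm` so a subsequent plan/apply can recreate it, rather than deleting the pool by hand. The run is guarded so it's a no-op where terraform isn't installed.

```shell
# Placeholder resource address; adjust to match your configuration.
POOL_ADDR="google_container_node_pool.primary"

# Guard: only run if the terraform CLI is available.
if command -v terraform >/dev/null 2>&1; then
  terraform state rm "$POOL_ADDR"
  terraform plan
fi
```

Note this only clears Terraform's view of the pool; whether recreating it is acceptable depends on your workloads.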