Statefile Not updating for [GCE_STOCKOUT] errors #6287

Closed

@goobysnack

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.

Terraform Version

v0.12.24

Affected Resource(s)

resource "google_container_node_pool"

Both the google and google-beta providers, version 2.20.3

Terraform Configuration Files

Standard GKE nodepool build. Configuration isn't relevant.

Expected Behavior

If a [GCE_STOCKOUT] error occurs, either:

  • don't build the node pool at all and say so (without halting on the error; just report the stockout so that apply can be run again), or

  • build the node pool and create a statefile entry, with the stocked-out zone marked as pending.

Actual Behavior

Error: Error waiting for creating GKE NodePool: Google Compute Engine: Not all instances running in IGM after 1m12.185384677s. Expect 1. Current errors: [GCE_STOCKOUT]: Instance 'gke-abcdefg-nonprod-cluste-pool-1-g3e954nm-foed' creation failed: The zone 'projects/<removed>/zones/us-central1-b' does not have enough resources available to fulfill the request.  '(resource type:compute)'.

  on .terraform/modules/gcp_gkenodepool_1/main.tf line 16, in resource "google_container_node_pool" "pool":
  16: resource "google_container_node_pool" "pool" {

The above is the error output, which is fine. The issue is that the node pool is still created, with the stocked-out zone left pending, but NO statefile entry is created. That is a catch-22: you can't simply re-run apply without first deleting the entire node pool. If 2 of 3 zones succeed, the node pool creation and its statefile entry shouldn't be abandoned entirely.

Steps to Reproduce

  1. terraform apply

@ghost ghost added the bug label May 4, 2020
@venkykuberan venkykuberan self-assigned this May 4, 2020
@goobysnack
Author

FYI: the node pool eventually finishes creating its instances, but because there is no entry in the statefile, the next plan/apply tries to create a node pool that already exists.
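
A possible way to reconcile the state without deleting the node pool is to import the existing pool once it has finished creating its instances. The sketch below is untested against this exact failure. The project, cluster, and pool names are hypothetical placeholders. The module address is inferred from the module path in the error output. The project/location/cluster/name ID form should be checked against the import section of the provider documentation for the version in use.

```shell
# Hypothetical values; substitute the real project, location, cluster, and pool.
PROJECT="my-gcp-project"
LOCATION="us-central1"
CLUSTER="my-cluster"
POOL="pool-1"

# Import ID in the project/location/cluster/name form (assumed; verify in the
# google_container_node_pool documentation for your provider version).
IMPORT_ID="${PROJECT}/${LOCATION}/${CLUSTER}/${POOL}"

# Resource address, assuming the module call is named gcp_gkenodepool_1 as the
# path in the error output suggests.
ADDRESS="module.gcp_gkenodepool_1.google_container_node_pool.pool"

# Print the command for review instead of running it directly.
echo "terraform import ${ADDRESS} ${IMPORT_ID}"
```

If the import succeeds, a subsequent terraform plan should show whether the configuration and the imported node pool still differ.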

@venkykuberan
Contributor

venkykuberan commented May 5, 2020

Is it a one-time event, or are you seeing it consistently?

@goobysnack
Author

Is it a one-time event, or are you seeing it consistently?

Every time there is a stockout, it fails to create or update the statefile.

@ghost

ghost commented Jun 5, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Jun 5, 2020