node_pools don't support regional clusters #1300
Comments
@danawillow @ashish-amarnath I tried!
I'll take this up @Stono
Thank you oh random stranger @darrenhaken who definitely doesn't work in my office :)
I'm also encountering this bug.
This PR also switched us to using the beta API in all cases, and that had a side effect which is worth noting, note included here for posterity.

=====

The problem is, we add a GPU, and as per the docs, GKE adds a taint to the node pool saying "don't schedule here unless you tolerate GPUs", which is pretty sensible. Terraform doesn't know about that, because it didn't ask for the taint to be added. So after apply, on refresh, it sees the state of the world (1 taint) and the state of the config (0 taints) and wants to set the world equal to the config. This introduces a diff, which makes the test fail - tests fail if there's a diff after they run.

Taints are a beta feature, though. :) And since the config doesn't contain any taints, terraform didn't see any beta features in that node pool ... so it used to send the request to the v1 API. And since the v1 API didn't return anything about taints (since they're a beta feature), terraform happily checked the state of the world (0 taints I know about) vs the config (0 taints), and all was well.

This PR makes every node pool refresh request hit the beta API. So now terraform finds out about the taints (which were always there) and the test fails (which it always should have done).

The solution is probably to write a little bit of code which suppresses the report of the diff of any taint with value 'nvidia.com/gpu', but only if GPUs are enabled. I think that's something that can be done.
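For illustration only, here is a rough sketch of the suppression idea described above. It is written in Python rather than the provider's Go, and it is not the provider's actual code. The function names, the dictionary shape of a taint, and the choice to match 'nvidia.com/gpu' as the taint key (the note above says "value") are all assumptions made for the example.

```python
from typing import Dict, List

# Assumption: the GKE-added GPU taint is identified by this key.
GPU_TAINT_KEY = "nvidia.com/gpu"

Taint = Dict[str, str]  # hypothetical shape: {"key": ..., "value": ..., "effect": ...}


def effective_taints(config_taints: List[Taint],
                     api_taints: List[Taint],
                     gpus_enabled: bool) -> List[Taint]:
    """Return the API-reported taints with the automatic GPU taint hidden.

    The GPU taint is hidden only when the pool has GPUs and the user did
    not declare that taint themselves.
    """
    if not gpus_enabled:
        # No GPUs configured: an unexpected GPU taint is a real diff.
        return list(api_taints)
    return [
        t for t in api_taints
        if t.get("key") != GPU_TAINT_KEY or t in config_taints
    ]


def has_taint_diff(config_taints: List[Taint],
                   api_taints: List[Taint],
                   gpus_enabled: bool) -> bool:
    """True if a refresh would still report a taint diff after suppression."""
    return config_taints != effective_taints(config_taints, api_taints, gpus_enabled)
```

With a check like this, a GPU node pool whose config lists no taints would show no diff on refresh, while any other unexpected taint would still be reported.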
This works now so closing the issue :-) ta
…#1320)
Hey,
We're trying to rebuild our clusters using the new regional clusters added in 1.9.0. However, we use google_container_node_pool to add custom pools to our cluster, and the node pools do not work against regional clusters:
Proposal
I think the google_container_node_pool resource should be changed with:
Associated issue: #829
Associated PR for master regional clusters: #1181