Community Note

- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
- Please do not leave "+1" or "me too" comments; they generate extra noise for issue followers and do not help prioritize the request.
- If you are interested in working on this issue or have submitted a pull request, please leave a comment. If the issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated or is planned to be autogenerated soon. If the issue is assigned to a user, that user is claiming responsibility for the issue. If the issue is assigned to "hashibot", a community member has claimed the issue already.
Description
When scaling down a Dataproc cluster with Terraform by updating the number of worker nodes, the instances are currently terminated immediately. Any work running on those nodes is lost, which may lead to job failures. The Dataproc API supports graceful decommissioning: there is a gracefulDecommissionTimeout parameter that should be exposed as part of the resource and honored when updating the number of nodes (this really only matters when scaling down).

Right now, to safely scale down a cluster with jobs running, you have to make the update request manually, thus diverging from your Terraform state, which is obviously problematic for the next terraform apply.

As we add Terraform support for Dataproc autoscaling policies, where one can also set a graceful decommission timeout, there is potential for a confusing interface. For non-autoscaling clusters there is a strong need for this as a top-level property, but it should be documented when each timeout is respected. For example, the top-level graceful decommission timeout should apply to actions taken by Terraform due to an update to the number of workers, while the timeout within the autoscaling policy dictates the timeout used when the Dataproc service autoscales your cluster.
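For reference, the manual workaround described above might look like the following gcloud invocation (a sketch; the cluster name, region, and worker count are placeholders):

```shell
# Scale down with a graceful decommission timeout via gcloud, outside Terraform.
# Dataproc waits up to the timeout for running work to finish before removing nodes.
gcloud dataproc clusters update my-cluster \
  --region=us-central1 \
  --num-workers=2 \
  --graceful-decommission-timeout=90m
# Note: Terraform state now diverges from the real cluster until the next apply.
```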
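To illustrate the interaction described above, here is a hypothetical sketch of how the two timeouts might coexist (the top-level `graceful_decommission_timeout` attribute is the proposal in this issue; the autoscaling policy resource and its fields follow the provider's `google_dataproc_autoscaling_policy` schema, and all names and values are illustrative):

```hcl
resource "google_dataproc_autoscaling_policy" "asp" {
  policy_id = "my-policy"
  location  = "us-central1"

  basic_algorithm {
    yarn_config {
      # Applies when the Dataproc service itself autoscales the cluster.
      graceful_decommission_timeout = "30m"
      scale_up_factor               = 0.5
      scale_down_factor             = 0.5
    }
  }
}

resource "google_dataproc_cluster" "simplecluster" {
  name   = "simplecluster"
  region = "us-central1"

  # Proposed: applies only to scale-downs issued by Terraform,
  # e.g. when the worker count is lowered in this configuration.
  graceful_decommission_timeout = "90m"

  cluster_config {
    autoscaling_config {
      policy_uri = google_dataproc_autoscaling_policy.asp.id
    }
  }
}
```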
New or Affected Resource(s)
google_dataproc_cluster
Potential Terraform Configuration
```hcl
# Propose what you think the configuration to take advantage of this feature
# should look like. We may not use it verbatim, but it's helpful in
# understanding your intent.
resource "google_dataproc_cluster" "simplecluster" {
  ...
  # This should be used whenever issuing requests to scale down the cluster
  # defined in this resource.
  graceful_decommission_timeout = "90m"
  ...
}
```
References
#0000