Make it harder to destroy resources #24658
Comments
An additional option here for letting the providers handle it: resources could specify their destroy behavior as part of their schema (i.e., default to preventing or allowing destroys), basically controlling a default for the lifecycle flag, while preserving the existing lifecycle syntax for overriding it.
I like @danawillow's CLI flag idea. The default behavior should never destroy anything unless explicitly stated.
(I'm sorry for not filling out the template exactly, but my thoughts didn't really follow that format).
Destroying resources is too easy
tl;dr: Terraform makes it easy to destroy infrastructure and hard to ensure it sticks around. For production systems, the reverse would be preferable.
Background
Each field in a resource has two options to describe what to do if it changes:
- ForceNew: false if it can be updated in place (usually by calling a PUT or PATCH method on the resource API)
- ForceNew: true if it cannot be updated in place
If the field cannot be updated in place (it is marked ForceNew), any change to it will destroy and recreate the resource.
Resources are also destroyed if they are removed from the config but still present in the state file.
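To make the rules above concrete, here is a minimal sketch of how they combine into a planned action. The `Field` type and `PlanAction` function are hypothetical stand-ins written for this illustration, not the provider SDK's real types; only the `ForceNew` flag mirrors something named in this issue.

```go
package main

import "fmt"

// Field is a hypothetical, simplified stand-in for a provider schema field.
// Only the ForceNew flag described above is modelled.
type Field struct {
	Name     string
	ForceNew bool // true: the field cannot be updated in place
}

// Action is what a plan would do to a single resource.
type Action string

const (
	NoOp    Action = "no-op"
	Update  Action = "update in place"
	Replace Action = "destroy and recreate"
	Destroy Action = "destroy"
)

// PlanAction derives the action from the set of changed fields.
// inConfig=false models a resource that was removed from the config
// but is still present in the state file.
func PlanAction(fields []Field, changed map[string]bool, inConfig bool) Action {
	if !inConfig {
		return Destroy
	}
	action := NoOp
	for _, f := range fields {
		if !changed[f.Name] {
			continue
		}
		if f.ForceNew {
			// A single changed ForceNew field is enough to replace the resource.
			return Replace
		}
		action = Update
	}
	return action
}

func main() {
	fields := []Field{
		{Name: "labels", ForceNew: false},
		{Name: "region", ForceNew: true},
	}
	fmt.Println(PlanAction(fields, map[string]bool{"labels": true}, true))
	fmt.Println(PlanAction(fields, map[string]bool{"labels": true, "region": true}, true))
	fmt.Println(PlanAction(fields, nil, false))
}
```

Note that neither destructive outcome requires the user to say "destroy" anywhere: one is triggered by editing a field, the other by deleting a block.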
Problems
This can fail users in several ways:
This behavior becomes devastating in cases where Terraform is being run in a loop for continuous actuation. Although the official recommendation is to never run Terraform without looking at a plan, there are many reasons why someone might want an auto-approve loop, with plan only being run when a user actually makes a change to their config. For example, users may want to ensure they're detecting and managing drift quickly, and waiting for the next terraform plan to run wouldn't be fast enough.
All of these situations can be avoided when people use Terraform perfectly, making sure they look carefully at all plans before they're applied, and that they understand the correct order to perform non-plan/apply actions (like import). Regardless of the exact reasons why people end up in these situations, though, we've heard from our customers that they'd like more safeguards around accidentally destroying infrastructure, especially pieces that should be long-lived, like databases.
Potential Solutions
The following are a few ideas I have (ones I like and ones I don't), mainly put out here for discussion and consolidation.
Encourage lifecycle.prevent_destroy on everything
This is basically where we're at right now, but it's not great. It requires users to know about the feature at all, and to put it on every single resource (since there's no way to specify it to default to true globally). It has several flaws in its current implementation that make it hard and/or dangerous to use (#16392, #3874, #17599).
Let providers handle it
There's a few options for this:
- Remove ForceNew: true from fields that can't be updated, and add CustomizeDiff behavior to the resource to fail if there is a diff on those fields. This requires no changes to Terraform core/SDK, but requires there to be a special CustomizeDiff on basically every resource in the provider. It also means that different providers will behave inconsistently, which isn't a great UX.
Deprecate lifecycle.prevent_destroy, make that the default behavior, and add lifecycle.allow_recreate
This would make deleting resources something that can only happen by removing them from the config or running terraform destroy, or by opting into an allow_recreate. It would be a breaking change, so it would have to be rolled out carefully, but it adds safeguards.
This makes it much more explicit that removing a resource from your config will destroy it. It doesn't prevent the human error case of running terraform apply when the resource isn't in the config but is in state, but it at least prevents unintentional destroy/recreates.
This also could be extended to have several different destroy options: in addition to allow_recreate, users could also have abandon (destroy removes from state but does not call Destroy()) or a stricter prevent_all_destroy (which wouldn't allow a destroy/create, a removal from state, or a terraform destroy to succeed).
Add the option to specify default destroy behavior in a terraform{ } or provider{ } block
This lets users specify what they want their default to be across all resources in their configs, so it doesn't need to be specified on each one. This could be combined with the previous solution.
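As a rough sketch of how such defaults might resolve, the snippet below picks the most specific setting that is present. Everything here is hypothetical: the `DestroyPolicy` type and `Resolve` function are invented for illustration, and the precedence order (resource lifecycle, then provider block, then terraform block) is an assumption, since this proposal does not specify one. The option names are the ones floated in this issue.

```go
package main

import "fmt"

// DestroyPolicy enumerates the destroy options floated in this proposal.
// None of these except prevent_destroy exist in Terraform today.
type DestroyPolicy string

const (
	Unset             DestroyPolicy = ""
	PreventDestroy    DestroyPolicy = "prevent_destroy"
	AllowRecreate     DestroyPolicy = "allow_recreate"
	Abandon           DestroyPolicy = "abandon"
	PreventAllDestroy DestroyPolicy = "prevent_all_destroy"
)

// Resolve returns the first policy that is set, checking the most specific
// level first. The ordering is an assumption made for this sketch.
func Resolve(resource, provider, global DestroyPolicy) DestroyPolicy {
	for _, p := range []DestroyPolicy{resource, provider, global} {
		if p != Unset {
			return p
		}
	}
	// Under the proposal, preventing destroy would be the built-in default.
	return PreventDestroy
}

func main() {
	// A resource-level lifecycle setting overrides a provider-wide default.
	fmt.Println(Resolve(AllowRecreate, PreventAllDestroy, Unset))
	// With nothing on the resource, the provider-wide default applies.
	fmt.Println(Resolve(Unset, Abandon, AllowRecreate))
	// With nothing set anywhere, the built-in default applies.
	fmt.Println(Resolve(Unset, Unset, Unset))
}
```

Keeping the per-resource setting as the highest-priority level preserves the existing lifecycle syntax as the override mechanism.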
Add a command-line flag --allow-destroy
This would make plans that include destroying a resource fail if the flag is not set. This is similar to making prevent_destroy the default, but destroys would be allowed per run of Terraform rather than as a property of the resource. This makes it much more explicit what the behavior is when removing a resource from config, and you don't have to worry about comparing with a previous version of state. Users would have to adjust any CI systems to adapt to the new flag, but other than that wouldn't have to make any config changes. However, it may be tedious for users who need to frequently, but not always, replace resources that cannot be updated in place.
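For discussion, here is a sketch of what such a check could look like if done today in a wrapper around Terraform, working from the JSON plan that `terraform show -json <planfile>` prints. The `resource_changes`, `address`, and `change.actions` fields come from that JSON output; `CheckPlan`, `DestroyedAddresses`, and the `allowDestroy` parameter are hypothetical names for this illustration, and the resource addresses in the sample are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// plan models only the parts of the JSON plan output that this check needs.
type plan struct {
	ResourceChanges []struct {
		Address string `json:"address"`
		Change  struct {
			Actions []string `json:"actions"`
		} `json:"change"`
	} `json:"resource_changes"`
}

// DestroyedAddresses returns the addresses of resources the plan would
// delete, including those deleted as part of a replace.
func DestroyedAddresses(planJSON []byte) ([]string, error) {
	var p plan
	if err := json.Unmarshal(planJSON, &p); err != nil {
		return nil, err
	}
	var out []string
	for _, rc := range p.ResourceChanges {
		for _, a := range rc.Change.Actions {
			if a == "delete" {
				out = append(out, rc.Address)
				break
			}
		}
	}
	return out, nil
}

// CheckPlan fails when the plan deletes anything and allowDestroy is false.
// allowDestroy stands in for the proposed --allow-destroy flag.
func CheckPlan(planJSON []byte, allowDestroy bool) error {
	addrs, err := DestroyedAddresses(planJSON)
	if err != nil {
		return err
	}
	if len(addrs) > 0 && !allowDestroy {
		return fmt.Errorf("plan would destroy %d resource(s): %v", len(addrs), addrs)
	}
	return nil
}

func main() {
	// Made-up sample: one replacement and one in-place update.
	sample := []byte(`{"resource_changes":[
		{"address":"example_database.main","change":{"actions":["delete","create"]}},
		{"address":"example_bucket.logs","change":{"actions":["update"]}}
	]}`)
	fmt.Println(CheckPlan(sample, false))
	fmt.Println(CheckPlan(sample, true))
}
```

A CI loop could run a check like this before an auto-approved apply, so that only runs started with the explicit opt-in are able to delete anything.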