v3.87.0 introduced a delete issue with the google_compute_resource_policy resource #10385

suckowbiz · 2021-10-25T11:24:00Z

Community Note

Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
If you are interested in working on this issue or have submitted a pull request, please leave a comment.
If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.

Terraform Version

# terraform -v
Terraform v1.0.9
on linux_amd64

Affected Resource(s)

google_compute_resource_policy

Terraform Configuration Files

# versions.tf

terraform {
  required_version = ">= 1.0.1, < 2.0.0"
  required_providers {
    google = {
      source = "hashicorp/google"
      version = "3.87.0"
    }
    google-beta = {
      source  = "hashicorp/google-beta"
      version = "3.87.0"
    }
    local = {
      source  = "hashicorp/local"
      version = "2.1.0"
    }
    null = {
      source  = "hashicorp/null"
      version = "3.1.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "3.1.0"
    }
    external = {
      source  = "hashicorp/external"
      version = "2.1.0"
    }
  }
}

# main.tf
resource "google_compute_instance" "<removed>" {
        id                        = "<removed>"
        name                      = "test-vm"
      ~ resource_policies         = [
          - "https://www.googleapis.com/compute/beta/projects/<removed>/resource-policy01",
        ]
        tags                      = []
        # (18 unchanged attributes hidden)
        # (4 unchanged blocks hidden)
    }

# module.generator.google_compute_resource_policy.<removed> will be destroyed
  - resource "google_compute_resource_policy" "<removed>" {
      - description = "terraform managed" -> null
      - id          = "projects/<removed>/resource-policy01" -> null
      - name        = "resource-policy01" -> null
      - project     = "<removed>" -> null
      - region      = "https://www.googleapis.com/compute/v1/projects/<removed>/regions/europe-west3" -> null
      - self_link   = "https://www.googleapis.com/compute/v1/projects/<removed>/regions/europe-west3/resourcePolicies/resource-policy01" -> null
      - instance_schedule_policy {
          - time_zone = "Europe/Berlin" -> null
          - vm_start_schedule {
              - schedule = "0 8 * * *" -> null
            }
          - vm_stop_schedule {
              - schedule = "0 18 * * *" -> null
            }
        }
    }

Debug Output

https://gist.github.com/suckowbiz/40258a28d25ab3ee681361814116f158

Expected Behavior

It should remain possible to deconfigure a resource policy from a compute instance AND delete that resource policy in the same terraform apply step.

Actual Behavior

Terraform fails with:

 Error: Error when reading or editing ResourcePolicy: googleapi: Error 400: The resource_policy resource 'projects/GGG-vm02jae2fteumia1ep9radz10/regions/europe-west3/resourcePolicies/resource-policy01' is already being used by 'projects/GGG-vm02jae2fteumia1ep9radz10/zones/europe-west3-a/instances/test-vm', resourceInUseByAnotherResource

Steps to Reproduce

Attach a resource policy to a compute instance
Remove the resource policy from the compute instance AND remove the resource policy definition in the same step

Important Factoids

I have tested with Google Povider v3.87.0, v3.88.0 and v3.89.0. The issue remains!
I rolled back to 3.86.0. It works as expected without any issue.

References

The release of 3.87.0 (https://github.com/hashicorp/terraform-provider-google/releases/tag/v3.87.0) indeed introduced changes with the resource_policy here Don't recreate GCE instances when updating resource_policies property #10173

b/308756190

The text was updated successfully, but these errors were encountered:

edwardmedia · 2021-10-25T21:50:00Z

@suckowbiz can you try adding lifecycle.create_before_destroy in your instance?

suckowbiz · 2021-10-26T05:49:10Z

@suckowbiz can you try adding lifecycle.create_before_destroy in your instance?

Thanks @edwardmedia. This is a good suggestion. I will try that.

IMHO:
As I verified that handling of the replacement object was appropriately done before. Now since 3.87.0 the proper behavior has to be restored by adding lifecycle configuration? This feels wrong.

Shouldn't it be the case that the provider takes care of replacement object handling in an appropriate way without the need of additional configuration?

suckowbiz · 2021-10-26T11:21:47Z

@edwardmedia

I tested with the suggested lifecycle rule. Unfortunately adding:

  lifecycle {
    create_before_destroy = true
  }

to google_compute_instance leads to the situation, that when a change to the instance is planned that forces a replacement of the instance (e.g. changing hostname), then this fails with:

 Error: Error creating instance: googleapi: Error 409: The resource 'projects/<removed>/zones/europe-west3-a/instances/test-vm' already exists, alreadyExists

So this is not a generic workaround.

edwardmedia · 2021-10-26T19:22:14Z

@suckowbiz the fix seems appropriate. We don't want to recreate the instance if its policy is updated. But that change appears to cause reorder of the execution sequence ( in your case among two resources ) which is controlled by Terraform core. Adding lifecycle is the method Terraform provides to workaround the issues like this.
Two things you might consider

use the lifecycle. I tested the config and can't repro the error you encountered Would you please share your full debug log so I can take a closer look?
do you have to deconfigure a resource policy from a compute instance AND delete that resource policy in the same terraform apply step? Is that possible you can separate them to two apply?

suckowbiz · 2021-10-28T10:49:00Z

@edwardmedia

Two things you might consider

use the lifecycle. I tested the config and can't repro the error you encountered Would you please share your full debug log so I can take a closer look?

Please find the debug output here:
https://gist.github.com/suckowbiz/f8a8a36c03275f885124401eb8180858

The issue (with the lifecycle workaround in place) occurred when changing an attribute of the vm that enforces re-creation of the vm (hostname). With having the above mentioned lifecycle rule in place this fails. A solution could be to add the lifecycle rule conditionally but we do not want to dive into the complexity of when to add it and when not.

do you have to deconfigure a resource policy from a compute instance AND delete that resource policy in the same terraform apply step?

Yes, exactly.

Is that possible you can separate them to two apply?

This would be possible but removes all the magic of Terraform. How can we explain the DevOps colleagues that Terraform cannot handle the calculated plan and they must apply their changes slice-wise.

edwardmedia · 2021-10-28T19:28:19Z

@slevenick What do you think about this issue?

slevenick · 2021-10-29T17:55:20Z

Unfortunately I don't think we have any options here. From my understanding this used to work prior to 3.87.0 because the force replacement allowed the dependency to be removed before the resource policy would be deleted. The new behavior is more correct, but results in this common issue.

Similar issues have been run into here and here

The upstream issue in Terraform core is still open, and we don't have a lot of options within the provider to handle this.

You may be able to fix the create_before_destroy issue you're seeing by adding an element of randomness to the instance's name so that you can successfully create the new instance before destroying the old one. You can do this by using the random provider to add a random suffix to the name of the instance within your Terraform config.

To reproduce the previous behavior you may be able to work around this by forcing the instance to recreate when there are changes to the resource policy. I could imagine handling this by using the resource policy's name in a field on the instance that forces recreate when it is changed but that isn't important to you. For example the metadata_startup_script. This adds some complexity, but may allow you to reproduce the previous behavior

suckowbiz · 2021-11-08T07:09:52Z

Hi @edwardmedia,

are there plans now to fix the create_before_destroy issue, @slevenick described?

slevenick · 2021-11-08T16:56:21Z

We cannot fix the create_before_destroy issue within the provider. A fix would need to come from Terraform core, but we don't have any indication on when that would happen. I'll mark this as upstream to indicate we cannot fix this within the provider

…ult) (hashicorp#10385) [upstream:269a04f051a4ad1b9ec610e81a8996dd14e0cfaa] Signed-off-by: Modular Magician <[email protected]>

…ult) (#10385) (#17793) [upstream:269a04f051a4ad1b9ec610e81a8996dd14e0cfaa] Signed-off-by: Modular Magician <[email protected]>

suckowbiz added the bug label Oct 25, 2021

suckowbiz changed the title ~~v3.87.0 introduced an delete issue with the resource_policy resource~~ v3.87.0 introduced an delete issue with the google_compute_resource_policy resource Oct 25, 2021

suckowbiz changed the title ~~v3.87.0 introduced an delete issue with the google_compute_resource_policy resource~~ v3.87.0 introduced a delete issue with the google_compute_resource_policy resource Oct 25, 2021

edwardmedia self-assigned this Oct 25, 2021

edwardmedia added the waiting-response label Oct 25, 2021

github-actions bot removed the waiting-response label Oct 26, 2021

edwardmedia added the waiting-response label Oct 26, 2021

github-actions bot removed the waiting-response label Oct 26, 2021

edwardmedia added the waiting-response label Oct 26, 2021

github-actions bot removed the waiting-response label Oct 28, 2021

edwardmedia assigned slevenick and unassigned edwardmedia Oct 28, 2021

slevenick added the waiting-response label Oct 29, 2021

edwardmedia mentioned this issue Nov 1, 2021

Deleting instance template while updating template of an Instance group manager in the same apply #10365

Open

github-actions bot removed the waiting-response label Nov 8, 2021

slevenick added the upstream-terraform label Nov 8, 2021

slevenick removed their assignment Nov 8, 2021

github-actions bot added forward/review In review; remove label to forward service/compute-instances labels Oct 25, 2023

edwardmedia removed the forward/review In review; remove label to forward label Oct 26, 2023

modular-magician added the forward/linked label Nov 1, 2023

modular-magician added a commit that referenced this issue Apr 8, 2024

Increased google_compute_security_policy timeouts to 20 minutes (defa…

64147a7

…ult) (#10385) (#17793) [upstream:269a04f051a4ad1b9ec610e81a8996dd14e0cfaa] Signed-off-by: Modular Magician <[email protected]>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v3.87.0 introduced a delete issue with the google_compute_resource_policy resource #10385

v3.87.0 introduced a delete issue with the google_compute_resource_policy resource #10385

suckowbiz commented Oct 25, 2021 •

edited by modular-magician

Loading

edwardmedia commented Oct 25, 2021

suckowbiz commented Oct 26, 2021

suckowbiz commented Oct 26, 2021 •

edited

Loading

edwardmedia commented Oct 26, 2021

suckowbiz commented Oct 28, 2021 •

edited

Loading

edwardmedia commented Oct 28, 2021 •

edited

Loading

slevenick commented Oct 29, 2021

suckowbiz commented Nov 8, 2021

slevenick commented Nov 8, 2021

v3.87.0 introduced a delete issue with the google_compute_resource_policy resource #10385

v3.87.0 introduced a delete issue with the google_compute_resource_policy resource #10385

Comments

suckowbiz commented Oct 25, 2021 • edited by modular-magician Loading

Community Note

Terraform Version

Affected Resource(s)

Terraform Configuration Files

Debug Output

Expected Behavior

Actual Behavior

Steps to Reproduce

Important Factoids

References

edwardmedia commented Oct 25, 2021

suckowbiz commented Oct 26, 2021

suckowbiz commented Oct 26, 2021 • edited Loading

edwardmedia commented Oct 26, 2021

suckowbiz commented Oct 28, 2021 • edited Loading

edwardmedia commented Oct 28, 2021 • edited Loading

slevenick commented Oct 29, 2021

suckowbiz commented Nov 8, 2021

slevenick commented Nov 8, 2021

suckowbiz commented Oct 25, 2021 •

edited by modular-magician

Loading

suckowbiz commented Oct 26, 2021 •

edited

Loading

suckowbiz commented Oct 28, 2021 •

edited

Loading

edwardmedia commented Oct 28, 2021 •

edited

Loading