google_app_engine_flexible_app_version refresh does not update container image #9851

Closed
Deiz opened this issue Aug 18, 2021 · 8 comments · Fixed by GoogleCloudPlatform/magic-modules#5187, hashicorp/terraform-provider-google-beta#3613 or #10058

@Deiz

Deiz commented Aug 18, 2021

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
  • Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment.
  • If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.

Terraform Version

Terraform v1.0.4

  • provider registry.terraform.io/hashicorp/google v3.80.0
  • provider registry.terraform.io/hashicorp/google-beta v3.80.0

Affected Resource(s)

  • google_app_engine_flexible_app_version

Terraform Configuration Files

Partial snippet:

resource "google_app_engine_flexible_app_version" "v1" {
  version_id = "v1"
  service    = "default"
  runtime    = "custom"

  deployment {
    container {
      # Latest available image digest, only used at creation time
      image = data.external.appengine_image.result.image
    }
  }

  lifecycle {
    ignore_changes = [
      # Ignored as the service's version is managed elsewhere
      deployment[0].container[0].image
    ]
  }
}

Expected Behavior

During state refresh, the container image should be pulled from the App Engine API.

We use Terraform to create our App Engine Flexible versions, and then use tooling built on top of the App Engine API to update the container image thereafter. Terraform is responsible for definitions (compute resources, environment variables, etc.), our deployment system is responsible for updating the container image.

To that end, we include deployment[0].container[0].image in a lifecycle.ignore_changes block.

Actual Behavior

State refresh does not update deployment[0].container[0].image.

Instead, whenever Terraform detects that an in-place update is needed, it rolls the service's image back to the value in the state, which in this case is a 5-month-old image.

Steps to Reproduce

  1. Modify the resource to cause an in-place update, e.g. adding an environment variable X = "y"
  2. terraform apply
  # module.<snip>.google_app_engine_flexible_app_version.v1 will be updated in-place
  ~ resource "google_app_engine_flexible_app_version" "v1" {
      ~ env_variables             = {
          + "X"                                   = "y"
            # (15 unchanged elements hidden)
        }
        id                        = "apps/<snip>/services/default/versions/v1"
        name                      = "apps/<snip>/services/default/versions/v1"
        # (9 unchanged attributes hidden)

        # (5 unchanged blocks hidden)
    }
  3. Grab current image, pre-apply (image is from August 2021):
gcloud app versions describe v1 --service=default | grep ' \+image: gcr'
    image: gcr.io/<snip>@sha256:a2a15888f11229f2bd1997139bd085b67dd0fb5d3dc6ae8de3cfc55d4aefb746
  4. Apply plan
  5. Grab current image, post-apply (image reverted to something from April 2021):
gcloud app versions describe v1 --service=default | grep ' \+image: gcr'
    image: gcr.io/<snip>@sha256:bca190e779c8fcc4cd4d89920bd1b5fa51e47021f95dd9d433c48e7777172d26
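
The pre-apply and post-apply checks above can be folded into a small drift check that compares the image recorded in Terraform state with the image App Engine reports. This is only a sketch under stated assumptions: the helper names are made up for illustration, the resource address is assumed to be in the root module (in the report it sits inside a module whose name is snipped), and jq is assumed to be installed.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the resource address are illustrative assumptions.

# Image that App Engine is actually serving for the given service/version.
live_image() {
  local service="$1" version="$2"
  gcloud app versions describe "$version" --service="$service" \
    --format='value(deployment.container.image)'
}

# Image recorded in Terraform state (assumes a root-module resource and jq).
state_image() {
  local address="$1"
  terraform show -json | jq -r --arg addr "$address" '
    .values.root_module.resources[]
    | select(.address == $addr)
    | .values.deployment[0].container[0].image'
}

# Pure comparison, so it can be exercised without cloud credentials.
report_drift() {
  local state="$1" live="$2"
  if [ "$state" = "$live" ]; then
    echo "in sync: $live"
  else
    echo "drift: state=$state live=$live"
    return 1
  fi
}
```

A possible invocation is report_drift "$(state_image google_app_engine_flexible_app_version.v1)" "$(live_image default v1)". With the behaviour described in this issue, it would be expected to report drift even immediately after a refresh.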
@Deiz Deiz added the bug label Aug 18, 2021
@edwardmedia edwardmedia self-assigned this Aug 18, 2021
@edwardmedia
Contributor

@Deiz help me understand. On one hand, you set ignore_changes on deployment[0].container[0].image. On the other hand, you want changes to deployment[0].container[0].image to be detected and updated? Forgive me if my understanding is incorrect.

@Deiz
Author

Deiz commented Aug 19, 2021

Ah, to clarify, what I'm expecting is that while Terraform doesn't care about the value of deployment[0].container[0].image, it will persist the value that was set externally. That's true of most resource types (our organization largely uses Cloud Run, and this is certainly true of the google_cloud_run_service resource).

So what I'd expect out of this google_app_engine_flexible_app_version resource configuration is:

  1. The value of data.external.appengine_image.result.image controls deployment[0].container[0].image at creation time, for illustration's sake let's say it's A
  2. External automation is free to update the container image to something else, e.g. B
  3. When the resource next needs to change, Terraform still needs to update the resource, but I would expect it to use B, because that should be pulled into the state via refresh.

My expectation is that the ignore_changes feature is what tells Terraform to ignore the difference between the value of data.external.appengine_image.result.image and what's in the state, which seems to be consistent with the documentation and other resource types.

There's actually another hint at the bug here which I just noticed, and it looks like this:

  1. Create google_app_engine_flexible_app_version with container image A
  2. Update the App Engine service to image B via the App Engine Admin API
  3. Update the Terraform resource definition to point to container image C
  4. Remove the ignore_changes block and re-run terraform plan - Terraform will now claim that it's going to update from A to C even though the actual service is verifiably running image B - indicating that the state is not correctly updating on refresh.

@edwardmedia
Contributor

edwardmedia commented Aug 19, 2021

@Deiz I'm still trying to understand what the issue is, but let's discuss the steps below first. Terraform should be able to detect the difference between the state (A or B) and your config (C), then propose and apply the change to reach what you asked for (C). This is the expected behavior. I'm not sure what is wrong in your case.

3. Update the Terraform resource definition to point to container image C
4. Remove the ignore_changes block and re-run terraform plan - Terraform will now claim that it's going to update from A to C even though the actual service is verifiably running image B - indicating that the state is not correctly updating on refresh.

@Deiz
Author

Deiz commented Aug 19, 2021

The issue is that Terraform will display a delta like A -> C when the real-world transition is B -> C. To me, that's the root cause of the ignore_changes misbehaviour - Terraform/TPG doesn't notice that the resource has drifted.

My expectations vs the state:

  1. Resource is created by Terraform with image A
  2. Resource is manually updated to image B outside of Terraform
  3. Edit the Terraform definition to use image C
  4. terraform apply is invoked
  5. In order to generate the plan, Terraform refreshes the resource and updates the state from A to B (this doesn't happen correctly)
  6. Terraform should render a plan of B -> C (also doesn't happen, the displayed plan is A -> C because TPG is failing to notice that the resource's image has changed on GCP)

Also, I think the ignore_changes piece is a red herring. I just reproduced this with a simpler example that hopefully comes across more clearly. Note that there is no ignore_changes involved:

  1. Resource is created by Terraform with image A
  2. Resource is manually updated to image B outside of Terraform
  3. terraform apply is invoked and outputs "No changes. Your infrastructure matches the configuration." This is overtly wrong: the gcloud app versions describe v1 invocation confirms that image B is active.

In the second example, if TPG were refreshing correctly, it would output a plan to revert B -> A but instead reports no changes.
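
The simpler reproduction can also be checked mechanically with terraform plan -detailed-exitcode, which exits with 0 when the plan is empty, 1 on error, and 2 when changes are pending. A minimal sketch follows; the helper name is hypothetical.

```shell
# Sketch only: describe_plan_exit is a hypothetical helper name.
# Maps the exit status of `terraform plan -detailed-exitcode` to a message.
describe_plan_exit() {
  case "$1" in
    0) echo "no changes planned" ;;
    1) echo "plan failed" ;;
    2) echo "changes planned" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}
```

One way to use it is to run terraform plan -detailed-exitcode >/dev/null and then describe_plan_exit "$?". After switching the live version to image B out of band, a provider that refreshes correctly should yield exit code 2 (a plan to revert B -> A); per this report, the affected versions yield 0.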

@edwardmedia
Contributor

edwardmedia commented Aug 19, 2021

@Deiz showing A -> C in the plan is controlled by Terraform core. Do you see B -> C behavior in other resources? @c2thorn what do you think about this issue?

@edwardmedia edwardmedia assigned c2thorn and unassigned edwardmedia Aug 19, 2021
@Deiz
Author

Deiz commented Aug 19, 2021

Since a refresh aligns the state with reality before the changes are listed, I've only seen this particular behaviour with this resource. All others overwrite the state with the real-world status of the resource before attempting to create a change set.

@c2thorn c2thorn assigned megan07 and unassigned c2thorn Aug 23, 2021
@megan07
Contributor

megan07 commented Aug 25, 2021

I'm actively working on this, but want to put a note in here in case someone else picks it up.
The reason this is happening is that we set ignore_read on deployment. It seems to me we do that because the API returns an empty object for deployment in some cases. I'm seeing this in the tests when we're using files to deploy. Then on update, despite the fact that we're using files, the API responds with a container.

I'm in the middle of a fix that takes the ignore_read off, sets deployment.container to Optional/Computed, meaning it will default to whatever the API returns, and then (this is WIP) suppress the diff to change it from deployment.container to deployment.files. In order to do this I need to better understand in which circumstances deployment.container comes back populated and figure out what that diff suppress function needs to look like.
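
To connect this root cause to the symptoms reported earlier in the thread, here is a toy model of refresh and plan. It is not the provider's code: it assumes ignore_read means the field is never read back from the API during refresh, and the function names and the A/B/C image labels are purely illustrative.

```shell
# Toy model only: this is not the provider's code.
# refresh STATE REMOTE IGNORE_READ: prints the image held in state after a refresh.
refresh() {
  local state="$1" remote="$2" ignore_read="$3"
  if [ "$ignore_read" = "true" ]; then
    echo "$state"    # field is never read back, so the old value survives
  else
    echo "$remote"   # field is read from the API and overwrites state
  fi
}

# plan STATE CONFIG: prints the diff that would be displayed.
plan() {
  local state="$1" config="$2"
  if [ "$state" = "$config" ]; then
    echo "No changes"
  else
    echo "$state -> $config"
  fi
}

plan "$(refresh A B true)" C    # prints "A -> C" (the reported behaviour)
plan "$(refresh A B false)" C   # prints "B -> C" (the expected behaviour)
plan "$(refresh A B true)" A    # prints "No changes" although B is live
```

In this model, reading the field back from the API is what lets the state move from A to B, which is the effect the Optional/Computed change described above aims for.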

@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 13, 2021