
Ignore role_entity order in storage_bucket_acl #1525

Closed
tmshn wants to merge 2 commits

Conversation

@tmshn (Contributor) commented May 23, 2018

When you set role_entity in storage_bucket_acl, it will be sorted in the following order: project-owners, project-editors, project-viewers, and then the rest.

Therefore, a diff persists after apply if you don't set role_entity in the right order.

resource "google_storage_bucket_acl" "foo" {
  bucket = "${google_storage_bucket.foo.name}"

  role_entity = [
    "OWNER:user-${google_service_account.bar.email}",
    "READER:project-viewers-${data.google_project.this.number}",
    "OWNER:project-editors-${data.google_project.this.number}",
    "OWNER:user-${google_service_account.baz.email}",
    "OWNER:project-owners-${data.google_project.this.number}",
  ]
}

With this configuration, the following diff appears every time, even after apply.

  ~ google_storage_bucket_acl.foo
      role_entity.0:  "OWNER:[email protected]" => "OWNER:project-owners-0123456789"
      role_entity.1:  "READER:project-viewers-0123456789" => "OWNER:project-editors-0123456789"
      role_entity.2:  "OWNER:project-editors-0123456789" => "READER:project-viewers-0123456789"
      role_entity.3:  "OWNER:[email protected]" => "OWNER:[email protected]"
      role_entity.4:  "OWNER:project-owners-0123456789" => "OWNER:[email protected]"

However, the order doesn't matter as far as access control is concerned.

To suppress this annoying diff, I changed the type of the role_entity argument from schema.TypeList to schema.TypeSet.
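
A minimal sketch of that change, assuming the helper/schema API of the time (illustrative only, not the exact diff in this PR):

"role_entity": &schema.Schema{
    Type:     schema.TypeSet, // was schema.TypeList
    Optional: true,
    Elem:     &schema.Schema{Type: schema.TypeString},
    // Each element is keyed by a hash of its string value, so ordering is ignored.
    Set:      schema.HashString,
},

With a set, Terraform's diff no longer depends on element positions, which is what suppresses the reordering diff shown above.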

Note: issue #892 looks related, but may be a different problem.

@danawillow added the v2 label May 25, 2018
@danawillow (Contributor)

I'm going to mark this as v2 for now, since technically this is a breaking change for anybody that might have been relying on indexes of this field. Feel free to tell me that there's no possible way anybody was doing that though; if so I'll take another look.

@tmshn (Contributor, Author) commented May 25, 2018

I'm not sure about this, but I think I didn't change how role_entity is set in the state.

https://github.com/tmshn/terraform-provider-google/blob/gcsacl-entities-unordered/google/resource_storage_bucket_acl.go#L179-L184

So the order of google_storage_bucket_acl.foo.role_entity seems preserved for users.

What do you think?

@danawillow (Contributor)

Changing it from a list to a set changes how it's set in the state: instead of being saved as role_entity.0, role_entity.1, etc., it'll use a hash of each element as the index (so something more like role_entity.3289345793). Does that make sense?
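
For illustration, the state keys would change roughly like this (the hash values here are hypothetical, not taken from a real state file):

# as schema.TypeList: position-based keys
role_entity.0 = "OWNER:project-owners-0123456789"
role_entity.1 = "OWNER:project-editors-0123456789"

# as schema.TypeSet: hash-based keys
role_entity.3289345793 = "OWNER:project-owners-0123456789"
role_entity.1640853295 = "OWNER:project-editors-0123456789"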

@tmshn (Contributor, Author) commented May 25, 2018

Oh I got it. I found the code and understand; https://github.com/hashicorp/terraform/blob/v0.11.7/helper/schema/field_writer_map.go#L283

Thanks for the clear explanation 👍

@tmshn (Contributor, Author) commented May 25, 2018

Well then, would it be impossible to do something like this?

resource "some_resource" "foo" {
  count   = "${length(google_storage_bucket_acl.foo.role_entity)}"
  somearg = "${element(google_storage_bucket_acl.foo.role_entity, count.index)}"
}

@danawillow (Contributor)

I think that would work, but I need to check with @paddycarver. Paddy, would something like that start not working with a change to sets, or is it only if people are explicitly using the index (like field = "${google_storage_bucket_acl.foo.role_entity.0}")?

@rosbo requested review from paddycarver, rosbo and danawillow and removed request for paddycarver and rosbo on May 28, 2018 at 21:57
@paddycarver (Contributor)

So I tried this out using the following config:

data "google_compute_instance_group" "test" {
  name = "test-instance-group"
  zone = "us-central1-a"
}

data "template_file" "init" {
  count    = "${length(data.google_compute_instance_group.test.instances)}"
  template = "$${instance}"

  vars {
    instance = "${element(data.google_compute_instance_group.test.instances, count.index)}"
  }
}

And after applying (with an instance group with 4 instances) sure enough, 4 versions of the template were in state, and they listed all the instances.

Changing the count.index to 1 also worked, showing the same instance in four different templates, though knowing which it would be is difficult. However, using ${data.google_compute_instance_group.test.instances.1} broke things, returning 4, weirdly.

Also, after asking the core team, it seems that element working here is not something that happened by design; it's likely the result of the set being converted to a list. That means the ordering is theoretically unstable, even though I was unable to reproduce that.

So "not working" depends on how you define it. I think it will give some output, and even arguably correct output in some cases, though incorrect output in some cases. The important question I don't know the answer to is "will it give the same answer it would have as a list?"

Another question to consider here is how likely people are to be referencing the role_entity field inside the resource, and whether that risk is worth us delaying to 2.0.0 for.

Personally, I'm a bit gun-shy after 1.13.0, so I'm hesitant to break things on people, but I could be convinced with an argument that there's really no use case here that we're breaking, meaning it's a breaking change only in technicality, but pragmatically it's not.

@morgante

IMO the benefit of fixing this (very annoying bug) outweighs the potential breaking change.

Indexing into the ACL is unlikely to be a common use case. At most, I can imagine people relying on the overall resource being provisioned; the actual role entities are only useful for GCS ACLs and are unlikely to be referenced elsewhere.

I'd vote for including this fix sooner rather than later.

@paddycarver (Contributor)

I agree that it's hard to envision situations that the fix would break in, but I think it's also a cosmetic fix, as the only real side-effect of the bug is extra API requests and it showing up in plan. It doesn't really affect the operation of your infrastructure. (It affects things that I, personally, manage with Terraform, so I can attest to it being both annoying and relatively harmless, compared to something that, say, falsely forces a destroy/create cycle on every apply.)

@morgante

> I agree that it's hard to envision situations that the fix would break in, but I think it's also a cosmetic fix, as the only real side-effect of the bug is extra API requests and it showing up in plan.

I disagree that introducing a permadiff is merely a cosmetic problem. A lot of my infrastructure (and some Google internal tools) are built around the idea of ensuring that Terraform doesn't see any drift in resources. As such, this permadiff makes it impossible to validate that a plan has been fully applied with no differences — instead our jobs just continuously run, valiantly attempting to converge the infrastructure.

Even though it's ultimately unlikely to break any infrastructure, I think a terraform plan which can never be resolved is a relatively big issue.

Also, for what it's worth, this does introduce some problems with attempting to update other properties of the bucket ACLs which seem to have an undocumented race condition. Since this resource attempts to update every time, I keep running into that race condition even though I'm not actually changing anything. (Still working to figure out if it's a bug in GCS or a bug in Terraform.)

paddycarver added a commit that referenced this pull request Jun 22, 2018
Add a CustomDiff function to storage bucket ACLs that will ignore a diff
if the config and state have the same role_entities, even if they're in
a different order.

Fixes #1525.
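
A rough sketch of that approach, assuming the helper/schema CustomizeDiff API of the time (the function name and details are illustrative, not the code that was actually merged):

func ignoreRoleEntityOrder(d *schema.ResourceDiff, meta interface{}) error {
    o, n := d.GetChange("role_entity")
    oldList, newList := o.([]interface{}), n.([]interface{})
    if len(oldList) != len(newList) {
        return nil // element count changed: a real diff, keep it
    }
    counts := make(map[string]int, len(oldList))
    for _, v := range oldList {
        counts[v.(string)]++
    }
    for _, v := range newList {
        counts[v.(string)]--
    }
    for _, c := range counts {
        if c != 0 {
            return nil // different elements: a real diff, keep it
        }
    }
    // Same role/entity pairs, only the order differs: suppress the diff.
    return d.Clear("role_entity")
}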
@paddycarver (Contributor)

> I disagree that introducing a permadiff is merely a cosmetic problem. A lot of my infrastructure (and some Google internal tools) are built around the idea of ensuring that Terraform doesn't see any drift in resources. As such, this permadiff makes it impossible to validate that a plan has been fully applied with no differences — instead our jobs just continuously run, valiantly attempting to converge the infrastructure.

I think that's fair, and a point I hadn't considered. Thanks for raising it, I'll try to keep it in mind in the future. I think expecting Terraform to converge on a stable state is a reasonable expectation.

> Even though it's ultimately unlikely to break any infrastructure, I think a terraform plan which can never be resolved is a relatively big issue.

I'd say I agree, though we probably disagree about what it's relative to. For example, relative to a release that causes an error in a user's previously valid config, or worse, a release that causes a previously valid config to silently behave in a different way, I'd say a permadiff is the lesser of those three evils. It's something that should be resolved, for sure, but I don't think it necessarily outweighs the other issues at play, especially in the aftermath of a particularly disruptive release.

> Also, for what it's worth, this does introduce some problems with attempting to update other properties of the bucket ACLs which seem to have an undocumented race condition. Since this resource attempts to update every time, I keep running into that race condition even though I'm not actually changing anything. (Still working to figure out if it's a bug in GCS or a bug in Terraform.)

That is bothersome! I think, personally, I'd prefer to resolve the race condition, regardless of what we do here, though I can appreciate this exacerbates the problem.

Fortunately, @ndmckinley suggested a way around this that avoided the breaking change and still resolved the permadiff situation today! I put together a PR for it and opened #1692, first to make this failure show up in our tests, and second to fix the problem.

@tmshn (Contributor, Author) commented Jun 22, 2018

Good discussion. I agree with both of you:

  • indexing ACL role_entities is unlikely to happen, so this seems practically not a breaking change
  • a permanent diff is a harmful bug that can't be ignored, but it's still relatively less severe

Anyway, @paddycarver opened a new solution in #1692, so we can close this or put it on hold until v2.

Is there any schedule for v2?

@paddycarver (Contributor) commented Jun 22, 2018

> indexing ACL role_entities is unlikely to happen, so this seems practically not a breaking change

I agree in theory, but we would also break anyone:

  • using count and element (potentially; there seems to be no guarantee the order will be stable then, resulting in a permadiff, which is exactly what we're trying to avoid), or using count with [#] or .# notation (as elements are now keyed off a hash, not a position)
  • using interpolation functions that operate on lists, potentially. How likely is it that someone's using concat to combine two groups of role_entities into one? I have no idea.

I think it's unlikely to break most configs, but I don't feel confident about how many configs fall into the minority that would be broken. I think it could be zero, but I also wouldn't be terribly surprised if it were non-zero.

> Is there any schedule for v2?

We don't have any date for v2 yet, and are trying to coordinate it with other things, which makes it hard to predict with any accuracy. We are definitely in the planning stages for it, however.

@ghost commented Nov 17, 2018

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost locked and limited conversation to collaborators Nov 17, 2018