google_project_iam_member/google_project_iam_binding Fails for roles/cloudsql.client, Works for Other #5107

Closed
jjorissen52 opened this issue Dec 6, 2019 · 46 comments
Assignees
Labels
bug persistent-bug Hard to diagnose or long lived bugs for which resolutions are more like feature work than bug work

Comments

@jjorissen52

Terraform Version

Terraform v0.12.10
+ provider.google v2.20.0
+ provider.local v1.4.0
+ provider.null v2.1.2

Affected Resource(s)

  • google_service_account
  • google_project_iam_member

Terraform Configuration Files

# resources fail, but the below gcloud command succeeds
gcloud projects add-iam-policy-binding booklawyer-dev-259701 \
    --member serviceAccount:[email protected] \
    --role roles/cloudsql.client
resource "google_service_account" "sql_client" {
    account_id   = "sql-client"
    display_name = "sql-client"
}

# fails
resource "google_project_iam_member" "sql_client" {
    role = "roles/cloudsql.client"

    member = "serviceAccount:${google_service_account.sql_client.email}"
    # these roles have to be created before they can be assigned
    depends_on = [google_service_account.sql_client]
}

# also fails
# resource "google_project_iam_binding" "sql_client" {
#     role = "roles/cloudsql.client"

#     members = ["serviceAccount:${google_service_account.sql_client.email}", ]
#     # these roles have to be created before they can be assigned 
#     depends_on = ["google_service_account.sql_client"]
# }


resource "google_storage_bucket" "etl_config" {
  bucket_policy_only = true
  force_destroy      = true
  name               = "etl-config-${var.project.number}"
  requester_pays     = false
  storage_class      = "STANDARD"
}

# succeeds
resource "google_storage_bucket_iam_member" "etl_config" {
  bucket = google_storage_bucket.etl_config.name
  role   = "roles/storage.objectAdmin"
  member = "serviceAccount:${google_service_account.sql_client.email}"
}

Debug Output

https://gist.github.com/jjorissen52/d253d274cdb763b47b55cbe3ee0f19e2

Expected Behavior

Binding should happen

Actual Behavior

Error: Batch "iam-project-booklawyer-dev-259701 modifyIamPolicy" for request "Create IAM Members roles/cloudsql.client serviceAccount:[email protected] for \"project \\\"booklawyer-dev-259701\\\"\"" returned error: Error applying IAM policy for project "booklawyer-dev-259701": Error setting IAM policy for project "booklawyer-dev-259701": googleapi: Error 400: Request contains an invalid argument., badRequest

  on ../etl/iam.tf line 20, in resource "google_project_iam_member" "sql_client":
  20: resource "google_project_iam_member" "sql_client" {

Steps to Reproduce

  1. terraform apply

Important Factoids

I have been able to use this exact resource setup to apply other roles to other service accounts.

References

Resolution here does not seem to work:

@ghost ghost added the bug label Dec 6, 2019
@morgante

morgante commented Dec 11, 2019

I've been noticing the same error across many different projects as of today:

For example, this config is causing this error:

Step #0 - "prepare": Error: Batch "iam-project-ci-gcloud-b081 modifyIamPolicy" for request "Create IAM Members roles/owner serviceAccount:[email protected] for \"project \\\"ci-gcloud-b081\\\"\"" returned error: Error applying IAM policy for project "ci-gcloud-b081": Error setting IAM policy for project "ci-gcloud-b081": googleapi: Error 400: Policy members must be of the form "<type>:<value>"., badRequest
Step #0 - "prepare": 
Step #0 - "prepare":   on iam.tf line 29, in resource "google_project_iam_member" "int_test":
Step #0 - "prepare":   29: resource "google_project_iam_member" "int_test" {
Step #0 - "prepare": 

The error is quite confusing, because serviceAccount:[email protected] looks valid as an IAM member to me.

I think the right fix is likely to filter out deleted principals when sending the IAM policy back.

@megan07 megan07 self-assigned this Dec 11, 2019
@morgante

morgante commented Dec 11, 2019

I've been doing a bit more investigation into this (tracked in #333). I've been able to consistently reproduce it on my project, here are the debug logs.

Looking at the logs, I suspect the issue is related to deleted IAM principals. Specifically, I see that we attempt to reflect a deleted IAM principal back in the setPolicy response.

I've also done some version testing:

  1. It does not occur on 2.12.0
  2. It does occur on 2.13.0
  3. It does occur on 3.1.0

Right now the best workaround I can find is to pin the provider to ~> 2.12.0.
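
A minimal sketch of that pin, assuming the Terraform 0.12 style of declaring the version constraint inside the provider block (any other provider settings are omitted here and would be whatever the project already uses):

```hcl
# Pin the Google provider to the 2.12.x series, the last series
# reported in this thread as not showing the error.
provider "google" {
  version = "~> 2.12.0" # allows 2.12.x patch releases only
}
```

After changing the constraint, re-running terraform init should let Terraform select a provider release that satisfies it.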

@slevenick
Collaborator

I've got a fix for this on the way: GoogleCloudPlatform/magic-modules#2819

@slevenick
Collaborator

As a workaround until the fix is released you can delete service account IAM members with the deleted: prefix and terraform will work as usual.

This issue is caused specifically by deleted service accounts that exist on the resource that terraform is managing members on, so removing references to them will allow terraform to work normally.

@slevenick
Collaborator

This fix is available now in the 2.20.1 version of the provider, and will be available for 3.x in the 3.3.0 release expected next week.

@jjorissen52
Author

jjorissen52 commented Dec 16, 2019

@slevenick I've just attempted it after pinning v2.20.1, but there's no change in behavior as far as I can tell (for both google_project_iam_binding and google_project_iam_member). Any advice for me?

Terraform v0.12.10
+ provider.archive v1.3.0
+ provider.google v2.20.1
+ provider.local v1.4.0
+ provider.null v2.1.2

image

@slevenick
Collaborator

@jjorissen52 can you provide debug logs for the failing run? That will help me debug what is going on

@jjorissen52
Author

jjorissen52 commented Dec 16, 2019

@slevenick unfortunately, earlier today I bumped up to v3.2.0 on this project for an unrelated reason, and I am unable to downgrade again (trying to do so results in an error with terraform apply).

@slevenick
Collaborator

The 3.3.0 release, which includes this fix, is expected to go out tomorrow. Please let me know if you encounter the same issue with that version, but I'll close this until then.

I believe this issue has been fixed with 2.20.1, as I am unable to reproduce it at this point.

Downgrading from 3.x to 2.x is going to be difficult and is not recommended.

@madmaze

madmaze commented Jan 4, 2020

I am definitely still encountering this issue with 2.20.1. Is it possible that version does not yet include the fix? Never mind, I checked the tag and the fix should be in there.

Error: Batch "iam-project-demo modifyIamPolicy" for request "Create IAM Members roles/stackdriver.resourceMetadata.writer serviceAccount:[email protected] for \"project \\\"demo\\\"\"" returned error: Error applying IAM policy for project "demo": Error setting IAM policy for project "demo": googleapi: Error 400: Request contains an invalid argument., badRequest

  on .terraform/modules/gke_service_account/main.tf line 33, in resource "google_project_iam_member" "service_account-roles":
  33: resource "google_project_iam_member" "service_account-roles" {

I also upgraded everything to 3.3.0 and I'm still seeing the issue. If I blow everything away and go back to 2.12.0, everything still seems to work.

@lobsterdore

lobsterdore commented Jan 7, 2020

I have just tried this with version 3.4.0 and I am getting the same error. Here's a code snippet:

resource "google_service_account" "cloud_sql" {
  account_id   = "dev-cloud-sql"
  display_name = "dev-cloud-sql"
}

resource "google_project_iam_binding" "cloud_sql_iam" {
  depends_on = [google_service_account.cloud_sql]
  role    = "roles/cloudsql.client"

  members = [
    "serviceAccount:${google_service_account.cloud_sql.email}"
  ]
}

Error output:

Error: Batch "iam-project-xxx modifyIamPolicy" for request "Set IAM Binding for role \"roles/cloudsql.client\" on \"project \\\"xxx\\\"\"" returned error: Error applying IAM policy for project "xxx": Error setting IAM policy for project "xxx": googleapi: Error 400: Request contains an invalid argument., badRequest

  on ../../../modules/db_database/main.tf line 20, in resource "google_project_iam_binding" "cloud_sql_iam":
  20: resource "google_project_iam_binding" "cloud_sql_iam" {

@slevenick
Collaborator

@madmaze or @lobsterdore can you include a debug log for the failed apply?

I am able to apply the config provided with 3.3.0, but a debug log would help identify the issue

@jjorissen52
Author

@slevenick, I just upgraded to v3.4.0 and can confirm that this is still affecting me. Debug Logs

Terraform v0.12.10
+ provider.archive v1.3.0
+ provider.google v3.4.0
+ provider.local v1.4.0
+ provider.null v2.1.2

terraform apply -target=module.booklawyer.module.etl.google_project_iam_binding.sql_client

Shows same error as before:

Error: Batch "iam-project-booklawyer-dev-259701 modifyIamPolicy" for request "Set IAM Binding for role \"roles/cloudsql.client\" on \"project \\\"booklawyer-dev-259701\\\"\"" returned error: Error applying IAM policy for project "booklawyer-dev-259701": Error setting IAM policy for project "booklawyer-dev-259701": googleapi: Error 400: Request contains an invalid argument., badRequest

  on ../etl/iam.tf line 12, in resource "google_project_iam_binding" "sql_client":
  12: resource "google_project_iam_binding" "sql_client" {

Debug Logs

@slevenick
Collaborator

@jjorissen52 That is odd. Can you apply the same config on a new (clean) project?

I suspect that there is something strange happening with the IAM policy for your existing project. I believe this is an unrelated issue, but it presents with the same (not very helpful) error message.

Looking at the debug log, I would guess that this is causing the failure:

2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5:   {
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5:    "role": "roles/owner",
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5:    "members": [
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5:     "user:",
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5:     "user:",
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5:     "user:",
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5:     "user:"
2020-01-07T15:36:29.562-0600 [DEBUG] plugin.terraform-provider-google_v3.4.0_x5:    ]

Terraform receives an IAM policy that has a series of members named user: from the API. To my eye this looks blatantly wrong, and using the iam_binding resource within terraform attempts to preserve any existing members, so it posts the same series of user: members back.

I believe that removing these faulty members will cause terraform to succeed. Could you try either using the console or gcloud to remove these members, or using a project_iam_policy which is authoritative?
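
For illustration, an authoritative policy might look like the sketch below. The project ID, the owner address, and the resource names are placeholders, not values from this issue. Because google_project_iam_policy replaces the entire project policy, any binding left out of the data source (including ones for Google-managed service accounts) is removed on apply, which can lock owners out of the project, so treat this as a sketch to adapt with care.

```hcl
# Hypothetical example: every binding the project needs must be listed,
# since anything omitted is removed when this policy is applied.
data "google_iam_policy" "project" {
  binding {
    role    = "roles/owner"
    members = ["user:owner@example.com"] # placeholder owner
  }

  binding {
    role    = "roles/cloudsql.client"
    members = ["serviceAccount:${google_service_account.sql_client.email}"]
  }
}

resource "google_project_iam_policy" "project" {
  project     = "my-project-id" # placeholder project ID
  policy_data = data.google_iam_policy.project.policy_data
}
```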

@slevenick slevenick reopened this Jan 7, 2020
@jjorissen52
Author

jjorissen52 commented Jan 7, 2020

@slevenick Apologies, I manually modified those lines so as not to publish my co-workers' email addresses. Each of those lines originally contained an [email protected]. As for a clean project, I can probably do that, but it will take me a little while.

@slevenick
Collaborator

Ok that makes sense.

I'm back to being confused about why this is happening. This seems unrelated to the other issues around deleted: IAM members, though it started occurring at the same time. It could possibly be related to changes in the IAM API that happened around the filing date of this issue

Were you able to successfully apply this config with versions of the provider after 2.12.0 prior to filing this issue?

What I'm trying to figure out is if this broke with the 2.13.0 release or if the combination of 2.13.0+ and the API changes that happened around Dec 6th are causing it.

@jjorissen52
Author

@slevenick I had never attempted this particular role assignment (roles/cloudsql.client) using a resource "google_project_iam_binding" "" {} block before on any version, but I do have a project that assigns a role which currently uses provider.google v2.16.0.

resource "google_project_iam_binding" "cloudbuild-sa-user" {
    project = "${google_project_services.project.project}"
    role    = "roles/iam.serviceAccountUser"

    members = [
      "serviceAccount:${local.cloud_build_sa}",
    ]
}

Unfortunately, I cannot tell if this is the version that was used when creating the binding or if I've since updated the version; the state history does not seem to contain information about provider versions.

@madmaze

madmaze commented Jan 8, 2020

I have debug logs for both v2.12.0 and v2.20.1. Are there any specific parts that would be most valuable to share? I'm hesitant to share the whole log; it's full of seemingly sensitive info.

I've cleaned up two snippets, 2.12.0 & 2.20.1, which seem relevant to me. Apart from the ordering, the sent data looks exactly the same except for the etag (2.12.0 json & 2.20.1 json), and I'm not sure whether that is supposed to change.
https://gist.github.com/madmaze/ccda69be4ac861f6ac0fc15cdf9e8bf3

Two other differences seem to be in the headers:

  • User-Agent: terraform 0.12.4 vs terraform 0.12.13 (I only have 0.12.13 installed)
  • 2.20.1 also adds: X-Goog-Api-Client: gl-go/1.11.0 gdcl/20191007

@michyliao
Contributor

michyliao commented Jan 13, 2020

I am also seeing this issue when applying iam_member with provider.google: version = "~> 3.4"

Error: Batch "iam-project-<project-id> modifyIamPolicy" for request "Create IAM Members roles/storage.objectAdmin serviceAccount:<service-account-id>@<project-id>.iam.gserviceaccount.com for \"project \\\"<project-id>\\\"\"" returned error: Error applying IAM policy for project "<project-id>": Error setting IAM policy for project "<project-id>": googleapi: Error 400: The role name must be in the form "roles/{role}", "organizations/{organization_id}/roles/{role}", or "projects/{project_id}/roles/{role}"., badRequest

  role       = "roles/storage.objectAdmin"
  member     = "serviceAccount:${module.module-name.email}"

In the debug logs, I am seeing this:
eval: *terraform.EvalMaybeTainted

@slevenick
Collaborator

@michyliao that looks like a different issue. Can you file a separate issue with debug logs included?

@slevenick
Collaborator

I'm unable to track this down by just the error message from the debug logs (invalid argument is very generic)

I'll probably need to be able to reproduce this to make further progress. @madmaze can you send me the full debug logs for a failing run? It would help to have the full request/response pair without any changes. If you don't want to post them publicly could you send them to my username @google.com

@jjorissen52
Author

@slevenick It seems that, for the affected project, resource "google_project_iam_binding" always fails to apply. Should I update the title to more accurately describe the issue?

@akrasnov-drv

I hit this bug just today and am very surprised that it has gone unfixed for months.
After wasting several hours, I found that the member/binding resources fail when there is a user in the project with capital letters in its ID (email).
Fortunately, I had just one inactive user with capital letters, so I was able to remove it and apply my "google_project_iam_member" rules.

The error message "Error 400: Request contains an invalid argument., badRequest" is misleading. As I wrote above, the actual cause is capital letters in a project user's ID (in our case a user with "owner" permissions, if that makes any difference).

The strangest part is that I can't add that user back with lowercase letters. Google looks up the email I provide (lowercase) in its user database(s) and adds it with capital letters again.

Please fix.
// I hope this message saves someone some time.

@slevenick
Collaborator

Hey @akrasnov-drv sorry that this caused issues for you.

How are you adding back the user with lower case letters? Can you give me an overview of your workflow, like are you using terraform to attempt to add this user back, but it gets sent as [email protected] and comes back as [email protected]?

@akrasnov-drv

Hi @slevenick
User creation is not actually relevant here. It's just another side effect that adds trouble.
I created the user in the Google console (IAM). I specified the lowercase [email protected], and Google found it, but then added the user as [email protected] (likely that is how the user originally registered it in Gmail).
The terraform google provider bug is that it can't work with such "unusually formatted" emails, and it produces a misleading error.
I understand that email addresses are normally treated as case insensitive. But Google keeps the case, therefore the google provider should support this too.

@slevenick
Collaborator

Hm, can you provide debug logs for the failing run? I'm unable to create a user with capital letters in their name. I have created a user with capital letters, but the IAM console only finds it as lowercase, which doesn't cause any issues.

@akrasnov-drv

akrasnov-drv commented Mar 10, 2020

Yes, sure.
As I wrote before, Google provides the email it finds in its databases and keeps the capitalization as stored there. I don't know if you can register a new Google user with capital letters in the email now, but it was definitely possible in the past.

Test code

resource "google_service_account" "del_me" {
  account_id   = "del-me"
  display_name = "bug test sa"
}
resource "google_project_iam_member" "bug_test_role" {
  role       = "roles/compute.instanceAdmin"
  member     = "serviceAccount:${google_service_account.del_me.email}"
  depends_on = [google_service_account.del_me]
}

Error

google_project_iam_member.bug_test_role: Creating...

Error: Batch "iam-project-my-project modifyIamPolicy" for request "Create IAM Members roles/compute.instanceAdmin serviceAccount:[email protected] for \"project \\\"my-project\\\"\"" returned error: Error applying IAM policy for project "my-project": Error setting IAM policy for project "my-project": googleapi: Error 400: Request contains an invalid argument., badRequest. To debug individual requests, try disabling batching: https://www.terraform.io/docs/providers/google/guides/provider_reference.html#enable_batching

  on security.tf line 19, in resource "google_project_iam_member" "bug_test_role":
  19: resource "google_project_iam_member" "bug_test_role" {

The log (attached, with some security related masking) is for google-beta but it fails the same way for google too.
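
The error above suggests disabling batching to debug individual requests. A minimal sketch of that setting, assuming the batching block described in the provider reference linked in the error message (other provider settings are omitted, and the same block would presumably go on the google-beta provider if that is the one in use):

```hcl
# Send each IAM change as its own request so the debug log shows
# exactly which request the API rejects.
provider "google" {
  batching {
    enable_batching = false
  }
}
```

This only changes how requests are grouped. It is a debugging aid, not a fix for the 400 error itself.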

@akrasnov-drv

tf.log

@slevenick
Collaborator

That's very unusual. How did you create the user with capital letters? Is it just an old email that already existed?

And you have found that removing the user with capital letters allows you to apply the binding?

I'll ask around about why the API would be returning upper case values. If this is intended, we should handle it correctly in Terraform.

@akrasnov-drv

  • Likely yes, that's the email the user provided. It's probably old.
  • Yes. Luckily, the problem user does not currently use GCP, so I could temporarily remove it. Now all binding/membership works. As I wrote before, I tried to re-add the user in lowercase letters, but Google added it again with capital ones, as it originally was (and you saw this behavior when you tried to add a user with capital letters). After that, binding/membership stopped working again.

There are plenty of complaints on the Internet about these resources not working. I believe all (or most) of them have this issue (users with upper case letters). This should be handled by the terraform provider. I do not believe Google will update its user databases (or API)...

@slevenick
Collaborator

@jjorissen52 does your IAM policy have users with upper case letters?

I'm tracking down the intended behavior here, and will definitely handle this in the provider if needed

@jjorissen52
Author

@slevenick The project does have one user with capital letters in the email, though none of bindings defined via terraform do anything with that user. Don't know if that makes a difference.

@akrasnov-drv

Yes, I also do nothing with the problem user. But you can see it in the debug log, and its mere existence breaks the workflow.
@josephlewis42 if you have the option to (temporarily) remove that user, you'll see it fixes your terraform processing.

@jjorissen52
Author

@akrasnov-drv @slevenick That was it.

  1. Run apply with the binding. Failure.
  2. Remove user with capital letters in their Gmail account from IAM via cloud console.
  3. Run apply with binding. Success!

@slevenick
Collaborator

@akrasnov-drv thank you for figuring out the root cause of this issue!

I still cannot reproduce, but it seems like this is a (somewhat) common case, so I'll find a fix

@lpzdvd-packlink

I ended up here facing the same issue. I was using google_project_iam_member with the member set as

[email protected]

Fixed by using:

serviceAccount:[email protected]

It's in the docs anyway.

@slevenick
Collaborator

I'm still having trouble reproducing this issue, and I believe that there is something strange going on with the particular emails being used here as emails are not handled case sensitively by the API.

Can I have one of you @akrasnov-drv or @jjorissen52 send me the actual email that is causing the problems? It will help me track down what exactly about these users is causing the issue.

You can send it to my github username @google.com

@akrasnov-drv

Hi,
Have you seen the email I sent you about a week ago?
Any progress?

@slevenick slevenick added the persistent-bug Hard to diagnose or long lived bugs for which resolutions are more like feature work than bug work label Mar 23, 2020
@innovia

innovia commented Mar 24, 2020

@slevenick

I hit the same issue today running the public terraform GKE module.

I believe the issue happens when attempting to add a role for a new service account to an existing policy: you first have to fetch the policy, which includes the user with the capital letter, then append to it and apply it.

If you can point me to the code where this is done, I can try to replicate it using the gcloud CLI and see if it's an SDK issue or an implementation issue (usually the SDK will make fixes to the policy before applying it).

update:

I'm unable to replicate it on a single role that already contains a CamelCase user name. Maybe it's an issue with the size of the payload?

resource "google_service_account" "sa" {
  account_id   = "terratest"
  display_name = "Terratest Service Account"
}

resource "google_project_iam_member" "log_writer" {
  project = "ami-playground"
  role    = "roles/logging.logWriter"
  member  = "serviceAccount:${google_service_account.sa.email}"
}

@slevenick
Collaborator

Surprisingly I'm unable to reproduce this issue in my own project. If I add a user with a capital letter, it behaves the same way as in all of the cases described here, where Terraform lowercases any capital letters coming from the API, but in all of my cases the API accepts the lowercase version.

For example, the API will return:

"role": "roles/browser",
"members": [
"user:[email protected]"
 ],

I add a binding with a different user, posting back a policy with

"role": "roles/browser",
"members": [
"user:[email protected]"
 ],

Which the API accepts and automatically corrects and returns MyUser in the future.

I'm trying to debug with the team internally, and may reach out to some of you for help in reproducing this for them

@akrasnov-drv

@slevenick
Thank you for the efforts :)
Try using the user I sent you by mail. In my project it breaks binding functions with 100% consistency. I added and removed it already about 5-7 times.
In my project this user has "owner" rights if it changes anything.

@cee-dub

cee-dub commented Apr 2, 2020

I was just experiencing what seems like a related issue to this and #4276 and was able to solve it. Maybe this can help others in the thread.

I have a resource "google_project_iam_custom_role", a data "google_iam_policy" (not certain this is required), and a resource "google_project_iam_member". The API was returning the error googleapi: Error 400: Role roles/myCustomRole is not supported for this resource., badRequest when trying to create the google_project_iam_member.

Returned the badRequest error:

resource "google_project_iam_member" "mem" {
  role   = "roles/${google_project_iam_custom_role.role.role_id}"
  member = "serviceAccount:${data.google_service_account.sa.email}"
}

Succeeded:

resource "google_project_iam_member" "mem" {
  role   = "projects/${var.project}/roles/${google_project_iam_custom_role.role.role_id}"
  member = "serviceAccount:${data.google_service_account.sa.email}"
}

@slevenick
Collaborator

Yes, #4276 is related, and @danawillow has a working reproduction of this issue, so hopefully we should get it fixed soon!

I'll close this as a duplicate at this point as #4276 is the same issue

@ghost

ghost commented May 3, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators May 3, 2020