Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

project: check for billing account permissions as pre-requisite #5671

Closed
wants to merge 7 commits into from
Closed

project: check for billing account permissions as pre-requisite #5671

wants to merge 7 commits into from

Conversation

umairidris
Copy link
Contributor

@umairidris umairidris commented Feb 13, 2020

Currently, the provider will create the project but then fail the billing account link step leaving the resource in a 'tainted' state. This pre-emptively checks whether the caller will have access to the billing account before the project is created.

We now get an error like:

Error: failed pre-requisites: missing permissions on "billingAccounts/abc": [billing.resourceAssociations.create]

terraform-google-modules/terraform-google-project-factory#373

@ghost ghost added the size/s label Feb 13, 2020
@ghost ghost requested review from megan07 February 13, 2020 16:06
@umairidris umairidris changed the title project: check for billing account perms as pre-requisite project: check for billing account permissions as pre-requisite Feb 13, 2020
Copy link
Collaborator

@rileykarson rileykarson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very drive-by and very not necessary: I would love to see this as a CustomizeDiff instead of an apply-time check. I think we could fail out at plan time for an even better user experience.

Copy link
Contributor

@chrisst chrisst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this! I have a couple quick things.

}
resp, err := config.clientBilling.BillingAccounts.TestIamPermissions(ba, req).Do()
if err != nil {
return fmt.Errorf("failed to check permissions on billing account %q: %v", ba, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's super edge case but I can see a possibility that a user will fail this call but still be able to successfully create the project and update the billing account. For example a transient network error could cause this to fail but since we wrap the call to create in retries it will survive these such errors.

You could wrap this in retries, but IMO we can just log this at WARN and return nil as this is good enough for the 99% case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So one of the problems the project factory preconditions script tries to solve is when a service account is running but the account's project hasn't enabled certain APIs. So for example if the project didn't enable cloudbilling api then this test iam permissions call would fail I imagine. Thus, I think we should leave this check in. I can make it a retry if you prefer.

WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisst, if this check failed as just a warning in the logs, how would a typical user notice?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danawillow It will just revert to existing behavior:
a) if the permission isn't valid, it would taint the resource
b) if the permission is valid, the resource would create correctly.

Since this is just a precondition check it doesn't seem legitimate that a precondition should fail the resource creation for a transient error.

Relying on the failure to fetch permissions as a signal that api's aren't correctly enabled seems like it could just be another explicit check. We have a similar tainting problem when trying to delete the default network but compute.googleapis.com isn't enabled.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@umairidris I'd be happy with either explicit checks for api enablement + silent failures or retries added. My primary concern is not adding a new category of failing to create a project

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misunderstood what @chrisst was saying at first, but I get it now. Basically, if the call to retrieve permissions fails, then we should ignore that because it was likely transient. but if the call succeeds and the permission isn't there, then fail before trying to create the project. Checking for API enablement is something we also can do first to make sure that the call to retrieve permissions fails for only transient or new reasons, not ones we already understand, and retries on the testIamPermissions call wouldn't fix that because they would just fail every time. I'm all caught up now :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a way to check if an API is enabled on the caller? I haven't been able to find a good way apart from trying to just use the API client to do something...

That also makes checking for compute api tricky as you need a project to be able to call the API on, but you don't have the project in the pre-requisites check...

The preconditions script tries to work around this by calling list APIs on the project set in gcloud config but this isn't always a valid check. For example if I am running from my local workstation with my own credentials then it doesn't matter what project I set in gcloud, I believe it uses the default gcloud project for my API calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

friendly ping @chrisst

Copy link
Contributor

@chrisst chrisst Feb 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I didn't think about the chicken/egg problem with enabling the compute API. I did some digging and I'm not sure if billing accounts are a parent type that can have services enabled/disabled on it. I think this change is definitely an improvement as is, so I'm happy accepting that there's a small chance somebody will fail due to a transitive error. If we see people filing issues about it we can reassess.

Copy link
Contributor Author

@umairidris umairidris Feb 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisst I was thinking, if we create a compute API client and make a call on a non-existent project, would the error be the project doesn't exist or that the API is not enabled on the caller? I am guessing it would be the latter. If so, would this be an appropriate "hack" to work around this? i.e. call the API and check if the error is API not enabled OR check the error is 'resource not found' which implies the API must be enabled?

I can do some testing around this idea...

if err != nil {
return fmt.Errorf("failed to check permissions on billing account %q: %v", ba, err)
}
if diff := diffStringSlices(resp.Permissions, req.Permissions); len(diff) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit overly complicated. Can you just loop through resp.Permissions and make sure that billing.resourceAssociations.create is present instead of doing this slice comparison?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@chrisst
Copy link
Contributor

chrisst commented Feb 13, 2020

@rileykarson that's an interesting thought. It would have to be restricted to only call/fail if the resource was just about to be created (ie not on update). Also in theory a user could be generating a plan output with lower credentials to be delegated to an account with higher privileges and failing at plan time could break that work flow.
For now I think the approach in this PR is a good conservative option.

@chrisst chrisst self-assigned this Feb 13, 2020
if err != nil {
return fmt.Errorf("failed to check permissions on billing account %q: %v", ba, err)
}
if has := sliceContains(perm, resp.Permissions); !has {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we already have a function in this package that does this, so you can use stringInSlice(resp.Permissions, perm). You also don't use the value of has anywhere so you could just do:

if !stringInSlice(resp.Permissions, perm) {
...
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@ghost ghost added size/xs and removed size/s labels Feb 14, 2020
@ghost
Copy link

ghost commented Mar 21, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 21, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants