Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

project: check for billing account permissions as pre-requisite #5671

Closed
wants to merge 7 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 24 additions & 0 deletions google/resource_google_project.go
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,10 @@ func resourceGoogleProject() *schema.Resource {
func resourceGoogleProjectCreate(d *schema.ResourceData, meta interface{}) error {
config := meta.(*Config)

if err := resourceGoogleProjectCheckPreRequisites(config, d); err != nil {
return fmt.Errorf("failed pre-requisites: %v", err)
}

var pid string
var err error
pid = d.Get("project_id").(string)
Expand Down Expand Up @@ -173,6 +177,26 @@ func resourceGoogleProjectCreate(d *schema.ResourceData, meta interface{}) error
return nil
}

func resourceGoogleProjectCheckPreRequisites(config *Config, d *schema.ResourceData) error {
ib, ok := d.GetOk("billing_account")
if !ok {
return nil
}
ba := "billingAccounts/" + ib.(string)
const perm = "billing.resourceAssociations.create"
req := &cloudbilling.TestIamPermissionsRequest{
Permissions: []string{perm},
}
resp, err := config.clientBilling.BillingAccounts.TestIamPermissions(ba, req).Do()
if err != nil {
return fmt.Errorf("failed to check permissions on billing account %q: %v", ba, err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's super edge case but I can see a possibility that a user will fail this call but still be able to successfully create the project and update the billing account. For example a transient network error could cause this to fail but since we wrap the call to create in retries it will survive these such errors.

You could wrap this in retries, but IMO we can just log this at WARN and return nil as this is good enough for the 99% case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So one of the problems the project factory preconditions script tries to solve is when a service account is running but the account's project hasn't enabled certain APIs. So for example if the project didn't enable cloudbilling api then this test iam permissions call would fail I imagine. Thus, I think we should leave this check in. I can make it a retry if you prefer.

WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisst, if this check failed as just a warning in the logs, how would a typical user notice?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danawillow It will just revert to existing behavior:
a) if the permission isn't valid, it would taint the resource
b) if the permission is valid, the resource would create correctly.

Since this is just a precondition check it doesn't seem legitimate that a precondition should fail the resource creation for a transient error.

Relying on the failure to fetch permissions as a signal that api's aren't correctly enabled seems like it could just be another explicit check. We have a similar tainting problem when trying to delete the default network but compute.googleapis.com isn't enabled.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@umairidris I'd be happy with either explicit checks for api enablement + silent failures or retries added. My primary concern is not adding a new category of failing to create a project

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misunderstood what @chrisst was saying at first, but I get it now. Basically, if the call to retrieve permissions fails, then we should ignore that because it was likely transient. but if the call succeeds and the permission isn't there, then fail before trying to create the project. Checking for API enablement is something we also can do first to make sure that the call to retrieve permissions fails for only transient or new reasons, not ones we already understand, and retries on the testIamPermissions call wouldn't fix that because they would just fail every time. I'm all caught up now :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a way to check if an API is enabled on the caller? I haven't been able to find a good way apart from trying to just use the API client to do something...

That also makes checking for compute api tricky as you need a project to be able to call the API on, but you don't have the project in the pre-requisites check...

The preconditions script tries to work around this by calling list APIs on the project set in gcloud config but this isn't always a valid check. For example if I am running from my local workstation with my own credentials then it doesn't matter what project I set in gcloud, I believe it uses the default gcloud project for my API calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

friendly ping @chrisst

Copy link
Contributor

@chrisst chrisst Feb 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I didn't think about the chicken/egg problem with enabling the compute API. I did some digging and I'm not sure if billing accounts are a parent type that can have services enabled/disabled on it. I think this change is definitely an improvement as is, so I'm happy accepting that there's a small chance somebody will fail due to a transitive error. If we see people filing issues about it we can reassess.

Copy link
Contributor Author

@umairidris umairidris Feb 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chrisst I was thinking, if we create a compute API client and make a call on a non-existent project, would the error be the project doesn't exist or that the API is not enabled on the caller? I am guessing it would be the latter. If so, would this be an appropriate "hack" to work around this? i.e. call the API and check if the error is API not enabled OR check the error is 'resource not found' which implies the API must be enabled?

I can do some testing around this idea...

}
if !stringInSlice(resp.Permissions, perm) {
return fmt.Errorf("missing permission on %q: %v", ba, perm)
}
return nil
}

func resourceGoogleProjectRead(d *schema.ResourceData, meta interface{}) error {
config := meta.(*Config)
parts := strings.Split(d.Id(), "/")
Expand Down