-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HCPF-1961: Implement retry mechanism for Get and Set Policy #870
Conversation
@@ -64,7 +67,7 @@ func (u *orgIAMPolicyUpdater) GetResourceIamPolicy(ctx context.Context) (*models | |||
params.ID = u.client.Config.OrganizationID | |||
res, err := u.client.Organization.OrganizationServiceGetIamPolicy(params, nil) | |||
if err != nil { | |||
diags.AddError("failed to retrieve organization IAM policy", err.Error()) | |||
diags.Append(customdiags.NewErrorDiagnosticWithErrorCode("failed to retrieve organization IAM policy", err.Error(), status.Code(err))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So status.Code(err)
won't work (99% sure). The issue is that the status.Code() method is expecting the error type to be a grpc/status type: https://github.com/grpc/grpc-go/blob/v1.64.0/status/status.go#L96-L145
Since this error is coming from an HTTP request it is not being typed as a Status error. Instead it will be typed as: https://github.com/hashicorp/hcp-sdk-go/blob/main/clients/cloud-resource-manager/stable/2019-12-10/client/organization_service/organization_service_get_iam_policy_responses.go#L114-L118
The NewErrorDiagnosticWithErrorCode()
could maybe be replaced to expect a type that has a method of GetPayload() *cloud.GoogleRPCStatus
or Code() int
and use that instead?
Alternatively the error handling here could retrieve and pass the code by converting the error to the underlying type that has the Code() method.
// Apply the policy | ||
ep, diags = u.SetResourceIamPolicy(ctx, FromMap(ep.Etag, bindings)) | ||
if diags.HasError() { | ||
if customdiags.HasConflictError(diags) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a comment here explaining that the policy object has changed and the etag is no longer valid and how the retry works.
time.Sleep(backoff) | ||
backoff *= 2 | ||
if backoff > maxBackoff { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm if backoff is at maxBackoff, we will still sleep maxBackoff and then fail. Should we check we are over the max before sleeping.
ep, diags = u.SetResourceIamPolicy(ctx, FromMap(ep.Etag, bindings)) | ||
if diags.HasError() { | ||
if customdiags.HasConflictError(diags) { | ||
log.Printf("[DEBUG]: Operation failed due to conflicts. Operation will be restarted after %s\n", backoff) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With log package you shouldn't need the newline character: https://go.dev/play/p/PCT9cNfOYGu
) | ||
|
||
// ErrorDiagnosticWithErrorCode is an error diagnostic that stored the error code. | ||
type ErrorDiagnosticWithErrorCode struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two comments:
- Can we be explicit that this is the HTTPStatusCode. Since there is both the gRPC Status Code and the HTTP Status Code, it is slightly ambiguous what code to check.
- Since we are in a package called
customdiags
, what do you think about shortening the struct name to something likeErrorHTTPStatusCode
since it will be imported ascustomdiags.ErrorHTTPStatusCode
type ErrorDiagnosticWithErrorCode struct { | ||
detail string | ||
summary string | ||
errorCode int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be public so the code can be inspected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add documentation to:
terraform-provider-hcp/internal/clients/iampolicy/iam.go
Lines 26 to 27 in c3f345c
// Replaces the existing IAM Policy attached to a resource. | |
SetResourceIamPolicy(ctx context.Context, policy *models.HashicorpCloudResourcemanagerPolicy) (*models.HashicorpCloudResourcemanagerPolicy, diag.Diagnostics) |
if customdiags.HasConflictError(diags) { | ||
// Policy object has changed since it was last gotten and the etag is now different. | ||
// Continuously retry getting and setting the policy with an increasing backoff period until the maximum backoff period is reached. | ||
log.Printf("[DEBUG]: Operation failed due to conflicts. Operation will be restarted after %s", backoff) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this log line be below the maxBackoff if statement? That way it either say there operation will be retried or says it has aborted, but not both.
c3f345c
to
6b79745
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just chiming in here from the identity team. Looks like our tests still pass with these changes. Was there any immediate action that we needed to take on our end to accommodate these changes?
@delores-hashicorp Did the same diags.Append(customdiags.NewErrorHTTPStatusCode(...
bit need to be applied to the Group Policy resources as well?
Tests passing:
=== RUN TestAccGroupIamPolicyResource
--- PASS: TestAccGroupIamPolicyResource (11.43s)
=== RUN TestAccGroupIamBindingResource
--- PASS: TestAccGroupIamBindingResource (14.36s)
PASS
coverage: 70.2% of statements
ok github.com/hashicorp/terraform-provider-hcp/internal/provider/iam 156.022s
@@ -20,12 +20,14 @@ import ( | |||
// Implementations should be created per resource and should keep track of the | |||
// resource identifier. | |||
type ResourceIamUpdater interface { | |||
// Fetch the existing IAM policy attached to a resource. | |||
// GetResourceIamPolicy fetches the existing IAM policy attached to a resource. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for cleaning up the comments ✨
Thanks for the catch. I didn't realize it had been implemented |
44aeceb
to
c7d9a4c
Compare
🛠️ Description
🏗️ Acceptance tests
Output from acceptance testing:
$ make testacc TESTARGS='-run=TestAccVaultSecretsAppIamBindingResource' ok github.com/hashicorp/terraform-provider-hcp/internal/provider/resourcemanager 1.390s [no tests to run] === RUN TestAccVaultSecretsAppIamBindingResource --- PASS: TestAccVaultSecretsAppIamBindingResource (8.67s)
$ make testacc TESTARGS='-run=TestAccProjectIamBindingResource' === RUN TestAccProjectIamBindingResource --- PASS: TestAccProjectIamBindingResource (15.90s) PASS ok github.com/hashicorp/terraform-provider-hcp/internal/provider/resourcemanager 16.784s
$ make testacc TESTARGS='-run=TestAccProjectIamPolicyResource' === RUN TestAccProjectIamPolicyResource --- PASS: TestAccProjectIamPolicyResource (10.59s)
$ make testacc TESTARGS='-run=TestAccOrganizationIamBindingResource' === RUN TestAccOrganizationIamBindingResource --- PASS: TestAccOrganizationIamBindingResource (15.02s)