Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add retryable transport for errors #1704

Merged
merged 60 commits into from
Jan 10, 2024
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
Show all changes
60 commits
Select commit Hold shift + click to select a range
8d199e7
Add retryable transport for errors
dcfranca May 29, 2023
5ae5432
Update documentation with new parameters for retry
dcfranca May 29, 2023
df9f8d5
Change default of max_retries to 3
dcfranca May 31, 2023
e41f935
Fix typo in naming
dcfranca May 31, 2023
0da71b9
Update github/transport_test.go
kfcampbell Jun 1, 2023
0b3ba89
Merge branch 'main' into add-retries
kfcampbell Jun 1, 2023
c2c341c
Add error check for data seek
dcfranca Jun 3, 2023
e31cac0
Merge branch 'add-retries' of github.com:dcfranca/terraform-provider-…
dcfranca Jun 3, 2023
88e9ae8
Merge branch 'main' into add-retries
dcfranca Jun 6, 2023
2aa7bfd
Merge branch 'main' into add-retries
dcfranca Jun 15, 2023
bb97a82
Merge branch 'main' into add-retries
dcfranca Jun 16, 2023
61d9a17
Merge branch 'main' into add-retries
dcfranca Jun 20, 2023
27ded6c
Merge branch 'main' into add-retries
dcfranca Jun 21, 2023
218287f
Consolidate the default retriable errors on a function
dcfranca Jun 22, 2023
bd787ef
Merge branch 'add-retries' of github.com:dcfranca/terraform-provider-…
dcfranca Jun 22, 2023
9b98c08
Merge branch 'main' into add-retries
kfcampbell Jun 22, 2023
6d9b78d
Merge branch 'main' into add-retries
kfcampbell Jun 26, 2023
c2f8580
Fix typo on the comments
dcfranca Jun 27, 2023
6ad697e
Merge branch 'main' into add-retries
dcfranca Jun 27, 2023
9bb406a
Merge branch 'main' into add-retries
dcfranca Jun 28, 2023
5f93d2e
Update vendor
dcfranca Jul 4, 2023
4d51e0d
Fix merging conflicts
dcfranca Jul 11, 2023
7ad4857
Merge branch 'main' into add-retries
dcfranca Jul 12, 2023
8962435
Fix merging conflicts
dcfranca Jul 14, 2023
70a8ec2
Update documentation with new parameters for retry
dcfranca May 29, 2023
9a682ae
Change default of max_retries to 3
dcfranca May 31, 2023
1e347cc
Fix typo in naming
dcfranca May 31, 2023
0f944d2
Add error check for data seek
dcfranca Jun 3, 2023
09a1dc5
Update github/transport_test.go
kfcampbell Jun 1, 2023
e546022
Consolidate the default retriable errors on a function
dcfranca Jun 22, 2023
f3d3ce3
Fix typo on the comments
dcfranca Jun 27, 2023
5f83f00
Don't run go mod tidy on release (#1788)
kfcampbell Jul 11, 2023
f75c2de
Merge branch 'add-retries' of github.com:dcfranca/terraform-provider-…
dcfranca Jul 14, 2023
f4a3723
Update vendors
dcfranca Jul 14, 2023
9248f3d
Merge branch 'main' into add-retries
kfcampbell Jul 14, 2023
66bbc57
Merge branch 'main' into add-retries
dcfranca Jul 21, 2023
5af0b6f
Merge branch 'main' into add-retries
kfcampbell Jul 24, 2023
db8d50e
Merge branch 'main' into add-retries
dcfranca Jul 28, 2023
f402f24
Merge branch 'main' into add-retries
dcfranca Aug 3, 2023
a234ece
Merge branch 'main' into add-retries
dcfranca Aug 7, 2023
6ba0735
Merge branch 'main' into add-retries
dcfranca Aug 10, 2023
f4274eb
Merge branch 'main' into add-retries
dcfranca Aug 14, 2023
e9534c7
Merge branch 'main' into add-retries
dcfranca Aug 15, 2023
b40c2b7
Merge branch 'main' into add-retries
dcfranca Aug 21, 2023
61e196a
Merge branch 'main' into add-retries
dcfranca Aug 23, 2023
bf61de6
Merge branch 'main' into add-retries
dcfranca Aug 29, 2023
7de4a84
Merge branch 'main' into add-retries
dcfranca Sep 1, 2023
679e033
Merge branch 'main' into add-retries
dcfranca Sep 11, 2023
564ed52
Merge branch 'main' into add-retries
dcfranca Sep 13, 2023
8465c92
Merge branch 'main' into add-retries
dcfranca Sep 19, 2023
3221167
Merge branch 'main' into add-retries
dcfranca Sep 25, 2023
c1b5e49
Merge branch 'main' into add-retries
dcfranca Sep 27, 2023
ef547ad
Merge branch 'main' into add-retries
dcfranca Oct 3, 2023
4d63684
Merge branch 'main' into add-retries
dcfranca Oct 9, 2023
053fd0b
Merge branch 'main' into add-retries
dcfranca Oct 12, 2023
fff38a1
Merge branch 'main' into add-retries
dcfranca Oct 14, 2023
4f27468
Merge branch 'main' into add-retries
nickfloyd Oct 16, 2023
23b3302
Merge branch 'main' into add-retries
dcfranca Nov 8, 2023
a2c18d9
Merge branch 'main' into add-retries
kfcampbell Jan 10, 2024
23634d2
Fix lint with APIMeta deprecation
kfcampbell Jan 10, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 10 additions & 3 deletions github/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,9 @@ type Config struct {
Insecure bool
WriteDelay time.Duration
ReadDelay time.Duration
RetryDelay time.Duration
RetryableErrors map[int]bool
dcfranca marked this conversation as resolved.
Show resolved Hide resolved
MaxRetries int
ParallelRequests bool
}

Expand All @@ -34,7 +37,7 @@ type Owner struct {
IsOrganization bool
}

func RateLimitedHTTPClient(client *http.Client, writeDelay time.Duration, readDelay time.Duration, parallelRequests bool) *http.Client {
func RateLimitedHTTPClient(client *http.Client, writeDelay time.Duration, readDelay time.Duration, retryDelay time.Duration, parallelRequests bool, retryableErrors map[int]bool, maxRetries int) *http.Client {

client.Transport = NewEtagTransport(client.Transport)
client.Transport = NewRateLimitTransport(client.Transport, WithWriteDelay(writeDelay), WithReadDelay(readDelay), WithParallelRequests(parallelRequests))
Expand All @@ -44,6 +47,10 @@ func RateLimitedHTTPClient(client *http.Client, writeDelay time.Duration, readDe
"Accept": "application/vnd.github.stone-crop-preview+json",
}, client.Transport)

if maxRetries > 0 {
client.Transport = NewRetryTransport(client.Transport, WithRetryDelay(retryDelay), WithRetryableErrors(retryableErrors), WithMaxRetries(maxRetries))
}

return client
}

Expand All @@ -55,7 +62,7 @@ func (c *Config) AuthenticatedHTTPClient() *http.Client {
)
client := oauth2.NewClient(ctx, ts)

return RateLimitedHTTPClient(client, c.WriteDelay, c.ReadDelay, c.ParallelRequests)
return RateLimitedHTTPClient(client, c.WriteDelay, c.ReadDelay, c.RetryDelay, c.ParallelRequests, c.RetryableErrors, c.MaxRetries)
}

func (c *Config) Anonymous() bool {
Expand All @@ -64,7 +71,7 @@ func (c *Config) Anonymous() bool {

func (c *Config) AnonymousHTTPClient() *http.Client {
client := &http.Client{Transport: &http.Transport{}}
return RateLimitedHTTPClient(client, c.WriteDelay, c.ReadDelay, c.ParallelRequests)
return RateLimitedHTTPClient(client, c.WriteDelay, c.ReadDelay, c.RetryDelay, c.ParallelRequests, c.RetryableErrors, c.MaxRetries)
}

func (c *Config) NewGraphQLClient(client *http.Client) (*githubv4.Client, error) {
Expand Down
25 changes: 25 additions & 0 deletions github/config_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,31 @@ func TestAccConfigMeta(t *testing.T) {

})

t.Run("returns a v3 REST API client with max retries", func(t *testing.T) {

config := Config{
Token: testToken,
BaseURL: "https://api.github.com/",
RetryableErrors: map[int]bool{
500: true,
502: true,
},
MaxRetries: 3,
}
meta, err := config.Meta()
if err != nil {
t.Fatalf("failed to return meta without error: %s", err.Error())
}

ctx := context.Background()
client := meta.(*Owner).v3client
_, _, err = client.APIMeta(ctx)
if err != nil {
t.Fatalf("failed to validate returned client without error: %s", err.Error())
}

})

t.Run("returns a v4 GraphQL API client to manage individual resources", func(t *testing.T) {

config := Config{
Expand Down
60 changes: 60 additions & 0 deletions github/provider.go
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,26 @@ func Provider() terraform.ResourceProvider {
DefaultFunc: schema.EnvDefaultFunc("GITHUB_OWNER", nil),
Description: descriptions["owner"],
},
"retryable_errors": {
Type: schema.TypeList,
Elem: &schema.Schema{Type: schema.TypeInt},
Optional: true,
DefaultFunc: func() (interface{}, error) {
defaultErrors := []int{500, 502, 503, 504}
errorInterfaces := make([]interface{}, len(defaultErrors))
for i, v := range defaultErrors {
errorInterfaces[i] = v
}
return errorInterfaces, nil
},
Description: descriptions["retryable_errors"],
},
"max_retries": {
kfcampbell marked this conversation as resolved.
Show resolved Hide resolved
Type: schema.TypeInt,
Optional: true,
Default: 3,
Description: descriptions["max_retries"],
},
"organization": {
Type: schema.TypeString,
Optional: true,
Expand Down Expand Up @@ -57,6 +77,12 @@ func Provider() terraform.ResourceProvider {
Default: 0,
Description: descriptions["read_delay_ms"],
},
"retry_delay_ms": {
Type: schema.TypeInt,
Optional: true,
Default: 1000,
Description: descriptions["retry_delay_ms"],
},
"parallel_requests": {
Type: schema.TypeBool,
Optional: true,
Expand Down Expand Up @@ -238,11 +264,17 @@ func init() {
"Defaults to 1000ms or 1s if not set.",
"read_delay_ms": "Amount of time in milliseconds to sleep in between non-write requests to GitHub API. " +
"Defaults to 0ms if not set.",
"retry_delay_ms": "Amount of time in milliseconds to sleep in between requests to GitHub API after an error response. " +
"Defaults to 1000ms or 1s if not set, the max_retries must be set to greater than zero.",
"parallel_requests": "Allow the provider to make parallel API calls to GitHub. " +
"You may want to set it to true when you have a private Github Enterprise without strict rate limits. " +
"Although, it is not possible to enable this setting on github.com " +
"because we enforce the respect of github.com's best practices to avoid hitting abuse rate limits" +
"Defaults to false if not set",
"retryable_errors": "Allow the provider to retry after receiving an error status code, the max_retries should be set for this to work" +
"Defaults to [500, 502, 503, 504]",
"max_retries": "Number of times to retry a request after receiving an error status code" +
"Defaults to 3",
}
}

Expand Down Expand Up @@ -325,6 +357,31 @@ func providerConfigure(p *schema.Provider) schema.ConfigureFunc {
}
log.Printf("[DEBUG] Setting read_delay_ms to %d", readDelay)

retryDelay := d.Get("read_delay_ms").(int)
if retryDelay < 0 {
return nil, fmt.Errorf("retry_delay_ms must be greater than or equal to 0ms")
}
log.Printf("[DEBUG] Setting retry_delay_ms to %d", retryDelay)

maxRetries := d.Get("max_retries").(int)
if maxRetries < 0 {
return nil, fmt.Errorf("max_retries must be greater than or equal to 0")
}
log.Printf("[DEBUG] Setting max_retries to %d", maxRetries)
retryableErrors := make(map[int]bool)
if maxRetries > 0 {
reParam := d.Get("retryable_errors").([]interface{})
if len(reParam) == 0 {
reParam = []interface{}{500, 502, 503, 504}
}

for _, status := range reParam {
retryableErrors[status.(int)] = true
}

log.Printf("[DEBUG] Setting retriableErrors to %v", retryableErrors)
}

parallelRequests := d.Get("parallel_requests").(bool)
isGithubDotCom, err := regexp.MatchString("^"+regexp.QuoteMeta("https://api.github.com"), baseURL)

Expand All @@ -343,6 +400,9 @@ func providerConfigure(p *schema.Provider) schema.ConfigureFunc {
Owner: owner,
WriteDelay: time.Duration(writeDelay) * time.Millisecond,
ReadDelay: time.Duration(readDelay) * time.Millisecond,
RetryDelay: time.Duration(retryDelay) * time.Millisecond,
RetryableErrors: retryableErrors,
MaxRetries: maxRetries,
ParallelRequests: parallelRequests,
}

Expand Down
25 changes: 25 additions & 0 deletions github/provider_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -162,4 +162,29 @@ func TestAccProviderConfigure(t *testing.T) {
})

})

t.Run("can be configured with max retries", func(t *testing.T) {

config := fmt.Sprintf(`
provider "github" {
token = "%s"
owner = "%s"
max_retries = 3
}`,
testToken, testOwnerFunc(),
)

resource.Test(t, resource.TestCase{
PreCheck: func() { skipUnlessMode(t, individual) },
Providers: testAccProviders,
Steps: []resource.TestStep{
{
Config: config,
ExpectNonEmptyPlan: false,
},
},
})

})

}
99 changes: 99 additions & 0 deletions github/transport.go
Original file line number Diff line number Diff line change
Expand Up @@ -197,3 +197,102 @@ func isWriteMethod(method string) bool {
}
return false
}

type RetryTransport struct {
transport http.RoundTripper
retryDelay time.Duration
maxRetries int
retryableErrors map[int]bool
}

type RetryTransportOption func(*RetryTransport)

// NewRetryTransport takes in an http.RoundTripper and a variadic list of
// optional functions that modify the RetryTransport struct itself. This
// may be used to retry after response errors 5xx, for example.
func NewRetryTransport(rt http.RoundTripper, options ...RetryTransportOption) *RetryTransport {
// Default to no retry if none is provided
defaultErrors := map[int]bool{
500: true,
dcfranca marked this conversation as resolved.
Show resolved Hide resolved
502: true,
503: true,
504: true,
}
rlt := &RetryTransport{transport: rt, retryDelay: time.Second, maxRetries: 0, retryableErrors: defaultErrors}

for _, opt := range options {
opt(rlt)
}

return rlt
}

func (t *RetryTransport) RoundTrip(req *http.Request) (*http.Response, error) {
var err error
var resp *http.Response
var dataBuffer *bytes.Reader

for retry := 0; retry <= t.maxRetries; retry++ {
// Reset the body
// Code from httpretry (https://github.com/ybbus/httpretry/blob/master/roundtripper.go#L60)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you talk to me a little bit about this code and why you chose it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How was this tested?

Copy link
Contributor Author

@dcfranca dcfranca Jun 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests I have added on transport_test.go were failing due to the body not being reset after the first retry, then I checked the package I use on some of my projects to retry (httpretry) and saw that they have a code for this scenario, so I shameless copied it and the tests succeed

I didn't figure out how to test the specific scenario of using the function GetBody, the original library also only tests without the GetBody, but if you have an idea on how to test with it I can update the tests

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a manual workload you can test this on that will reliably trigger a 500? I'm hesitant to merge/release to a wide audience without a bit more description of the validation process.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a Flux integration that quite often gives me a 500 (a few times a day), I can try linking my branch instead of the default provider there... however, I have never integrated an in development provider, so I will appreciate anything that can guide me through this
I'm also a bit afraid of doing that on this workload, it is not critical, but I would expect there are more changes here that are not tested/stable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can create the binary for this provider and point Terraform towards it by seeing our debugging instructions.

We do not have successful automated integration testing at the moment so we've been over-relying on manual testing. I don't feel super comfortable merging without manually testing this extensively first.

// if request provides GetBody() we use it as Body,
// because GetBody can be retrieved arbitrary times for retry
if req.GetBody != nil {
bodyReadCloser, _ := req.GetBody()
req.Body = bodyReadCloser
} else if req.Body != nil {

// we need to store the complete body, since we need to reset it if a retry happens
// but: not very efficient because:
// a) huge stream data size will all be buffered completely in the memory
// imagine: 1GB stream data would work efficiently with io.Copy, but has to be buffered completely in memory
// b) unnecessary if first attempt succeeds
// a solution would be to at least support more types for GetBody()

// store it for the first time
if dataBuffer == nil {
data, err := io.ReadAll(req.Body)
req.Body.Close()
if err != nil {
return nil, err
}
dataBuffer = bytes.NewReader(data)
req.ContentLength = int64(dataBuffer.Len())
req.Body = io.NopCloser(dataBuffer)
}

// reset the request body
dataBuffer.Seek(0, io.SeekStart)
}

resp, err = t.transport.RoundTrip(req)
if resp != nil && !t.retryableErrors[resp.StatusCode] {
return resp, err
}

time.Sleep(t.retryDelay)
}

return resp, err
}

// WithMaxRetries is used to set the max number of retries when encontrering an error
dcfranca marked this conversation as resolved.
Show resolved Hide resolved
func WithMaxRetries(d int) RetryTransportOption {
return func(rt *RetryTransport) {
rt.maxRetries = d
}
}

// WithRetryableErrors is used to set status codes to retry
func WithRetryableErrors(d map[int]bool) RetryTransportOption {
return func(rt *RetryTransport) {
rt.retryableErrors = d
}
}

// WithRetryDelay is used to set the delay between requests for retrying
func WithRetryDelay(d time.Duration) RetryTransportOption {
return func(rt *RetryTransport) {
rt.retryDelay = d
}
}
Loading