Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry on 5xx, error on 4xx #465

Merged
merged 4 commits into from
Nov 1, 2021
Merged

Retry on 5xx, error on 4xx #465

merged 4 commits into from
Nov 1, 2021

Conversation

rylev
Copy link
Contributor

@rylev rylev commented Oct 29, 2021

An attempt to fix #464

The policy now works as following:

  • on 2xx or 3xx status codes, returns Ok(response)
  • on 4xx status codes, returns Err(HttpError::UnexpectedStatusCode)
  • on 5xx status codes, logs that there was an error and retries (after the specific policies designated retry time)
  • on other error (e.g., no network, DNS error, etc.), retries (after retry time)

Questions:

  • Does the above policy look good?
  • Should we error on 3xx status codes? The underlying http client should follow redirects (though we should test this) so in theory we should never see 3xx status codes.
  • Once we know we have a 2xx status code, does validating the status code make any sense? I assume we don't really care if an endpoint returns a 201 instead of a 200 for example.

Copy link
Contributor

@MindFlavor MindFlavor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, the retry policy is really dumb right now!

I suggest just a little change in logic to align our code to the one of the C# SDK.

sdk/core/src/policies/retry_policies/retry_policy.rs Outdated Show resolved Hide resolved
log::error!("server returned error 500 status: {}", status);
Box::new(response.validate(StatusCode::OK).await.unwrap_err())
}
Err(error) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The C# SDK retries IOExceptions so we should retry here too (assuming we can discriminate the error to errors raised by the http policy).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you saying we should only retry IO related errors? Right now any error is retried.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry, that was what I meant... 😔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I can tackle that in another PR. This might require that we change the pipeline to not work with Box<dyn Error> but with azure_core::Error instead.

@rylev rylev requested a review from MindFlavor October 29, 2021 14:06
@rylev
Copy link
Contributor Author

rylev commented Oct 29, 2021

@MindFlavor @heaths I've made a change to this PR where we now treat any 2xx (and for that matter any 3xx or 1xx if those should ever happen to be returned) as successes. We are now validating in the retry policy that we get a successful response and if not either retry (if it's a response code we hard code as being a retry status code) or immediately return an error.

Copy link
Contributor

@MindFlavor MindFlavor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat! Thank you! ❤️

}
Err(error) => {
log::debug!(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line 67 uses log::error for the same message. Should these be consistent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be retried (and hopefully succeed on retry) where as line 67 will be returned to the user as an error. Therefore, line 67 represents an error that the user will definitely handle while hopefully this line is just a transient state.

@rylev rylev merged commit c6f94bc into Azure:main Nov 1, 2021
@rylev rylev deleted the retry-on-500 branch November 1, 2021 08:26
@heaths heaths added the Azure.Core The azure_core crate label Jan 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Azure.Core The azure_core crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Retry policy unable to react to unexpected HTTP status codes
4 participants