fixes #926 - expose retry information in exceptions #965
Conversation
Current coverage is 97.32% (diff: 100%)

@@            develop    #965   diff @@
==========================================
  Files            44      44
  Lines          7140    7147     +7
  Methods           0       0
  Messages          0       0
  Branches          0       0
==========================================
+ Hits           6949    6956     +7
  Misses          191     191
  Partials          0       0
@@ -155,6 +155,11 @@ def _send_request(self, request_dict, operation_model):
             request_dict, operation_model)
         success_response, exception = self._get_response(
             request, operation_model, attempts)
+        # this *should* be None or a tuple
+        if isinstance(success_response, type(())) and len(success_response) > 1:
I think it would be better if you just check for success_response == None
Will do, I just wasn't sure how many assumptions I could make about success_response.
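For illustration, a minimal sketch of the simpler guard being discussed, assuming `success_response` follows the contract implied above (either `None` or an `(http_response, parsed)` tuple); the helper name and the `RetryAttempts` key here are hypothetical, not botocore's actual code:

```python
def attach_retry_count(success_response, attempts):
    """Sketch only: annotate a parsed response with how many retries happened.

    Assumes ``success_response`` is either ``None`` or an
    ``(http_response, parsed)`` tuple, so a plain ``None`` check replaces
    the isinstance/len inspection from the diff above.
    """
    if success_response is not None:
        http_response, parsed = success_response
        # 'RetryAttempts' is an illustrative key name, not a confirmed API.
        parsed.setdefault('ResponseMetadata', {})['RetryAttempts'] = attempts - 1
    return success_response
```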
I think I would rather have a list of reasons we retried instead of the number of attempts. Something like:

    success_response['ResponseMetadata']['Retries'] = [
        {
            'Code': '400',
            'Message': 'Throttling'
        },
        {
            'Code': '500',
            'Message': 'Server Error'
        }
    ]

It gives the caller more information about why we retried. I'd like to see what others think.
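For context, a caller-side sketch of how such a list could be consumed if it were added; the `Retries` key and its shape come from the proposal above, not from a shipped API:

```python
import botocore.session
from botocore.exceptions import ClientError

session = botocore.session.get_session()
client = session.create_client('dynamodb', region_name='us-east-1')

try:
    client.describe_table(TableName='example-table')
except ClientError as err:
    metadata = err.response.get('ResponseMetadata', {})
    # 'Retries' is the key proposed in this thread; fall back to an empty
    # list because it may not be present.
    for retry in metadata.get('Retries', []):
        print('retried after %s: %s' % (retry.get('Code'), retry.get('Message')))
```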
@JordonPhillips updated per your suggestions
Awesome! Two more minor bits: Could you please squash your commits? Also, you seem to have changed the file permissions on
With this new change, is I'd also like to see tests that show that we created the list/dict after retrying.
@jamesls Ah, I didn't catch that. Yeah, I think having
Ok. Sorry for leaving this for a few days, but I'll try and work up those changes before the end of the week.
@JordonPhillips perms fixed. I'll re-push and squash once you're both OK with my changes - I really don't like rewriting or discarding history until I know I'm done with it. (Side note - not sure what workflow you use, but you know you can squash merge through the PR UI now...) @jamesls Ok, I'll add back the boolean. Sorry for my confusion. I'll do my best on the tests, but frankly, I'm horribly confused by how the testing is done in botocore, and it seems that Also, FWIW, if you both really want it removed I'll do it, but I think that knowing the max attempts value can be useful for some people, and isn't a lot of overhead to keep in place. For people who are having issues like me, with throttling/rate limiting, it's very useful information to log.
@JordonPhillips @jamesls I've added a test for the retry errors list, though it's not exactly the style of the other tests. Apologies, but I spent the better part of the last hour trying to make sense of how the rest of the endpoint tests are written, but it's really outside my experience. If this isn't sufficient, I'll rework it but need some guidance. If this is good, I'll squash down all the commits.
@jantman I think that max retries is only really useful when in development. I'm not so sure that it's useful during normal execution. What is your use case?
@JordonPhillips My use case was based on #882 / #891 and the assumption that at some point the number of retries will be configurable by the client. At my organization, we have a relatively small number of AWS accounts, and a LOT of applications deployed in them. During the day, it's not unusual to have dozens, perhaps hundreds, of people or applications making API requests simultaneously. I feel like it's pretty likely, at least at my organization and other heavy AWS users, that (once #882 is fixed) I'd probably make it normal practice to log (albeit at debug-level) the number of failures and the maximum number of retries, for use when debugging issues like this. If you think it's too internal/too deep in the code, I can remove it and update the tests.
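As a sketch of the debug-level logging described in that use case, assuming retry details end up in `ResponseMetadata` (the `RetryAttempts` and `MaxAttemptsReached` key names are illustrative of what is being discussed, not a confirmed interface):

```python
import logging

import botocore.session
from botocore.exceptions import ClientError

logger = logging.getLogger(__name__)
client = botocore.session.get_session().create_client('ec2', region_name='us-east-1')

try:
    client.describe_instances()
except ClientError as err:
    meta = err.response.get('ResponseMetadata', {})
    # Record how many retries happened before giving up; handy when
    # diagnosing throttling across many concurrent applications.
    logger.debug('call failed after %s retries (max attempts reached: %s)',
                 meta.get('RetryAttempts'), meta.get('MaxAttemptsReached'))
    raise
```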
@JordonPhillips @jamesls Do you feel that I should remove the max retries?
@jantman I'm in favor of removing it until it is configurable.
Will do. I'll update the PR later today.
@JordonPhillips max retries has been removed. If everyone's good with this, I'll squash it down to one commit and re-push.
@JordonPhillips @jamesls is this good to squash down?
@jamesls @JordonPhillips ping?
Force-pushed 935c3a2 to b44d22b
@jamesls @JordonPhillips ok, I've re-pushed this branch squashed down to one commit. If needed, the full history is still intact at https://github.com/jantman/botocore/tree/issue926_retry_info_unsquashed
@jamesls @JordonPhillips It's been a month and a half... can I please get some feedback on the status of this PR?
Sorry for the delay, looking at this now.
Spoke with @kyleknap and @JordonPhillips about this and there's a few things worth noting:
For people that want more insight into what's being retried, I think it makes sense to update our events so that people can write custom event handlers that can track retry attempts however they see fit, but that would be a separate pull request. So to summarize:
If those changes seem reasonable, I'll go ahead and pull in your changes, make the updates above, and resend the pull request. Does that seem reasonable?
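As a rough illustration of the event-handler idea above: botocore already emits `needs-retry` events while deciding whether to retry, so a custom handler could track attempts itself. The keyword arguments used here are a sketch of that mechanism, not a guaranteed interface:

```python
import botocore.session

retry_decisions = []

def track_retry_decisions(attempts=None, caught_exception=None, **kwargs):
    # Record every point where botocore considered retrying.  Returning
    # None leaves the default retry behaviour untouched.
    retry_decisions.append({'attempt': attempts, 'exception': caught_exception})

session = botocore.session.get_session()
# Registering on the 'needs-retry' prefix catches the per-operation events
# such as 'needs-retry.s3.ListBuckets'.
session.register('needs-retry', track_retry_decisions)

client = session.create_client('s3', region_name='us-east-1')
client.list_buckets()
print(retry_decisions)
```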
@jamesls Apologies for the delay on my end, I was out of the office today. While I think the information could be valuable, I'll agree that the list of retries in response metadata could potentially add a lot to that dict, for an edge case. I suppose, if I take a moment to think about it, that while my edge case would find it useful, I can understand a maintainer of boto not wanting to include that. This all sounds reasonable to me. While my initial concern had been to aid in tuning max retries (i.e. via a Unless you get to it first, I'll make those changes and push back up sometime tomorrow morning or afternoon GMT-4.
Good to hear. Thanks again for the pull request.
This is my attempt at fixing #926 by providing information on retries in ResponseMetadata, which is returned as part of the exceptions that are raised to the user. An example of this in action:
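For illustration, a minimal sketch of what that usage might look like, assuming the retry details are attached to the error's `ResponseMetadata` (the exact keys depend on the final shape of this change):

```python
import botocore.session
from botocore.exceptions import ClientError

client = botocore.session.get_session().create_client('s3', region_name='us-east-1')

try:
    client.head_bucket(Bucket='a-bucket-that-does-not-exist')
except ClientError as err:
    # The retry details discussed in this PR would appear alongside the
    # usual RequestId/HTTPStatusCode entries in ResponseMetadata.
    print(err.response['ResponseMetadata'])
```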