HTTP 502 errors not causing exception and retry #1679

Closed
gefair opened this issue Dec 14, 2015 · 3 comments
Labels
bug This issue is a bug.

Comments

gefair commented Dec 14, 2015

I am trying to use the AWS CLI through a proxy, and things are generally working.

Occasionally, this proxy returns a 502 error, which does not get translated into an exception in the call to http_session.send() from _get_response() in botocore\endpoint.py. As a result, the retry mechanism never gets a chance to run.

It appears that a call to http_response.raise_for_status() needs to be made immediately after the send() call. However, even with that change in place, the resulting exception doesn't get retried, for reasons I don't fully understand: the exception ends up getting re-thrown in ExceptionRaiser._check_caught_exception() in botocore\retryhandler.py.

Am I doing something wrong? Does my raise_for_status() fix make sense, and if so, what additional code changes or config edits are needed so that the HTTPStatusCodeChecker triggers a retry?

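For reference, here is a minimal, self-contained sketch of the behavior I'm suggesting. The fake_502 object is constructed by hand purely for illustration; this is not botocore's actual _get_response() code:

import requests

# Hand-built response standing in for what the proxy returns -- illustration only.
fake_502 = requests.Response()
fake_502.status_code = 502
fake_502.reason = 'Bad Gateway'
fake_502.url = 'https://s3.amazonaws.com/'

try:
    # This is the kind of check I'd like to see right after http_session.send().
    fake_502.raise_for_status()
except requests.HTTPError as exc:
    # If this exception were handed to the retry handler, the 502 could be
    # retried like other server errors.
    print('would be retried:', exc)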
gefair changed the title from "HTTP errors not causing exception" to "HTTP 502 errors not causing exception and retry" on Dec 14, 2015
rayluo (Contributor) commented Dec 14, 2015

Issue confirmed. It happens because the CLI only retries on these HTTP status codes: 400, 500, 503, 509.
A quick workaround is to add the 502 status to your botocore/data/_retry.json like this:

$ git diff
diff --git a/botocore/data/_retry.json b/botocore/data/_retry.json
index 90eae9b..d0a94ba 100644
--- a/botocore/data/_retry.json
+++ b/botocore/data/_retry.json
@@ -28,6 +28,13 @@
         }
       }
     },
+    "bad_gateway": {
+      "applies_when": {
+        "response": {
+          "http_status_code": 502
+        }
+      }
+    },
     "service_unavailable": {
       "applies_when": {
         "response": {
@@ -54,6 +61,7 @@
       "policies": {
           "general_socket_errors": {"$ref": "general_socket_errors"},
           "general_server_error": {"$ref": "general_server_error"},
+          "bad_gateway": {"$ref": "bad_gateway"},
           "service_unavailable": {"$ref": "service_unavailable"},
           "limit_exceeded": {"$ref": "limit_exceeded"},
           "throttling_exception": {"$ref": "throttling_exception"},

We may consider adding this to our code base too.

gefair (Author) commented Dec 15, 2015

Thank you for the fast response.

This fix worked with one code change. I started with a clean enlistment (I reverted the http_response.raise_for_status() edit I mentioned in my original post). The one code change I did need was to change "except ResponseParserError as e:" to "except Exception as e:" on line 687 of parsers.py. The reason for this change is that the 502 response was accompanied by an XML-looking HTML response body from our proxy.

I'm including the response body and the resulting stack trace. I'm not sure whether you'd prefer to fix it the way I did, or add a separate except block as a catch-all.

502 error from proxy.txt
parser stack trace.txt
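To illustrate why the broader except clause matters, here is a small standalone sketch (not the actual parsers.py code; parse_error_body and the sample body are made up). The proxy's HTML error page is not valid AWS XML, so a narrowly scoped except never gets a chance to handle the failure:

import xml.etree.ElementTree as ET

def parse_error_body(status_code, body):
    """Return an AWS-style error dict, falling back to the HTTP status
    when the body is not parseable XML (e.g. a proxy's HTML error page)."""
    try:
        root = ET.fromstring(body)
        return {'Error': {'Code': root.findtext('.//Code', ''),
                          'Message': root.findtext('.//Message', '')}}
    except Exception:
        # A narrow "except ResponseParserError" would miss this ParseError,
        # which is roughly what happened with our proxy's 502 page.
        return {'Error': {'Code': str(status_code),
                          'Message': 'Unparsable error body'}}

# The &nbsp; entity makes this HTML invalid as XML, so the fallback path runs.
print(parse_error_body(502, '<html><body>Bad&nbsp;Gateway</body></html>'))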

Lastly, I'd like to point out that when errors like these occur, the multipart upload gets orphaned. This isn't obvious to the user, who still gets billed for the uploaded, pending parts. It seems the AWS CLI should abort the multipart upload on any failure.

Also, some way to view and clear pending multipart uploads from the AWS CLI would be a great feature.

Shall I open a new issue for this?
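For what it's worth, here is a rough sketch of how the orphaned uploads can be cleaned up by hand today with the low-level botocore client (the bucket name 'my-bucket' is a placeholder):

import botocore.session

s3 = botocore.session.get_session().create_client('s3')

# List any multipart uploads that were started but never completed or aborted.
pending = s3.list_multipart_uploads(Bucket='my-bucket').get('Uploads', [])
for upload in pending:
    print('aborting', upload['Key'], upload['UploadId'])
    # Aborting frees the stored parts so they stop accruing charges.
    s3.abort_multipart_upload(Bucket='my-bucket',
                              Key=upload['Key'],
                              UploadId=upload['UploadId'])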

jamesls added the bug label and removed the confirmed label on Mar 24, 2016
jamesls added a commit to jamesls/botocore that referenced this issue on Mar 24, 2016:
"Last part of the fix required for aws/aws-cli#1679."
jamesls (Member) commented Mar 24, 2016

I believe the two PRs referenced above should fix these issues. Closing this out; let us know if you're still seeing problems.

jamesls closed this as completed on Mar 24, 2016