503s from S3Client.headObject() are not correctly identified as throttled exceptions #5414

Open
1 task done
drewschleit opened this issue Jul 19, 2024 · 1 comment
Labels
bug This issue is a bug. needs-review This issue or PR needs review from the team. p2 This is a standard priority issue

Comments

@drewschleit

Upcoming End-of-Support

  • I acknowledge the upcoming end-of-support for AWS SDK for Java v1 was announced, and migration to AWS SDK for Java v2 is recommended.

Describe the bug

When a call to S3Client.headObject() fails with a 503 Slow Down error, I observe that for the resulting exception, S3Exception.isThrottlingException() returns false. For 503 failures with other APIs such as .getObject(), it returns true.

The isThrottlingException() method is used as part of the retry strategy: when set to LEGACY mode, throttling exceptions do not consume from the token bucket. The impact of this bug is that, even when setting a high number of retries to persistently retry throttled exceptions (with appropriate backoff settings, of course), I still see frequent failures after only a few retries due to token bucket exhaustion.
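To make the token bucket effect concrete, here is a minimal toy model, not the SDK's implementation. The class name, the bucket capacity of 500, and the per-retry cost of 5 are illustrative assumptions; the only behavior taken from the description above is that retries classified as throttling cost nothing.

```java
/**
 * Toy model of a retry token bucket. The capacity and costs are
 * illustrative assumptions, not values read from the SDK.
 */
final class RetryTokenBucketSketch {
    private int tokens;
    private final int throttlingCost;
    private final int otherCost;

    RetryTokenBucketSketch(int capacity, int throttlingCost, int otherCost) {
        this.tokens = capacity;
        this.throttlingCost = throttlingCost;
        this.otherCost = otherCost;
    }

    /** Returns true if a retry is allowed, deducting its cost from the bucket. */
    boolean tryAcquire(boolean classifiedAsThrottling) {
        int cost = classifiedAsThrottling ? throttlingCost : otherCost;
        if (cost > tokens) {
            return false;
        }
        tokens -= cost;
        return true;
    }

    /** Counts how many consecutive retries a fresh bucket grants, up to maxAttempts. */
    static int retriesGranted(int capacity, boolean classifiedAsThrottling, int maxAttempts) {
        // Assumed costs: 0 for throttling retries, 5 for everything else.
        RetryTokenBucketSketch bucket = new RetryTokenBucketSketch(capacity, 0, 5);
        int granted = 0;
        while (granted < maxAttempts && bucket.tryAcquire(classifiedAsThrottling)) {
            granted++;
        }
        return granted;
    }

    public static void main(String[] args) {
        // 503s recognized as throttling never drain the bucket.
        System.out.println(retriesGranted(500, true, 1000));
        // 503s misclassified as non-throttling drain it after capacity / cost retries.
        System.out.println(retriesGranted(500, false, 1000));
    }
}
```

Under these assumed numbers, the misclassified case stops granting retries once the bucket is empty, no matter how high the configured maximum is. If the bucket is shared by all requests on a client, a burst of misclassified 503s from headObject() would exhaust it quickly.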

In my particular use case, I'm running an Apache Iceberg workload that issues a large number of headObject() requests, and the job fails due to retry exhaustion even though I configured a high maximum number of retries. I imagine other big data workloads that use this API extensively could see the same behavior.

Expected Behavior

When a call to S3Client.headObject() fails with a 503 Slow Down error, I expect that S3Exception.isThrottlingException() returns true.

Current Behavior

When a call to S3Client.headObject() fails with a 503 Slow Down error, I observe that S3Exception.isThrottlingException() returns false.

Reproduction Steps

I have only reproduced this in my at-scale application, which makes a large number of requests to S3.

After enabling wire logging, I observe that S3's raw response is as follows. I imagine that the issue can be reproduced by mocking this response.

24/07/19 21:37:04 DEBUG wire: http-outgoing-2925 << "HTTP/1.1 503 Slow Down[\r][\n]"
24/07/19 21:37:04 DEBUG wire: http-outgoing-2925 << "x-amz-request-id: DNJ0YBW4S9H9X8DP[\r][\n]"
24/07/19 21:37:04 DEBUG wire: http-outgoing-2925 << "x-amz-id-2: vkQITUSJv6LxBRzJkgy+5stqWmlS7+L/dW41DhlDmXStNpxtWBO+WRKYDSWhXo/C5YmDYmm0AaX/Cc532WgTWaaM4DB7d36a[\r][\n]"
24/07/19 21:37:04 DEBUG wire: http-outgoing-2925 << "Content-Type: application/xml[\r][\n]"
24/07/19 21:37:04 DEBUG wire: http-outgoing-2925 << "Date: Fri, 19 Jul 2024 21:37:03 GMT[\r][\n]"
24/07/19 21:37:04 DEBUG wire: http-outgoing-2925 << "Server: AmazonS3[\r][\n]"
24/07/19 21:37:04 DEBUG wire: http-outgoing-2925 << "Connection: close[\r][\n]"

Note that there is no XML body provided.
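One way to mock this response locally is a small HTTP server that answers every request with a 503 and no body. The sketch below uses only the JDK's built-in com.sun.net.httpserver package; the class name and the request ID value are hypothetical. The JDK server chooses its own reason phrase, so the status line will not say "Slow Down", and I have not verified this against the SDK.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;

/** Local stand-in for S3 that always answers with a bodyless 503 (hypothetical helper). */
final class BodylessThrottleServer {

    /** Starts the server on an ephemeral localhost port and returns it. */
    static HttpServer start() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            // Placeholder header values, loosely modeled on the wire log above.
            exchange.getResponseHeaders().add("x-amz-request-id", "EXAMPLEREQUESTID");
            exchange.getResponseHeaders().add("Content-Type", "application/xml");
            // A length of -1 tells the server to send no response body.
            exchange.sendResponseHeaders(503, -1);
            exchange.close();
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        HttpServer server = start();
        System.out.println("Listening on port " + server.getAddress().getPort());
    }
}
```

Pointing an S3Client at this server through an endpoint override and calling headObject() should then surface the same exception, assuming the client is configured for plain HTTP and path-style access.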

Here's what logging the exception looks like.
Code

// e is the S3Exception thrown by S3Client.headObject()
LOG.error(
    "Got service exception. Is throttling? {}. Error details: {}.",
    e.isThrottlingException(),
    e.awsErrorDetails(),
    e);

Log output

Got service exception. Is throttling? false. Error details: AwsErrorDetails(serviceName=S3). 
software.amazon.awssdk.services.s3.model.S3Exception: null (Service: S3, Status Code: 503, Request ID: DNJ9ZP64AW5HP5ZT, Extended Request ID: uBunIlZ0ytEiYNOyt7KND7OOpngDTjsSrYKveakQTyxO80MX0sHVOxLuu6jnbBSQlUq53yUkCiKPuknvXvOAW4ewXKhquD4P)

Here, notice that

  1. isThrottlingException() returned false
  2. AwsErrorDetails.errorCode wasn't printed, so it must be null
  3. The message at the start of the exception text is null; this is another field that's pulled from the error XML.

Possible Solution

For HEAD requests, S3 does not provide an error XML. I speculate that this is the problem. From a cursory reading of the code, it appears that the data source for AwsServiceException.isThrottlingException() is awsErrorDetails.errorCode(), and this field is derived from the error XML in AwsXmlErrorUnmarshaller.unmarshall(). To solve this problem, the implementation of AwsServiceException.isThrottlingException() would need to look at the HTTP status code when the error code was not provided.
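The proposed fallback can be sketched as a standalone function. This is not the SDK's code: the class name, the method, and the set of error codes are illustrative assumptions, and treating a bare 503 as throttling is the suggestion from this issue rather than confirmed SDK behavior.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Sketch of a throttling check that falls back to the HTTP status code (hypothetical helper). */
final class ThrottlingFallbackSketch {

    // Illustrative subset of throttling error codes; the SDK's real list may differ.
    private static final Set<String> THROTTLING_ERROR_CODES = new HashSet<>(Arrays.asList(
            "SlowDown", "Throttling", "ThrottlingException", "RequestLimitExceeded"));

    /**
     * Classifies an error as throttling. When the service sent an error code,
     * the code decides; when it did not (for example a bodyless HEAD response),
     * the HTTP status code is used instead.
     */
    static boolean isThrottling(String errorCode, int statusCode) {
        if (errorCode != null) {
            return THROTTLING_ERROR_CODES.contains(errorCode);
        }
        // Assumption from this issue: a 503 without an error code means "Slow Down".
        return statusCode == 503;
    }

    public static void main(String[] args) {
        System.out.println(isThrottling(null, 503));       // bodyless HEAD 503
        System.out.println(isThrottling("SlowDown", 503)); // 503 with error XML
        System.out.println(isThrottling(null, 404));       // bodyless HEAD 404
    }
}
```

Until the SDK handles this case, a caller could apply the same check on its side by passing e.awsErrorDetails().errorCode() and e.statusCode() from the caught S3Exception, though that would not change how the SDK's own retry logic counts tokens.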

Additional Information/Context

No response

AWS Java SDK version used

2.22.12

JDK version used

8

Operating System and version

EMR Serverless

@drewschleit drewschleit added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jul 19, 2024
@debora-ito
Member

Moving to the Java SDK 2.x repo.

@debora-ito debora-ito transferred this issue from aws/aws-sdk-java Jul 19, 2024
@debora-ito debora-ito added the p2 This is a standard priority issue label Jul 19, 2024
@debora-ito debora-ito added needs-review This issue or PR needs review from the team. and removed needs-triage This issue or PR still needs to be triaged. labels Aug 2, 2024