-
Notifications
You must be signed in to change notification settings - Fork 589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cloud_storage: Handle nested benign exception #10909
cloud_storage: Handle nested benign exception #10909
Conversation
57706bf
to
6e20786
Compare
ci failure is #10497 |
I understand that the original failure was seastar::abort_requested_exception throw while handling seastar::abort_requested_exception, is gate_closed_exception expected to happen in the same way, or is it just future proofing? |
If a nested exception is thrown where either the inner or outer exception is abort requested or gate closed, this is logged as a debug level error. If the nested exception is of any other kind, an error level log is generated.
6e20786
to
69b4c69
Compare
Mostly future proofing, yes. I assumed a nested gate closed can happen in the same way as a nested abort can happen and will need to be handled the same way. |
@@ -67,6 +88,14 @@ error_outcome handle_client_transport_error( | |||
} catch (const ss::abort_requested_exception&) { | |||
vlog(logger.debug, "Abort requested"); | |||
throw; | |||
} catch (const ss::nested_exception& ex) { | |||
if (has_abort_or_gate_close_exception(ex)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, do you know where this was being thrown? Just want to make sure we need to worry about unwrapping the other exception types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also curious about this. is this something we all need to be on the look out for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This specific instance was during a put object request handled by the s3 client:
WARN 2023-05-11 22:30:33,807 [shard 0] s3 - s3_client.cc:681 - S3 request failed with error: seastar::nested_exception: seastar::abort_requested_exception (abort requested) (while cleaning up after seastar::abort_requested_exception (abort requested))
ERROR 2023-05-11 22:30:33,807 [shard 0] s3 - util.cc:71 - Unexpected error seastar::nested_exception: seastar::abort_requested_exception (abort requested) (while cleaning up after seastar::abort_requested_exception (abort requested))
In the same time frame there were several other put requests which seem to have failed but the error logging correctly logged them as DEBUG but the nested instance got logged as error:
[abhijat@fedora Downloads]$ grep -E 'util.*(Abort r|abort_r)' r.log
DEBUG 2023-05-11 22:30:33,796 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,796 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,801 [shard 0] s3 - util.cc:68 - Abort requested
ERROR 2023-05-11 22:30:33,807 [shard 0] s3 - util.cc:71 - Unexpected error seastar::nested_exception: seastar::abort_requested_exception (abort requested) (while cleaning up after seastar::abort_requested_exception (abort requested))
DEBUG 2023-05-11 22:30:33,812 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,817 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,825 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,825 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,825 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,832 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,838 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,838 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,842 [shard 0] s3 - util.cc:68 - Abort requested
AFAIK the issue here is how we classify and log exceptions in handle_client_transport_error
so it should be enough to handle this specifically in this function?
Just want to make sure we need to worry about unwrapping the other exception types
Could you expand on this? Do you mean we should be handling the nested types differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just want to make sure we need to worry about unwrapping the other exception types
Could you expand on this? Do you mean we should be handling the nested types differently?
@abhijat thanks for the response. to clarify my question: might we have other places where a nested exception is being thrown but not handled correctly by unwrapping its inner/outer components.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dotnwat initially as mentioned in my comment here I assumed that we only need to handle this in the cloud storage client where we classify log levels, but now thinking more about it, anywhere we handle abort requested or gate closed exceptions differently from general exceptions, we may also need to examine nested exceptions to see if they should be classified as abort requested/gate closed.
This pattern is in a few places in the tree (for example, searching for handle_exception_type([](const ss::abort_requested_exception
yields a few results) so I suspect there are more places where we should be unwrapping nested exceptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙏
/ci-repeat |
/backport v23.1.x |
If a nested exception is seen during an HTTP call where either the inner or outer exception is abort or gate closed, then the exception is logged on debug level, to be consistent with how these exception types (abort/gate closed) are handled in isolation.
If both the components of the nested exception are of any other kind, an error is logged.
Fixes #10706
Backports Required
Release Notes