Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cloud_storage: Handle nested benign exception #10909

Merged
merged 1 commit into from
Jun 2, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions src/v/cloud_storage_clients/tests/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ rp_test(
backend_detection_test.cc
s3_client_test.cc
xml_sax_parser_test.cc
exception_test.cc
DEFINITIONS BOOST_TEST_DYN_LINK
LIBRARIES v::seastar_testing_main Boost::unit_test_framework v::http v::cloud_storage_clients v::cloud_roles
ARGS "-- -c 1"
Expand Down
40 changes: 40 additions & 0 deletions src/v/cloud_storage_clients/tests/exception_test.cc
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
/*
* Copyright 2023 Redpanda Data, Inc.
*
* Licensed as a Redpanda Enterprise file under the Redpanda Community
* License (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
* https://github.com/redpanda-data/redpanda/blob/master/licenses/rcl.md
*/

#include "cloud_storage_clients/util.h"

#include <boost/test/unit_test.hpp>

BOOST_AUTO_TEST_CASE(test_nested_exception) {
{
auto inner = std::make_exception_ptr(ss::abort_requested_exception{});
auto outer = std::make_exception_ptr(ss::gate_closed_exception{});
ss::nested_exception ex{inner, outer};
BOOST_REQUIRE(
cloud_storage_clients::util::has_abort_or_gate_close_exception(ex));
}

{
auto inner = std::make_exception_ptr(std::bad_alloc{});
auto outer = std::make_exception_ptr(ss::gate_closed_exception{});
ss::nested_exception ex{inner, outer};
BOOST_REQUIRE(
cloud_storage_clients::util::has_abort_or_gate_close_exception(ex));
}

{
auto inner = std::make_exception_ptr(std::invalid_argument{""});
auto outer = std::make_exception_ptr(
ss::no_sharded_instance_exception{});
ss::nested_exception ex{inner, outer};
BOOST_REQUIRE(
!cloud_storage_clients::util::has_abort_or_gate_close_exception(ex));
}
}
29 changes: 29 additions & 0 deletions src/v/cloud_storage_clients/util.cc
Original file line number Diff line number Diff line change
Expand Up @@ -15,8 +15,29 @@

#include <boost/property_tree/xml_parser.hpp>

namespace {

bool is_abort_or_gate_close_exception(const std::exception_ptr& ex) {
try {
std::rethrow_exception(ex);
} catch (const ss::abort_requested_exception&) {
return true;
} catch (const ss::gate_closed_exception&) {
return true;
} catch (...) {
return false;
}
}

} // namespace

namespace cloud_storage_clients::util {

bool has_abort_or_gate_close_exception(const ss::nested_exception& ex) {
return is_abort_or_gate_close_exception(ex.inner)
|| is_abort_or_gate_close_exception(ex.outer);
}

error_outcome handle_client_transport_error(
std::exception_ptr current_exception, ss::logger& logger) {
auto outcome = error_outcome::retry;
Expand Down Expand Up @@ -67,6 +88,14 @@ error_outcome handle_client_transport_error(
} catch (const ss::abort_requested_exception&) {
vlog(logger.debug, "Abort requested");
throw;
} catch (const ss::nested_exception& ex) {
if (has_abort_or_gate_close_exception(ex)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, do you know where this was being thrown? Just want to make sure we need to worry about unwrapping the other exception types

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also curious about this. is this something we all need to be on the look out for?

Copy link
Contributor Author

@abhijat abhijat May 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This specific instance was during a put object request handled by the s3 client:

WARN  2023-05-11 22:30:33,807 [shard 0] s3 - s3_client.cc:681 - S3 request failed with error: seastar::nested_exception: seastar::abort_requested_exception (abort requested) (while cleaning up after seastar::abort_requested_exception (abort requested))
ERROR 2023-05-11 22:30:33,807 [shard 0] s3 - util.cc:71 - Unexpected error seastar::nested_exception: seastar::abort_requested_exception (abort requested) (while cleaning up after seastar::abort_requested_exception (abort requested))

In the same time frame there were several other put requests which seem to have failed but the error logging correctly logged them as DEBUG but the nested instance got logged as error:

[abhijat@fedora Downloads]$ grep -E 'util.*(Abort r|abort_r)' r.log
DEBUG 2023-05-11 22:30:33,796 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,796 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,801 [shard 0] s3 - util.cc:68 - Abort requested
ERROR 2023-05-11 22:30:33,807 [shard 0] s3 - util.cc:71 - Unexpected error seastar::nested_exception: seastar::abort_requested_exception (abort requested) (while cleaning up after seastar::abort_requested_exception (abort requested))
DEBUG 2023-05-11 22:30:33,812 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,817 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,825 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,825 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,825 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,832 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,838 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,838 [shard 0] s3 - util.cc:68 - Abort requested
DEBUG 2023-05-11 22:30:33,842 [shard 0] s3 - util.cc:68 - Abort requested

AFAIK the issue here is how we classify and log exceptions in handle_client_transport_error so it should be enough to handle this specifically in this function?

Just want to make sure we need to worry about unwrapping the other exception types

Could you expand on this? Do you mean we should be handling the nested types differently?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to make sure we need to worry about unwrapping the other exception types

Could you expand on this? Do you mean we should be handling the nested types differently?

@abhijat thanks for the response. to clarify my question: might we have other places where a nested exception is being thrown but not handled correctly by unwrapping its inner/outer components.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dotnwat initially as mentioned in my comment here I assumed that we only need to handle this in the cloud storage client where we classify log levels, but now thinking more about it, anywhere we handle abort requested or gate closed exceptions differently from general exceptions, we may also need to examine nested exceptions to see if they should be classified as abort requested/gate closed.

This pattern is in a few places in the tree (for example, searching for handle_exception_type([](const ss::abort_requested_exception yields a few results) so I suspect there are more places where we should be unwrapping nested exceptions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙏

vlog(logger.debug, "Nested abort or gate closed: {}", ex);
throw;
}

vlog(logger.error, "Unexpected error {}", std::current_exception());
outcome = error_outcome::fail;
} catch (...) {
vlog(logger.error, "Unexpected error {}", std::current_exception());
outcome = error_outcome::fail;
Expand Down
2 changes: 2 additions & 0 deletions src/v/cloud_storage_clients/util.h
Original file line number Diff line number Diff line change
Expand Up @@ -45,4 +45,6 @@ std::chrono::system_clock::time_point parse_timestamp(std::string_view sv);
void log_buffer_with_rate_limiting(
const char* msg, iobuf& buf, ss::logger& logger);

bool has_abort_or_gate_close_exception(const ss::nested_exception& ex);

} // namespace cloud_storage_clients::util