Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize send path error handling #12472

Merged

Conversation

rockwotj
Copy link
Contributor

Travis mentioned in slack that ssx::handle_shutdown_errors was showing up in profiles.

This attempts to address that problem by creating less continuations when dispatching sends in the rpc transport layer. Additionally, switch ssx::handle_shutdown_errors to be a corountine.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.2.x
  • v23.1.x
  • v22.3.x

Release Notes

  • none

Create less of a continuation chain by handling all the exceptions in a
single coroutine.

Signed-off-by: Tyler Rockwood <[email protected]>
@rockwotj
Copy link
Contributor Author

rockwotj commented Aug 8, 2023

CI Failure was #12291 which is now fixed.

@rockwotj
Copy link
Contributor Author

rockwotj commented Aug 8, 2023

/ci-repeat 1

@rockwotj
Copy link
Contributor Author

rockwotj commented Aug 8, 2023

.handle_exception_type([](const seastar::broken_promise&) {})
.handle_exception_type([](const seastar::broken_condition_variable&) {});
try {
co_await std::move(fut);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems good since handle_exception_type is relatively expensive whether an exception occurs or not, but we this transformation means we only take the exception matching cost when an exception is actually thrown.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that was the intent of this transformation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my reading of https://github.com/redpanda-data/seastar/blob/245e0ccfa6d58d7e0dca2b4034ce1bc43e39bdc5/include/seastar/core/future.hh#L1858-L1862

It seems like before we had 5 continuations and 5 try catch blocks, and now we only have 1 of each (I think, I haven't fully wrapped my brain around how a compiler transforms coroutines). That was the intent here to cut down on both of those in a hope to lessen the cost of this function you noticed in the profiles.

travisdowns
travisdowns previously approved these changes Aug 8, 2023
Copy link
Member

@travisdowns travisdowns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 366 to 382
if (!_dispatch_gate.try_enter()) {
return;
}
ssx::background = ssx::ignore_shutdown_exceptions(do_dispatch_send())
.then_wrapped([this](ss::future<> fut) {
if (fut.failed()) {
vlog(
rpclog.info,
"Error dispatching socket write:{}",
fut.get_exception());
_probe->request_error();
fail_outstanding_futures();
}
// Now that we've handled errors it's safe to
// release the gate.
_dispatch_gate.leave();
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a slight preference to use raii gate holder rather than manual enter/leave. you can move that into the then_wrapped continuation as as capture, or dangle it off the end in a finally block capture.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion. dispatch_send shouldn't throw, which is why I manually used try_enter and leave, but it seems we still can check if the gate is closed before holding the gate open.

Remove the double dose of ssx::handle_shutdown_exceptions by inlining
the gate handling into a single place.

Remove the extra overhead of some lambdas and a couple of extra
continuations by handling unexpected errors and closing the gate in a
single continuation.

Signed-off-by: Tyler Rockwood <[email protected]>
@dotnwat dotnwat merged commit 7ff6bad into redpanda-data:dev Aug 15, 2023
25 checks passed
@vbotbuildovich
Copy link
Collaborator

/backport v23.2.x

@vbotbuildovich
Copy link
Collaborator

/backport v23.1.x

@vbotbuildovich
Copy link
Collaborator

/backport v22.3.x

@vbotbuildovich
Copy link
Collaborator

Failed to run cherry-pick command. I executed the commands below:

git checkout -b backport-pr-12472-v22.3.x-824 remotes/upstream/v22.3.x
git cherry-pick -x 9670361db3d5c8a514a1984512300d17fa1122ec 0b3fa8c05e5d4a88b3ba05a47f92ce679c556159

Workflow run logs.

@vbotbuildovich
Copy link
Collaborator

Failed to run cherry-pick command. I executed the commands below:

git checkout -b backport-pr-12472-v23.1.x-559 remotes/upstream/v23.1.x
git cherry-pick -x 9670361db3d5c8a514a1984512300d17fa1122ec 0b3fa8c05e5d4a88b3ba05a47f92ce679c556159

Workflow run logs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants