
Optimize thread_worker #13056

Merged: 1 commit into redpanda-data:dev from optimize-thread-worker on Aug 31, 2023

Conversation
Conversation

@rockwotj (Contributor) commented Aug 28, 2023

Use a ring buffer to bulk-pass tasks around, and also support creating a twin worker thread for every reactor thread.

This amortizes the cost of running a task on an alien thread. It came about because we're going to be switching transforms over to Wasmtime running on an alien thread.
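For concreteness, here is a minimal sketch of the general shape, assuming a bounded single-producer/single-consumer ring; the type and names below are illustrative, not the actual ssx::thread_worker code. The reactor thread pushes tasks with no per-task allocation, and the twin worker thread drains a whole batch per wakeup, which is what amortizes the alien-thread handoff:

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <functional>

constexpr size_t queue_capacity = 128; // matches the cap discussed below

// Illustrative SPSC ring: one producer (reactor thread), one consumer
// (the twin worker thread). One slot is left empty to tell full from empty.
class spsc_task_ring {
public:
    // Called from the reactor thread. Returns false when the ring is full,
    // at which point the caller must retry or apply backpressure.
    bool try_push(std::function<void()> task) {
        size_t head = _head.load(std::memory_order_relaxed);
        size_t next = (head + 1) % queue_capacity;
        if (next == _tail.load(std::memory_order_acquire)) {
            return false; // full
        }
        _slots[head] = std::move(task);
        _head.store(next, std::memory_order_release);
        return true;
    }

    // Called from the worker thread. Runs everything currently queued,
    // amortizing a single wakeup across the whole batch.
    size_t drain() {
        size_t ran = 0;
        size_t tail = _tail.load(std::memory_order_relaxed);
        while (tail != _head.load(std::memory_order_acquire)) {
            _slots[tail]();
            _slots[tail] = nullptr;
            tail = (tail + 1) % queue_capacity;
            ++ran;
        }
        _tail.store(tail, std::memory_order_release);
        return ran;
    }

private:
    std::array<std::function<void()>, queue_capacity> _slots;
    std::atomic<size_t> _head{0}; // producer index
    std::atomic<size_t> _tail{0}; // consumer index
};
```

With one twin worker per reactor thread, each ring stays strictly single-producer/single-consumer, so the hot path needs no locks.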

Before:

```
test                                      iterations      median         mad         min         max      allocs       tasks        inst
thread_worker_test.1                            3354   216.749us    47.062ns   216.590us   216.951us       5.000       4.000   1610844.2
thread_worker_test.10                            460     2.088ms   870.335ns     2.087ms     2.089ms      70.000      58.000  25391981.4
thread_worker_test.100                            61    16.465ms     1.733us    16.461ms    16.469ms     700.000     598.000 208106393.1
thread_worker_test.1000                           50    19.925ms     6.187us    19.919ms    19.937ms    7007.000    5998.000 214776807.2
```

After:

```
test                                      iterations      median         mad         min         max      allocs       tasks        inst
thread_worker_test.1                            1855   216.515us    77.552ns   216.330us   216.593us       4.000       4.000   1549639.0
thread_worker_test.10                            413     2.085ms   394.971ns     2.084ms     2.086ms      40.000      40.000  24921287.6
thread_worker_test.100                           113     8.518ms     3.891us     8.513ms     8.524ms     400.000     400.000 106121583.5
thread_worker_test.1000                           93    10.424ms   605.968ns    10.423ms    10.426ms    4880.000    5744.000 127114775.7
```

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.2.x
  • v23.1.x
  • v22.3.x

Release Notes

  • none

@rockwotj rockwotj marked this pull request as ready for review August 29, 2023 01:30
@rockwotj rockwotj force-pushed the optimize-thread-worker branch 2 times, most recently from dfab236 to 9e3490d Compare August 29, 2023 14:01
@rockwotj rockwotj requested a review from BenPope August 29, 2023 14:02
@dotnwat (Member) left a comment:

lgtm

@dotnwat previously approved these changes Aug 30, 2023
@BenPope (Member) left a comment:

Looks great.

I wonder if it makes sense to add an abort_source, or fail pending requests on stop()?

@rockwotj (Contributor, Author) commented:
> I wonder if it makes sense to add an abort_source, or fail pending requests on stop()?

I'm not really sure how to do this correctly, as I don't think seastar classes are designed to support work outside of seastar, unless there's a different technique you're suggesting.

Same goes for failing pending work on stop(): I'm not sure how to tell the alien thread that it needs to stop. Maybe with an atomic boolean it checks before every task? Thoughts?
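A minimal sketch of that atomic-boolean idea, for concreteness; every name here (task, run_one, request_stop) is hypothetical, not the actual ssx::thread_worker API:

```cpp
#include <atomic>
#include <exception>
#include <functional>
#include <stdexcept>

// Hypothetical task shape: a callable to run, plus a way to report
// failure back to the submitting reactor thread.
struct task {
    std::function<void()> run;
    std::function<void(std::exception_ptr)> fail;
};

std::atomic<bool> stopping{false};

// Worker thread: consult the flag before executing each dequeued task,
// so anything still queued at shutdown is failed instead of run.
void run_one(task& t) {
    if (stopping.load(std::memory_order_acquire)) {
        t.fail(std::make_exception_ptr(
          std::runtime_error("thread_worker stopped")));
        return;
    }
    t.run();
}

// Reactor side, from stop(): flip the flag, then wake the worker so it
// drains (and fails) whatever remains in the queue.
void request_stop() {
    stopping.store(true, std::memory_order_release);
    // ...signal the worker's wakeup mechanism here...
}
```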

@BenPope (Member) commented Aug 30, 2023:

> > I wonder if it makes sense to add an abort_source, or fail pending requests on stop()?
>
> I'm not really sure how to do this correctly, as I don't think seastar classes are designed to support work outside of seastar, unless there's a different technique you're suggesting.
>
> Same goes for failing pending work on stop(): I'm not sure how to tell the alien thread that it needs to stop. Maybe with an atomic boolean it checks before every task? Thoughts?

The stop signal was previously passed via the ss::writeable_eventfd; once stop is detected, I imagine it would be possible to call fail_tasks.

I'm not saying it has to be done; it's just a thought. Propagating an abort (somehow) might be more useful than just cancelling the work. Or it might just make sense to drain the queue; it probably depends on what kind of work it's running.

@rockwotj (Contributor, Author) commented:
> The stop signal was previously passed via the ss::writeable_eventfd; once stop is detected, I imagine it would be possible to call fail_tasks.

That's a little hard to do now that there are multiple signals in the normal submit path: from my reading of the man pages, signal adds values up and read drains the whole counter, so calling signal(1) twice means that you'll read 2 as the result.
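For reference, a standalone demo of those eventfd(2) counter semantics (assuming, as the name suggests, that ss::writeable_eventfd wraps a Linux eventfd):

```cpp
// Demo of eventfd(2) counter semantics: writes accumulate into the
// counter, and a single read drains the whole value at once.
#include <sys/eventfd.h>
#include <unistd.h>

#include <cstdint>
#include <cstdio>

int main() {
    int fd = eventfd(0, 0); // plain counter mode, no EFD_SEMAPHORE
    uint64_t one = 1;
    (void)write(fd, &one, sizeof(one)); // like signal(1)
    (void)write(fd, &one, sizeof(one)); // signal(1) again
    uint64_t value = 0;
    (void)read(fd, &value, sizeof(value)); // drains the counter
    std::printf("read %llu\n", static_cast<unsigned long long>(value));
    // prints "read 2": both signals were folded into one read
    close(fd);
}
```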

> it probably depends on what kind of work it's running.

Agreed. I don't know about kerberos, but for wasm, cancelling does make sense; that will have already happened by then, though, since the transform processor needs to shut down before wasm does.

@BenPope (Member) commented Aug 31, 2023:

> > it probably depends on what kind of work it's running.
>
> Agreed. I don't know about kerberos, but for wasm, cancelling does make sense; that will have already happened by then, though, since the transform processor needs to shut down before wasm does.

Kerberos is usually just reading a local file, so I/O, but not too bad, and unlikely to build a large backlog.

@BenPope (Member) left a comment:
LGTM; let's keep an eye on the queue length at shutdown. Should there be metrics for queue length, time in queue, etc.?

Feel free to add a followup issue.

@rockwotj (Contributor, Author) commented:
CI Failures: #12120 and #12659

@rockwotj (Contributor, Author) commented:
> LGTM; let's keep an eye on the queue length at shutdown.

I added a follow-up task for metrics. Also, the queue is limited to 128 items for now.

@michael-redpanda michael-redpanda self-assigned this Aug 31, 2023
@rockwotj rockwotj merged commit 603c9bc into redpanda-data:dev Aug 31, 2023
25 checks passed
@rockwotj rockwotj deleted the optimize-thread-worker branch August 31, 2023 18:28