
kvcoord: Eliminate 1 Go routine from MuxRangeFeed #96756

Merged
merged 3 commits on Mar 14, 2023

Conversation

miretskiy
Contributor

@miretskiy miretskiy commented Feb 7, 2023

Prior to this PR, the server-side MuxRangeFeed
implementation spawned a separate goroutine executing a
single RangeFeed for each incoming request.

This is wasteful and unnecessary.
Instead of blocking and waiting for a single RangeFeed to complete,
have rangefeed-related functions return a promise to deliver
a *roachpb.Error once the rangefeed completes (future.Future[*roachpb.Error]).

Prior to this change, MuxRangeFeed would spin up 4 goroutines
per range. With this PR, the number is down to 3.
This improvement is particularly important when executing
rangefeeds against large tables (tens to hundreds of thousands of ranges).

Informs #96395
Epic: None

Release note (enterprise change): Changefeeds running with the
changefeed.mux_rangefeed.enabled setting set to true are
more efficient, particularly when executing against large tables.
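
To make the structural change concrete, here is a minimal, hypothetical Go sketch (not the actual CockroachDB code; `errFuture`, `WhenCompleted`, `runRangeFeedBefore`/`runRangeFeedAfter`, and the simulated rangefeed bodies are illustrative stand-ins for `future.Future[*roachpb.Error]` and the real machinery). The point is that the "before" variant parks a dedicated goroutine per request just to forward the terminal error, while the "after" variant registers a completion callback on a future that the rangefeed's own goroutine invokes:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// errFuture is a hypothetical stand-in for future.Future[*roachpb.Error]:
// it is completed exactly once, and callbacks registered via WhenCompleted
// run on the goroutine that completes it.
type errFuture struct {
	mu        sync.Mutex
	completed bool
	err       error
	callbacks []func(error)
}

func (f *errFuture) WhenCompleted(fn func(error)) {
	f.mu.Lock()
	if f.completed {
		err := f.err
		f.mu.Unlock()
		fn(err)
		return
	}
	f.callbacks = append(f.callbacks, fn)
	f.mu.Unlock()
}

func (f *errFuture) Complete(err error) {
	f.mu.Lock()
	f.completed, f.err = true, err
	cbs := f.callbacks
	f.callbacks = nil
	f.mu.Unlock()
	for _, fn := range cbs {
		fn(err)
	}
}

// Before: one goroutine per incoming request, parked only to forward the
// rangefeed's terminal error.
func runRangeFeedBefore(rangeID int, forward func(error)) {
	errC := make(chan error, 1)
	go func() { // existing rangefeed machinery, simulated
		time.Sleep(10 * time.Millisecond)
		errC <- nil
	}()
	go func() {
		forward(<-errC) // this goroutine exists only to wait
	}()
}

// After: the rangefeed hands back a future immediately; the forwarding
// callback runs on the goroutine that completes the rangefeed, so no
// dedicated waiter goroutine is needed.
func runRangeFeedAfter(rangeID int, forward func(error)) {
	f := &errFuture{}
	go func() { // existing rangefeed machinery, simulated
		time.Sleep(10 * time.Millisecond)
		f.Complete(nil)
	}()
	f.WhenCompleted(forward)
}

func main() {
	var wg sync.WaitGroup
	wg.Add(2)
	runRangeFeedBefore(1, func(err error) { fmt.Println("before:", err); wg.Done() })
	runRangeFeedAfter(2, func(err error) { fmt.Println("after:", err); wg.Done() })
	wg.Wait()
}
```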

@miretskiy miretskiy requested review from ajwerner, tbg, irfansharif and a team February 7, 2023 23:34
@miretskiy miretskiy requested review from a team as code owners February 7, 2023 23:34

Contributor

@andreimatei andreimatei left a comment


-- commits line 31 at r2:
drive by: I don't think this is a release note that users can understand :P

@miretskiy miretskiy force-pushed the future branch 2 times, most recently from 8adcdfe to 4e2ce15 on February 8, 2023 01:14
@miretskiy
Contributor Author


> -- commits line 31 at r2: drive by: I don't think this is a release note that users can understand :P

rephrased.

@miretskiy miretskiy force-pushed the future branch 2 times, most recently from 2c64887 to 5aff403 on February 8, 2023 02:07
@erikgrinaker
Contributor

I'm curious if we measured the impact of this change? I didn't think these goroutines would be considered for scheduling until the channel send, so I'm curious where the cost comes from.

@miretskiy
Contributor Author

> I'm curious if we measured the impact of this change? I didn't think these goroutines would be considered for scheduling until the channel send, so I'm curious where the cost comes from.

I did a few tests: 75k ranges on a 5-node cluster (KV workload with 75k splits). Even without any workload, you could see the impact on the Go scheduler. You are right, of course, that nothing should happen until the send. But even without work happening on those ranges, each one still emitted checkpoints, one every 200ms.

@erikgrinaker
Contributor

> I'm curious if we measured the impact of this change? I didn't think these goroutines would be considered for scheduling until the channel send, so I'm curious where the cost comes from.
>
> I did a few tests: 75k ranges on a 5-node cluster (KV workload with 75k splits). Even without any workload, you could see the impact on the Go scheduler. You are right, of course, that nothing should happen until the send. But even without work happening on those ranges, each one still emitted checkpoints, one every 200ms.

Sure, but I'm specifically talking about the goroutines we're removing in this PR -- do they actually cost anything, or are they ~free? They're only waiting for the final error result from the registration, right, so they shouldn't be scheduled until the rangefeed terminates?

@miretskiy
Contributor Author

> Sure, but I'm specifically talking about the goroutines we're removing in this PR -- do they actually cost anything, or are they ~free? They're only waiting for the final error result from the registration, right, so they shouldn't be scheduled until the rangefeed terminates?

I can post some info while this is getting reviewed. Roughly free is not the same as free, especially since you have so many. I tested with another PR which removes one more goroutine on the client side, but that change is a lot more disruptive, so I pulled it out of this PR; I will get updated numbers.
Conceptually, I think this PR should help.
I'm not a Go scheduler expert; I do suspect that even though these goroutines are not active, all of them are in the runnable state; many goroutines in the runnable state will slow down the scheduler, since it has to work harder to select the next goroutine to run. Fewer goroutines is better, in my opinion, even if each one is roughly free.

@erikgrinaker
Contributor

> I'm not a Go scheduler expert; I do suspect that even though these goroutines are not active, all of them are in the runnable state; many goroutines in the runnable state will slow down the scheduler, since it has to work harder to select the next goroutine to run.

I don't think they are. They should be in the blocked state until someone sends on the channel, at which point the sending goroutine will mark them as runnable: https://codeburst.io/diving-deep-into-the-golang-channels-549fd4ed21a8.
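
For what it's worth, here is a tiny self-contained illustration of that parked state (not from this PR): goroutines blocked on a channel receive show up as `chan receive` in a stack dump rather than `runnable`, and only become runnable when a sender hands them a value.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	errC := make(chan error)
	for i := 0; i < 5; i++ {
		go func() {
			<-errC // parked until a sender completes the handoff
		}()
	}
	time.Sleep(100 * time.Millisecond) // give the goroutines time to park

	// Dump all goroutine stacks: the blocked ones are reported as
	// "[chan receive]", not "[runnable]". (The goroutines are never
	// woken here; the program simply exits after printing.)
	buf := make([]byte, 1<<16)
	n := runtime.Stack(buf, true)
	fmt.Printf("%s\n", buf[:n])
}
```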

@miretskiy
Contributor Author

> I'm not a Go scheduler expert; I do suspect that even though these goroutines are not active, all of them are in the runnable state; many goroutines in the runnable state will slow down the scheduler, since it has to work harder to select the next goroutine to run.
>
> I don't think they are. They should be in the blocked state until someone sends on the channel, at which point the sending goroutine will mark them as runnable: https://codeburst.io/diving-deep-into-the-golang-channels-549fd4ed21a8.

As I said, I'm not a Go scheduler expert. It still seems to me that using fewer resources is better, even if each resource is cheap.

@miretskiy
Contributor Author

I'll post some benches later

Member

@tbg tbg left a comment

Cursory first review only, to get my bearings. Thank you for giving this code some attention (I know REPL nominally owns it, so double thanks!). Generally speaking I think the introduction of the Future makes sense - even if this extra goroutine doesn't weigh heavily on the scheduler, it still seems like a useful pattern to introduce. I'm not sold, however, on the "Promise", which is also not used - I'd much prefer we removed that.

I was wondering if this "flattening" could complicate the work we expect we have to do regardless, to avoid the periodic "pulsing" of the goroutines via the closed timestamp interval. I don't think so, which is good.

Could you list the goroutines before and after in the commit message? I think that would be instructive for most readers including myself. I can see that we're saving a goroutine in rangeFeedWithRangeID (previously on <-errC). We have (*registration).outputLoop. We have the incoming client goroutine, which will now sit on the promise directly (skipping the <-errC). And then we have a goroutine in gRPC servicing the streaming RPC.

Is anything conceptually in the way of eliding outputLoop as well, by hooking it up directly to the registration? Instead of returning a Future to the client goroutine, we'd return - essentially - the *registration itself and task the client goroutine with servicing it.

Review thread on pkg/util/future/future.go
@tbg tbg mentioned this pull request Feb 13, 2023
tbg added a commit to tbg/cockroach that referenced this pull request Feb 13, 2023
Exploration that might be useful for cockroachdb#96756.
@tbg
Member

tbg commented Feb 13, 2023

Ah, I meant to say, curious to see the overhead as well. I played around with #97028

and got (on my M1 Mac)

go run ./pkg/cmd/goroutinepulser 100000 200ms
[...]
avg    p50    p75    p90    p99    p99.9  p99.99  pMax
6.87   6.52   9.65   12.57  16.69  19.90  21.89   31.46 [cum ms]

go run ./pkg/cmd/goroutinepulser 75000 200ms
[...]
avg    p50    p75    p90    p99    p99.9  p99.99  pMax
6.63   6.42   8.98   11.45  18.17  26.25  29.23   31.46 [cum ms]

so here a 25% reduction isn't dramatic (plus Erik's point about the goroutines that are being reduced never becoming runnable).

I'm curious to see how a "real" CockroachDB cluster will do.

@miretskiy
Contributor Author

> Cursory first review only, to get my bearings. Thank you for giving this code some attention (I know REPL nominally owns it, so double thanks!). Generally speaking I think the introduction of the Future makes sense - even if this extra goroutine doesn't weigh heavily on the scheduler, it still seems like a useful pattern to introduce. I'm not sold, however, on the "Promise", which is also not used - I'd much prefer we removed that.

I am actually using the promise quite extensively in the latest version I just pushed.

> I was wondering if this "flattening" could complicate the work we expect we have to do regardless, to avoid the periodic "pulsing" of the goroutines via the closed timestamp interval. I don't think so, which is good.

I don't think so; but I've been wrong before.

> Could you list the goroutines before and after in the commit message? I think that would be instructive for most readers including myself. I can see that we're saving a goroutine in rangeFeedWithRangeID (previously on <-errC). We have (*registration).outputLoop. We have the incoming client goroutine, which will now sit on the promise directly (skipping the <-errC). And then we have a goroutine in gRPC servicing the streaming RPC.

It only removes the goroutine started by MuxRangeFeed to run the underlying single rangefeed.

> Is anything conceptually in the way of eliding outputLoop as well, by hooking it up directly to the registration? Instead of returning a Future to the client goroutine, we'd return - essentially - the *registration itself and task the client goroutine with servicing it.

Yes, we should do that. We start off with 5 goroutines for a regular rangefeed; we are down to 4 with Mux; this PR brings it down to 3.
Then, rewriting the client portion of mux removes another goroutine per range.
And then finally, we drop the output loop.
Everything should move to O(num nodes) instead of O(num ranges).

@miretskiy
Contributor Author

@erikgrinaker @tbg
I got some benchmarks. As you correctly point out, @erikgrinaker, the goroutines this PR removes should be roughly free. And they are. But they are not entirely free. The setup is a 5-node cluster of n32-standard machines; the KV workload was initialized with 100k splits -- so 20k ranges per node. That's a bit too much, but it's good for this test to see the impact of those goroutines.

The biggest savings of this PR come from the startup costs:

[Screenshot: runnable goroutine count, master MuxRangeFeed vs. this PR]

You can clearly tell the difference in runnable count -- up to 160(!) on the left, which was running the master version of MuxRangeFeed, versus the right, which is running this change.

You can also (obviously) see the total number of goroutines go down from 400k to 300k:
[Screenshot: total goroutine count, 400k vs. 300k]

More interesting is the p99.99 impact on SQL latency:
[Screenshot: p99.99 SQL latency, master MuxRangeFeed vs. this PR]

So, to summarize this PR: it eliminates 1 goroutine from MuxRangeFeed. This goroutine used to be idle, so it was mostly free. However, at the start of a rangefeed that extra goroutine wants to run -- and it is this extra goroutine that would make latency worse, simply by creating more work for the Go scheduler.

To be very clear: there are other ways to solve this -- perhaps by putting some sort of rate limit on the creation of those extra goroutines. But again, I just don't see why we needed them in the first place.

@tbg
Member

tbg commented Feb 20, 2023

Thanks for running the experiments! Feel free to request a review once the conflicts have been rebased away and it's ready for an in-depth review.

@erikgrinaker
Contributor

Good point, spawning all of these in a loop will put a fair bit of load on the scheduler. I was mostly thinking about the individual rangefeed approach where the goroutine is already spawned by gRPC, not MuxRangeFeed where we don't need to spawn them in the first place. Let's get this rebased and wrapped up, and we'll review it. Thanks!

@miretskiy miretskiy force-pushed the future branch 3 times, most recently from 528ca82 to 1834760 on March 2, 2023 13:44
@miretskiy
Contributor Author

@tbg, @erikgrinaker -- all is green in the land of CI with all comments addressed.

@miretskiy miretskiy force-pushed the future branch 2 times, most recently from fb20260 to a90009a on March 2, 2023 22:27
@dhartunian dhartunian removed the request for review from a team March 6, 2023 16:03
Contributor

@erikgrinaker erikgrinaker left a comment

Basically LGTM, but let's resolve the comments first. Thanks a lot for taking this on!

Review threads on:
pkg/kv/kvserver/rangefeed/metrics.go
pkg/kv/kvserver/rangefeed/registry.go
pkg/util/future/future.go
pkg/util/future/future_test.go
pkg/server/node.go
pkg/kv/kvserver/stores.go
pkg/kv/kvserver/replica_rangefeed.go
Add a gauge metric to keep track of currently active registrations.

Epic: None

Release note: None
@miretskiy
Contributor Author

@erikgrinaker -- updates pushed. Comments addressed, hopefully.

@miretskiy
Contributor Author

miretskiy commented Mar 14, 2023 via email

@miretskiy
Contributor Author

bors r=erikgrinaker

@miretskiy
Contributor Author

bors r-

@craig
Contributor

craig bot commented Mar 14, 2023

Canceled.

Yevgeniy Miretskiy added 2 commits March 14, 2023 09:30
Add a library implementing a promise/future abstraction.
`Future[T]` describes a value of type T which will become
available in the future. The caller may then wait until
the future becomes available.

Release note: None
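
As a rough illustration of what such an abstraction can look like, here is a minimal generic sketch in Go; this is not the actual pkg/util/future API, and the names `Make`, `Set`, and `Get` are assumptions made for the example:

```go
package main

import (
	"fmt"
	"sync"
)

// Future holds a value of type T that is set exactly once and can be
// awaited by any number of callers.
type Future[T any] struct {
	once  sync.Once
	done  chan struct{}
	value T
}

// Make returns an empty, not-yet-completed Future.
func Make[T any]() *Future[T] {
	return &Future[T]{done: make(chan struct{})}
}

// Set completes the future; only the first call has any effect.
func (f *Future[T]) Set(v T) {
	f.once.Do(func() {
		f.value = v
		close(f.done)
	})
}

// Get blocks until the future has been completed, then returns its value.
func (f *Future[T]) Get() T {
	<-f.done
	return f.value
}

func main() {
	f := Make[error]()
	go f.Set(nil) // completed by whichever goroutine finishes the work
	fmt.Println("completed with:", f.Get())
}
```
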
Prior to this PR, the server-side `MuxRangeFeed`
implementation spawned a separate goroutine executing a
single RangeFeed for each incoming request.

This is wasteful and unnecessary.
Instead of blocking and waiting for a single RangeFeed to complete,
have rangefeed-related functions return a promise to deliver
a `*kvpb.Error` once the rangefeed completes (`future.Future[*kvpb.Error]`).

Prior to this change, MuxRangeFeed would spin up 4 goroutines
per range. With this PR, the number is down to 3.
This improvement is particularly important when executing
rangefeeds against large tables (tens to hundreds of thousands of ranges).

Informs cockroachdb#96395
Epic: None

Release note (enterprise change): Changefeeds running with the
`changefeed.mux_rangefeed.enabled` setting set to true are
more efficient, particularly when executing against large tables.
@miretskiy
Contributor Author

bors r+

@craig
Contributor

craig bot commented Mar 14, 2023

Build failed (retrying...):

@miretskiy
Contributor Author

bors r+

@craig
Contributor

craig bot commented Mar 14, 2023

Already running a review

@craig
Contributor

craig bot commented Mar 14, 2023

Build succeeded:
