kvflowdispatch: move down dispatch mutex to reduce contention #110933

Open

aadityasondhi wants to merge 2 commits into master from 20230918.flow-conctrol-dispatch-perf

Conversation

aadityasondhi
Collaborator

In kv0/enc=false/nodes=3/cpu=96, we noticed mutex contention around the outbox map. This patch tries to alleviate that by moving the mutex down into each individual dispatch map (sharding by NodeID).

Informs: #104154.

Release note: None
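
For context, a minimal sketch of the sharded shape described above, using simplified stand-in types rather than the real roachpb/kvflowcontrolpb/syncutil ones (field layouts here are illustrative, not the exact types from the patch):

package kvflowdispatchsketch

import "sync"

// Stand-ins for roachpb.NodeID, the per-range dispatch key, and
// kvflowcontrolpb.RaftLogPosition.
type nodeID int32
type dispatchKey struct{ rangeID int64 }
type raftLogPosition struct{ term, index uint64 }

// dispatches holds pending dispatch entries for a single node. Each node
// carries its own mutex, so writers targeting different nodes no longer
// contend on a single Dispatch-wide lock.
type dispatches struct {
	mu struct {
		sync.Mutex
		items map[dispatchKey]raftLogPosition
	}
}

// Dispatch shards the old outbox map by NodeID. The outer mutex is only
// taken when a brand-new NodeID has to be added to the sync.Map.
type Dispatch struct {
	mu struct {
		sync.Mutex
		outbox sync.Map // nodeID -> *dispatches
	}
}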

@blathers-crl

blathers-crl bot commented Sep 19, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Member

This change is Reviewable

aadityasondhi force-pushed the 20230918.flow-conctrol-dispatch-perf branch 4 times, most recently from 5937a51 to 6dc6c3f on September 27, 2023 14:56
aadityasondhi force-pushed the 20230918.flow-conctrol-dispatch-perf branch 6 times, most recently from 08950d3 to 0999d98 on October 2, 2023 17:48
aadityasondhi marked this pull request as ready for review on October 2, 2023 19:57
aadityasondhi requested a review from a team as a code owner on October 2, 2023 19:57
Collaborator Author

@aadityasondhi aadityasondhi left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)


-- commits line 2 at r1:
Results of experiments can be found here: https://docs.google.com/document/d/1v7TxSctuHlc7VWq3YJm79HjTlpydFCcQvYTBMSr5qmI/edit.

It seems to alleviate significant mutex contention.

Collaborator

@sumeerbhola sumeerbhola left a comment


Reviewed all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi)


pkg/kv/kvserver/kvflowcontrol/kvflowdispatch/kvflowdispatch.go line 126 at r1 (raw file):

	if !found || existing.Less(entries.UpToRaftLogPosition) {
		dispatchMap.mu.items[dk] = entries.UpToRaftLogPosition

It is possible to restructure this slightly so the mutex is released right after this update, and to remember which part of the if-condition was true so that the metrics can be updated after the mutex is released.
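
A sketch of what that restructuring could look like (fragment only; it reuses the identifiers from the quoted code, and the metric bookkeeping itself is elided):

// Remember the outcome while holding the lock...
var found, less bool
func() {
	dispatchMap.mu.Lock()
	defer dispatchMap.mu.Unlock()
	existing, ok := dispatchMap.mu.items[dk]
	found = ok
	less = ok && existing.Less(entries.UpToRaftLogPosition)
	if !found || less {
		dispatchMap.mu.items[dk] = entries.UpToRaftLogPosition
	}
}()
// ...and only then, with the mutex released, update the metrics based on
// which half of the condition held.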


pkg/kv/kvserver/kvflowcontrol/kvflowdispatch/kvflowdispatch.go line 206 at r1 (raw file):

func (d *Dispatch) getDispatchMap(nodeID roachpb.NodeID) *dispatches {
	dispatchMap, loaded := d.mu.outbox.LoadOrStore(nodeID, &dispatches{})

This unnecessarily allocates on the heap &dispatches{} for each get.

Also, I don't think this is correct: two goroutines can do this concurrently; one will store the value and get loaded=false, while the other gets loaded=true back before the items map has been initialized. That second goroutine can then race ahead, try to use the items map, and find it nil.
It is worth adding a test with multiple goroutines to show that it fails.

I think we need to document that d.mu.Lock is held for any additions to the sync.Map.
And then first do a d.mu.outbox.Load and if that fails, acquire the mutex, and do a LoadOrStore. This is also the pattern followed in kvflowcontroller.Controller.getBucket.

Can you also add a small benchmark for Dispatch so we can catch any regressions (like the memory allocation)?
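
A sketch of the suggested lookup pattern, assuming outbox behaves like the standard library's sync.Map (so type assertions are shown; the actual field may be a typed wrapper). newDispatches is a hypothetical constructor that returns a value whose items map is already allocated:

func (d *Dispatch) getDispatchMap(nodeID roachpb.NodeID) *dispatches {
	// Fast path: no lock and no allocation when the per-node map already
	// exists, which is the common case.
	if v, ok := d.mu.outbox.Load(nodeID); ok {
		return v.(*dispatches)
	}
	// Slow path: d.mu is held for any addition of a key to the sync.Map,
	// mirroring kvflowcontroller.Controller.getBucket.
	d.mu.Lock()
	defer d.mu.Unlock()
	v, _ := d.mu.outbox.LoadOrStore(nodeID, newDispatches())
	return v.(*dispatches)
}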

aadityasondhi force-pushed the 20230918.flow-conctrol-dispatch-perf branch 2 times, most recently from a6831eb to 18abab2 on October 3, 2023 18:53
In `kv0/enc=false/nodes=3/cpu=96`, we noticed mutex contention around
the `outbox` map. This patch tries to alleviate that by moving the mutex
down into each individual dispatch map (sharding by NodeID).

Informs: cockroachdb#104154.

Release note: None
aadityasondhi force-pushed the 20230918.flow-conctrol-dispatch-perf branch from 77a17ec to a30f9eb on October 3, 2023 20:54
Collaborator Author

@aadityasondhi aadityasondhi left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)


pkg/kv/kvserver/kvflowcontrol/kvflowdispatch/kvflowdispatch.go line 126 at r1 (raw file):

Previously, sumeerbhola wrote…

It is possible to restructure this slightly so the mutex is released right after this update, and to remember which part of the if-condition was true so that the metrics can be updated after the mutex is released.

Done.


pkg/kv/kvserver/kvflowcontrol/kvflowdispatch/kvflowdispatch.go line 206 at r1 (raw file):

Previously, sumeerbhola wrote…

This unnecessarily allocates on the heap &dispatches{} for each get.

Also, I don't think this is correct: two goroutines can do this concurrently; one will store the value and get loaded=false, while the other gets loaded=true back before the items map has been initialized. That second goroutine can then race ahead, try to use the items map, and find it nil.
It is worth adding a test with multiple goroutines to show that it fails.

I think we need to document that d.mu.Lock is held for any additions to the sync.Map.
And then first do a d.mu.outbox.Load and if that fails, acquire the mutex, and do a LoadOrStore. This is also the pattern followed in kvflowcontroller.Controller.getBucket.

Can you also add a small benchmark for Dispatch so we can catch any regressions (like the memory allocation)?

Thanks for catching the race. I was able to verify that it is indeed a race. It is hard to add a test for it without injecting test-only hooks into non-test code (I tried stressing it with 10k runs and still couldn't reproduce the failure), but I did fix the race condition nonetheless.

Added benchmark in a separate commit.
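
For illustration, the kind of in-package stress test one might try for the original race (hypothetical: it assumes a zero-value Dispatch is usable on this path, relies on the unexported getDispatchMap, and may well need the race detector and many runs to trip, which matches the difficulty described above):

func TestGetDispatchMapConcurrent(t *testing.T) {
	d := &Dispatch{}
	var wg sync.WaitGroup
	for g := 0; g < 16; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// With the racy LoadOrStore(nodeID, &dispatches{}) version, a
			// goroutine could observe the stored value before its items
			// map was initialized.
			dm := d.getDispatchMap(roachpb.NodeID(1))
			dm.mu.Lock()
			defer dm.mu.Unlock()
			if dm.mu.items == nil {
				t.Error("items map not initialized")
			}
		}()
	}
	wg.Wait()
}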

Collaborator Author

@aadityasondhi aadityasondhi left a comment


Results of the benchmark (./dev bench pkg/kv/kvserver/kvflowcontrol/kvflowdispatch -f BenchmarkDispatch --bench-mem -v):

Before:

==================== Test output for //pkg/kv/kvserver/kvflowcontrol/kvflowdispatch:kvflowdispatch_test:
goos: darwin
goarch: arm64
BenchmarkDispatch
BenchmarkDispatch-10    3224570       372.3 ns/op     432 B/op       5 allocs/op
PASS
external/bazel_tools/tools/test/test-setup.sh: line 352: 32705 Killed: 9               sleep 10
================================================================================

After:

==================== Test output for //pkg/kv/kvserver/kvflowcontrol/kvflowdispatch:kvflowdispatch_test:
goos: darwin
goarch: arm64
BenchmarkDispatch
BenchmarkDispatch-10    4400008       273.0 ns/op       48 B/op       1 allocs/op
PASS
================================================================================

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)

Collaborator

@sumeerbhola sumeerbhola left a comment


Reviewed 1 of 2 files at r2, 1 of 1 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aadityasondhi)


pkg/kv/kvserver/kvflowcontrol/kvflowdispatch/kvflowdispatch.go line 36 at r3 (raw file):

type Dispatch struct {
	mu struct {
		syncutil.Mutex

Add a comment
// Mutex is held for all additions of keys to the map.


pkg/kv/kvserver/kvflowcontrol/kvflowdispatch/kvflowdispatch.go line 110 at r3 (raw file):

	dispatchMap := d.getDispatchMap(nodeID)
	var existing kvflowcontrolpb.RaftLogPosition
	var found bool

can you rename these:

// existingLess is initialized iff existingFound is true.
var existingFound, existingLess bool

and move existing into the func().

Then in the func:

existing, existingFound = dispatchMap.mu.items[dk]
if existingFound {
   existingLess = existing.Less(entries.UpToRaftLogPosition)
}

And after the func:

if existingFound {
  if existingLess {
    ...
  } else {
    ...
  }
} else {
  ...
}

pkg/kv/kvserver/kvflowcontrol/kvflowdispatch/kvflowdispatch.go line 140 at r3 (raw file):

			// We're dropping a dispatch given we already have a pending one with a
			// higher log position. Increment the coalesced metric.
			d.metrics.CoalescedDispatches[wc].Inc(1)

Actually, just get rid of existingLess. And merge the two comments.
// We are either:
// - replacing ...
// - dropping a dispatch ...
// In both cases, increment the coalesced metric.
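
Putting the two suggestions together, the post-unlock bookkeeping reduces to something like (sketch; the non-coalesced branch's metric updates are elided):

if existingFound {
	// We are either:
	// - replacing a pending dispatch with a higher log position, or
	// - dropping this dispatch because a higher position is already pending.
	// In both cases, increment the coalesced metric.
	d.metrics.CoalescedDispatches[wc].Inc(1)
} else {
	// First dispatch for this key; remaining metric updates elided.
}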


pkg/kv/kvserver/kvflowcontrol/kvflowdispatch/kvflowdispatch.go line 219 at r3 (raw file):

		defer d.mu.Unlock()
		var loaded bool
		dispatchMap, loaded = d.mu.outbox.LoadOrStore(nodeID, &dispatches{})

I think this is not quite correct.
Once this LoadOrStore finishes, any concurrent call to d.mu.outbox.Load will return ok but the map has not been allocated yet.
We need to pass &dispatches{make(map[dispatchKey]kvflowcontrolpb.RaftLogPosition)} to the LoadOrStore call.
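A sketch of that fix, continuing the quoted fragment with its same typing: the value handed to LoadOrStore is fully initialized before it can become visible to other goroutines.

// Allocate the inner map up front so a concurrent Load can never
// observe a *dispatches with a nil items map.
dm := &dispatches{}
dm.mu.items = make(map[dispatchKey]kvflowcontrolpb.RaftLogPosition)
dispatchMap, loaded = d.mu.outbox.LoadOrStore(nodeID, dm)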


pkg/kv/kvserver/kvflowcontrol/kvflowdispatch/kvflowdispatch_test.go line 209 at r2 (raw file):

				UpToRaftLogPosition: kvflowcontrolpb.RaftLogPosition{
					Term:  1,
					Index: 10,

Can you make this

Index: i*5 + j

That way the Index will advance and Dispatch should be doing some work in extending what needs to be dispatched.
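
In the test's nested loop that would look like the following (i and j being the loop variables at that spot; a conversion may be needed depending on their types):

UpToRaftLogPosition: kvflowcontrolpb.RaftLogPosition{
	Term:  1,
	Index: uint64(i*5 + j), // advances every iteration instead of staying at 10
},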
