Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: Run one ScanInterleavedIntents per store concurrently #69607

Merged
merged 1 commit into from
Sep 2, 2021

Conversation

itsbilal
Copy link
Member

Currently, lock table migration related requests (Barrier,
ScanInterleavedIntents, PushTxn, ResolveIntent), are only run
with a concurrency of 4 regardless of the size of the cluster.

This change increases that to 4 * numNodes, and adds a new
mechanism to limit the ScanInterleavedIntents on the receiver
side to 1 per store. This throttle is applied before latches
are grabbed. Only ScanInterleavedIntents needs to be
rate-limited as it's by far the heaviest component in
the separated intents migrations.

Release justification: Category 2, low-risk update to new functionality
Release note: None.

@itsbilal itsbilal requested review from dt and sumeerbhola August 30, 2021 20:15
@itsbilal itsbilal self-assigned this Aug 30, 2021
@itsbilal itsbilal requested a review from a team as a code owner August 30, 2021 20:15
@itsbilal itsbilal requested a review from a team August 30, 2021 20:15
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@ajwerner ajwerner removed the request for review from a team August 30, 2021 20:35
pkg/kv/kvserver/replica_send.go Outdated Show resolved Hide resolved
@itsbilal itsbilal force-pushed the intent-migration-semaphore branch from 4a0035c to 574b610 Compare August 31, 2021 16:02
@dt
Copy link
Member

dt commented Aug 31, 2021

I think we might actually want to make this setting-tunable. Like, these are basically like ExportRequests, right? They both have to scan a large span and look for intents. You could potentially just copy/paste the ConcurrentExportRequests limiter, adding to the limiters field that is already an attribute of each store, to avoid needing to keep your own map, and so you'd also have a Settings.SV around to setup the on-change hook. I dunno.

@itsbilal itsbilal force-pushed the intent-migration-semaphore branch from 574b610 to 76d57c2 Compare August 31, 2021 17:47
@itsbilal
Copy link
Member Author

Thanks for the review! Yep, making it settings-tunable is probably a good idea. That way we can stuff the 4x node multiplier default into the cluster setting and have that drive that multiplier instead. I'll make that change shortly. Addressed the other points.

@itsbilal itsbilal force-pushed the intent-migration-semaphore branch from 76d57c2 to 94d7ddf Compare August 31, 2021 19:14
@itsbilal
Copy link
Member Author

Done. Slight update to last comment: I didn't put the 4x fan-out-time node multiplier factor in the setting, but rather the 1x store multiplier in it. That's less useful as there's no easy way to "slow it down" by reducing it any lower than 1 (0 would effectively stop rate limiting as the limiter will just return an error upon Begin); the operator could only speed it up from the default if they want to get the migration out of the way. Maybe that's fine, but something worth noting.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 8 files at r1, 4 of 4 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal)


pkg/kv/kvserver/replica_send.go, line 288 at r4 (raw file):

		reservation, err := r.store.limiters.ConcurrentScanInterleavedIntents.Begin(ctx)
		if err != nil {
			return nil

why are we ignoring the error?


pkg/kv/kvserver/replica_send.go, line 355 at r4 (raw file):

	// Handle any request-specific pre-latch throttles.
	if res := r.maybeThrottleRequestPreLatch(ctx, ba); res != nil {

Why do this here instead of earlier in Store.Send in maybeThrottleBatch -- it fits well with the other logic there?

Currently, lock table migration related requests (Barrier,
ScanInterleavedIntents, PushTxn, ResolveIntent), are only run
with a concurrency of 4 regardless of the size of the cluster.

This change increases that to 4 * numNodes, and adds a new
mechanism to limit the ScanInterleavedIntents on the receiver
side to 1 per store. This throttle is applied before latches
are grabbed. Only ScanInterleavedIntents needs to be
rate-limited as it's by far the heaviest component in
the separated intents migrations.

Release justification: Category 2, low-risk update to new functionality
Release note: None.
@itsbilal itsbilal force-pushed the intent-migration-semaphore branch from 94d7ddf to 9e720b6 Compare August 31, 2021 23:42
Copy link
Member Author

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dismissed @sumeerbhola from 2 discussions.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt, @itsbilal, and @sumeerbhola)


pkg/kv/kvserver/replica_send.go, line 292 at r1 (raw file):

Previously, dt (David Taylor) wrote…

2¢: I donno why I'm slightly uneasy about the panic, as well as the map lookup, plus being getOrCreate here (we would never want to create, then release). I might vote to have maybeThrottle just return a limiter.Reservation and if the caller gets a non-nil reservation, they have to defer releasing it.

Done.


pkg/kv/kvserver/replica_send.go, line 288 at r4 (raw file):

Previously, sumeerbhola wrote…

why are we ignoring the error?

Done.


pkg/kv/kvserver/replica_send.go, line 355 at r4 (raw file):

Previously, sumeerbhola wrote…

Why do this here instead of earlier in Store.Send in maybeThrottleBatch -- it fits well with the other logic there?

Done. I had missed that as I was looking at replica_send.go and not store_send.go.


pkg/kv/kvserver/batcheval/cmd_scan_interleaved_intents.go, line 25 at r1 (raw file):

Previously, dt (David Taylor) wrote…

I think we like this thing better (that sem isn't fair) https://github.com/cockroachdb/cockroach/blob/master/pkg/util/limit/limiter.go#L47

Done.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r5, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dt)

@itsbilal
Copy link
Member Author

itsbilal commented Sep 2, 2021

TFTRs!

bors r=dt

@craig
Copy link
Contributor

craig bot commented Sep 2, 2021

Build succeeded:

@craig craig bot merged commit 0dc7f4b into cockroachdb:master Sep 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants