-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kv: benchmark overhead of replication ac v2 on simple workloads #128033
Labels
A-replication-admission-control-v2
Related to introduction of replication AC v2
branch-release-24.3
Used to mark GA and release blockers, technical advisories, and bugs for 24.3
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
Comments
kvoli
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
A-replication-admission-control-v2
Related to introduction of replication AC v2
labels
Jul 31, 2024
26 tasks
Extending the scope of this issue to include all of:
cc @sumeerbhola Note, not all machine types are necessary because that explodes the combinations, but best effort, perhaps at least 1-2 for each. |
sumeerbhola
added a commit
to sumeerbhola/cockroach
that referenced
this issue
Oct 7, 2024
Informs cockroachdb#128033 Epic: CRDB-37515 Release note: None
craig bot
pushed a commit
that referenced
this issue
Oct 8, 2024
132137: replica_rac2: prevent unnecessary heap allocation of admitted vector r=pav-kv a=sumeerbhola Informs #128033 Epic: CRDB-37515 Release note: None Co-authored-by: sumeerbhola <[email protected]>
sumeerbhola
added a commit
to sumeerbhola/cockroach
that referenced
this issue
Oct 9, 2024
- Scratch for []entryFCState for the new entries being appended. - Scratch for []tokenWaitingHandleInfo in WaitForEval. - Accumulate the (send or eval) tokens to deduct and then make one call to the shared tokenCounter. This avoids repeated calls to PhysicalTime() and repeated acquisitons of a possibly contended mutex. Informs cockroachdb#128033 Epic: CRDB-37515 Release note: None
sumeerbhola
added a commit
to sumeerbhola/cockroach
that referenced
this issue
Oct 9, 2024
…version We were exercising v1 codepaths that were not doing anything useful but consuming 1+% of CPU. Informs cockroachdb#128033 Epic: CRDB-37515 Release note: None
craig bot
pushed a commit
that referenced
this issue
Oct 10, 2024
132267: kvserver: disable v1 if the node starts with a recent enough cluster … r=kvoli a=sumeerbhola …version We were exercising v1 codepaths that were not doing anything useful but consuming 1+% of CPU. Informs #128033 Epic: CRDB-37515 Release note: None Co-authored-by: sumeerbhola <[email protected]>
sumeerbhola
added a commit
to sumeerbhola/cockroach
that referenced
this issue
Oct 10, 2024
- Scratch for []entryFCState for the new entries being appended. - Scratch for []tokenWaitingHandleInfo in WaitForEval. - Accumulate the (send or eval) tokens to deduct and then make one call to the shared tokenCounter. This avoids repeated calls to PhysicalTime() and repeated acquisitons of a possibly contended mutex. Informs cockroachdb#128033 Epic: CRDB-37515 Release note: None
sumeerbhola
added a commit
to sumeerbhola/cockroach
that referenced
this issue
Oct 10, 2024
This is a micro-optimization, plus avoids exposing Raft details that the three callers are not interested in. BasicProgress only contains {State, Match, Next} and there is no sorting of peer ids before the visitor is called. This method consumes 0.6% in kv0 running with RACv2, of which more than half is calls from RACv2 code. Informs cockroachdb#128033 Epic: CRDB-37515 Release note: None
craig bot
pushed a commit
that referenced
this issue
Oct 10, 2024
132129: roachtest: add slow disk perturbation test r=kvoli a=andrewbaptist This change adds a new set of perturbation tests perturbation/*/slowDisk which tests slow disks. We have see support cases where slow disks can cause cluster level availability outages. Epic: none Release note: None 132254: rac2: small allocation optimizations in rangeController r=kvoli a=sumeerbhola - Scratch for []entryFCState for the new entries being appended. - Scratch for []tokenWaitingHandleInfo in WaitForEval. - Accumulate the (send or eval) tokens to deduct and then make one call to the shared tokenCounter. This avoids repeated calls to PhysicalTime() and repeated acquisitons of a possibly contended mutex. Informs #128033 Epic: CRDB-37515 Release note: None 132313: raft,kvserver: add RawNode.WithBasicProgress and BasicProgress struct r=pav-kv,kvoli a=sumeerbhola This is a micro-optimization, plus avoids exposing Raft details that the three callers are not interested in. BasicProgress only contains {State, Match, Next} and there is no sorting of peer ids before the visitor is called. This method consumes 0.6% in kv0 running with RACv2, of which more than half is calls from RACv2 code. Informs #128033 Epic: CRDB-37515 Release note: None 132315: raft: don't panic in Inflights.Add r=kvoli a=pav-kv In lazy replication mode, the inflights struct should not enforce the in-flight limit. Instead, this policy is implemented at the higher level (RACv2). The tracker nevertheless correctly tracks all the in-flight state, so that it can correctly switch in/out of the lazy replication. Part of #128779 Co-authored-by: Andrew Baptist <[email protected]> Co-authored-by: sumeerbhola <[email protected]> Co-authored-by: Pavel Kalinnikov <[email protected]>
kvoli
added
GA-blocker
branch-release-24.3
Used to mark GA and release blockers, technical advisories, and bugs for 24.3
labels
Oct 14, 2024
craig bot
pushed a commit
that referenced
this issue
Oct 17, 2024
132140: replica_rac2: optimize logTracker.admitted r=sumeerbhola a=pav-kv Improve cache friendliness by checking the `dirty` and scheduling bits before resetting them. Cache the latest admitted vector to avoid computing it on every call. Instead, it is recomputed only if `dirty` bit is true. Informs #128033 Co-authored-by: Pavel Kalinnikov <[email protected]>
sumeerbhola
added a commit
to sumeerbhola/cockroach
that referenced
this issue
Nov 13, 2024
Fixed a bug that was using incorrect number of iterations. And made the benchmark more stable, and useful for smaller number of iterations, by using OverrideTokenDeduction to deduct 1MiB of tokens for tiny writes. Informs cockroachdb#128033 Epic: CRDB-37515 Release note: None
craig bot
pushed a commit
that referenced
this issue
Nov 14, 2024
135121: kvserver: fixes to BenchmarkFlowControlV2Basic r=kvoli a=sumeerbhola Fixed a bug that was using incorrect number of iterations. And made the benchmark more stable, and useful for smaller number of iterations, by using OverrideTokenDeduction to deduct 1MiB of tokens for tiny writes. Informs #128033 Epic: CRDB-37515 Release note: None Co-authored-by: sumeerbhola <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-replication-admission-control-v2
Related to introduction of replication AC v2
branch-release-24.3
Used to mark GA and release blockers, technical advisories, and bugs for 24.3
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-kv
KV Team
Once replication AC V2 is introduced, we should benchmark the overhead of running in V2 vs V1 for a simple, operation intensive workload like
kv95/50/0
.We want to avoid introducing any related performance regressions, which aren't explicitly considered a worthwhile tradeoff.
Jira issue: CRDB-40745
Epic CRDB-37515
The text was updated successfully, but these errors were encountered: