admission: lock work queue before reading waiting length #131109
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
d5f5c7c to 68e39b4
Reviewed 1 of 1 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @kvoli and @pav-kv)
pkg/util/admission/work_queue.go
line 926 at r1 (raw file):
if log.V(1) { log.Infof(q.ambientCtx, "async-path: len(waiting-work)=%d dequeued t%d pri=%s r%s origin=n%s log-position=%s ingested=%t", tenant.waitingWorkHeap.Len(),
Can you fix this one too, since `q.mu.Unlock()` has already been called?
When replicated work is submitted for admission, it returns early and admission proceeds asynchronously to the caller. When `V(1)` is enabled, we also log the current queue length in this code path, which is protected by a mutex that wasn't acquired previously.

Acquire the queue mutex when `V(1)` is enabled to prevent a race.

Part of: cockroachdb#130187

Release note: None
68e39b4 to 2eca0bc
TYFTR!
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @pav-kv and @sumeerbhola)
pkg/util/admission/work_queue.go
line 926 at r1 (raw file):
Previously, sumeerbhola wrote…
can you fix this one too, since `q.mu.Unlock()` has already been called.
Done.
bors r=sumeerbhola
130728: kvserver: add rac2 v1 integration tests r=sumeerbhola a=kvoli

1st commit from #130619.
2nd-3rd commits from #131106.
4th-5th commits from #131107.
6th-7th commits from #131108.
8th commit from #131109.

---

Introduce several tests in `flow_control_integration_test.go`, mirroring the existing tests but applied to the replication flow control v2 machinery. The tests largely follow an identical pattern to the existing v1 tests, swapping in rac2 metrics and vtables.

The following tests are added:

```
TestFlowControlBasicV2
TestFlowControlRangeSplitMergeV2
TestFlowControlBlockedAdmissionV2
TestFlowControlAdmissionPostSplitMergeV2
TestFlowControlCrashedNodeV2
TestFlowControlRaftSnapshotV2
TestFlowControlRaftMembershipV2
TestFlowControlRaftMembershipRemoveSelfV2
TestFlowControlClassPrioritizationV2
TestFlowControlQuiescedRangeV2
TestFlowControlUnquiescedRangeV2
TestFlowControlTransferLeaseV2
TestFlowControlLeaderNotLeaseholderV2
TestFlowControlGranterAdmitOneByOneV2
```

These tests all have at least two variants:

```
V2EnabledWhenLeaderV1Encoding
V2EnabledWhenLeaderV2Encoding
```

When `V2EnabledWhenLeaderV1Encoding` is run, the tests use a different testdata file, which has a `_v1_encoding` suffix. A separate file is necessary because when the protocol enablement level is `V2EnabledWhenLeaderV1Encoding`, all entries which are subject to admission control are encoded as `raftpb.LowPri`, regardless of their original priority, as we don't want to pay the cost to deserialize the raft admission meta.

The v1 encoding variants retain the same comments as the v2 encoding, however any comments referring to regular tokens should be interpreted as elastic tokens instead, due to the above.

Two v1 tests are not ported over to v2:

```
TestFlowControlRaftTransportBreak
TestFlowControlRaftTransportCulled
```

These omitted tests behave identically to `TestFlowControlCrashedNodeV2` as rac2 is less tightly coupled to the raft transport, instead operating on replication states (e.g., `StateProbe`, `StateReplicate`).

---

Add `TestFlowControlV1ToV2Transition`, which ratchets up the enabled version of replication flow control v2:

```
v1 protocol with v1 encoding =>
v2 protocol with v1 encoding =>
v2 protocol with v2 encoding
```

The test is structured to issue writes and wait for returned tokens whenever the protocol transitions from v1 to v2, or a leader changes. More specifically, the test takes the following steps:

```
(1) Start n1, n2, n3 with v1 protocol and v1 encoding.
(2) Upgrade n1 to v2 protocol with v1 encoding.
(3) Transfer the range lease to n2.
(4) Upgrade n2 to v2 protocol with v1 encoding.
(5) Upgrade n3 to v2 protocol with v1 encoding.
(6) Upgrade n1 to v2 protocol with v2 encoding.
(7) Transfer the range lease to n1.
(8) Upgrade n2,n3 to v2 protocol with v2 encoding.
(9) Transfer the range lease to n3.
```

Between each step, we issue writes, (un)block admission and observe the flow control metrics and vtables.

Resolves: #130431
Resolves: #129276

Release note: None

131252: roachtest: port decommission/mixed-versions r=srosenberg,DarrylWong a=renatolabs

This commit ports the `decommission/mixed-versions` roachtest to use the `mixedversion` framework (instead of the old `newUpgradeTest` API). It also updates `acceptance/decommission-self` since both tests used shared functionality that needed to be updated. Prior to this commit, the acceptance test used the old upgrade test API even though it was not an upgrade test.

Fixes: #110531
Fixes: #110530

Release note: None

131364: upgrades: give test an additional core under remote exec r=rail a=rickystewart

This has been timing out.

Epic: none
Release note: None

Co-authored-by: Austen McClernon
Co-authored-by: Renato Costa
Co-authored-by: Ricky Stewart
When replicated work is submitted for admission, it returns early and admission proceeds asynchronously to the caller. When `V(1)` is enabled, we also log the current queue length in this code path, which is protected by a mutex that wasn't acquired previously.

Acquire the queue mutex when `V(1)` is enabled to prevent a race.

Part of: #130187

Release note: None
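
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of the kind of fix described above. It is not the code from `work_queue.go`: the `workQueue` type, the `intHeap` helper, the `verbose` field (standing in for `log.V(1)`), and the `dequeueAndLog` method are all hypothetical names chosen for illustration. The only idea taken from the PR is that a heap length guarded by a mutex should be read with that mutex held, even on a logging-only path that runs after the lock has been released.

```go
package main

import (
	"container/heap"
	"fmt"
	"sync"
)

// intHeap is a hypothetical stand-in for a per-tenant waiting-work heap.
type intHeap []int

func (h intHeap) Len() int           { return len(h) }
func (h intHeap) Less(i, j int) bool { return h[i] < h[j] }
func (h intHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }
func (h *intHeap) Push(x any)        { *h = append(*h, x.(int)) }
func (h *intHeap) Pop() any {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// workQueue is a simplified, hypothetical queue. Everything inside mu is
// guarded by the embedded mutex.
type workQueue struct {
	mu struct {
		sync.Mutex
		waiting intHeap
	}
	verbose bool // assumption: stands in for a log.V(1) check
}

// enqueue adds one item while holding the lock.
func (q *workQueue) enqueue(pri int) {
	q.mu.Lock()
	defer q.mu.Unlock()
	heap.Push(&q.mu.waiting, pri)
}

// dequeueAndLog pops the smallest item, releases the lock, and then builds a
// log message. The message includes the current heap length, so the lock is
// briefly re-acquired for that read, and only when verbose logging is on.
func (q *workQueue) dequeueAndLog() (item int, msg string, ok bool) {
	q.mu.Lock()
	if q.mu.waiting.Len() == 0 {
		q.mu.Unlock()
		return 0, "", false
	}
	item = heap.Pop(&q.mu.waiting).(int)
	q.mu.Unlock()

	if q.verbose {
		// Reading Len() here without the lock would race with enqueue.
		q.mu.Lock()
		n := q.mu.waiting.Len()
		q.mu.Unlock()
		msg = fmt.Sprintf("len(waiting-work)=%d dequeued pri=%d", n, item)
	}
	return item, msg, true
}

func main() {
	q := &workQueue{verbose: true}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(p int) {
			defer wg.Done()
			q.enqueue(p)
		}(i)
	}
	wg.Wait()
	_, msg, _ := q.dequeueAndLog()
	fmt.Println(msg)
}
```

Running a variant of this sketch with the lock around `Len()` removed, while other goroutines are still enqueuing, is the sort of access that `go run -race` is designed to flag. Taking the lock only inside the verbosity check keeps the non-verbose path free of the extra acquisition.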