kvserver: fix clearrange/* tests #104699
Merged
craig merged 1 commit into cockroachdb:master from irfansharif:230610.deflake-clearrange on Jun 10, 2023
Fixes cockroachdb#104696.
Fixes cockroachdb#104697.
Fixes cockroachdb#104698.
Part of cockroachdb#98703.

In 072c16d (added as part of cockroachdb#95637) we re-worked the locking structure around the RaftTransport's per-RPC class level send queues. When new send queues are instantiated or old ones deleted, we now also maintain the kvflowcontrol connection tracker, so such maintenance now needs to happen while holding a kvflowcontrol mutex. When rebasing cockroachdb#95637 onto master, we accidentally included earlier queue deletion code without holding the appropriate mutex. Queue deletions now happened twice which made it possible to hit a RaftTransport assertion about expecting the right send queue to already exist.

Specifically, the following sequence was possible:

- `(*RaftTransport).SendAsync` is invoked, observes no queue for `<nodeid,class>`, creates it, and tracks it in the queues map.
- It invokes an async worker W1 to process that send queue through `(*RaftTransport).startProcessNewQueue`. The async worker is responsible for clearing the tracked queue in the queues map once done.
- W1 expects to find the tracked queue in the queues map, finds it, proceeds.
- W1 is done processing. On its way out, W1 clears `<nodeid,class>` from the queues map the first time.
- `(*RaftTransport).SendAsync` is invoked by another goroutine, observes no queue for `<nodeid,class>`, creates it, and tracks it in the queues map.
- It invokes an async worker W2 to process that send queue through `(*RaftTransport).startProcessNewQueue`. The async worker is responsible for clearing the tracked queue in the queues map once done.
- W1 blindly clears the `<nodeid,class>` raft send queue the second time.
- W2 expects to find the queue in the queues map, but doesn't, and fatals.

Release note: None
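The sequence above can be replayed in a toy model. The sketch below is not the RaftTransport code: `transport`, `queueKey`, `getOrCreate`, `clear`, `tracked`, and `replay` are hypothetical names made up for illustration, and the steps run sequentially in one goroutine so the outcome is deterministic rather than dependent on scheduling. It only shows why a second, unconditional deletion by W1 can remove the queue that a later `SendAsync` created for W2.

```go
package main

import (
	"fmt"
	"sync"
)

// queueKey is a hypothetical stand-in for the <nodeid,class> pair.
type queueKey struct {
	nodeID int
	class  int
}

// transport is a toy model of a map of send queues guarded by a mutex.
// It is an illustration only, not the real RaftTransport.
type transport struct {
	mu     sync.Mutex
	queues map[queueKey]chan string
}

func newTransport() *transport {
	return &transport{queues: make(map[queueKey]chan string)}
}

// getOrCreate models the "observe no queue, create it, track it" step.
// It reports whether a new queue was created.
func (t *transport) getOrCreate(k queueKey) (chan string, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if q, ok := t.queues[k]; ok {
		return q, false
	}
	q := make(chan string, 1)
	t.queues[k] = q
	return q, true
}

// clear models a worker removing the tracked queue on its way out.
// It deletes whatever is stored under k, even a newer queue.
func (t *transport) clear(k queueKey) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.queues, k)
}

// tracked models the worker's check that its queue is in the map.
func (t *transport) tracked(k queueKey) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	_, ok := t.queues[k]
	return ok
}

// replay walks through the interleaving step by step. doubleClear toggles
// W1's second deletion. It returns whether W2 finds its queue in the map.
func replay(doubleClear bool) bool {
	t := newTransport()
	k := queueKey{nodeID: 1, class: 0}

	t.getOrCreate(k) // First send creates the queue that W1 will process.
	_ = t.tracked(k) // W1 finds its queue and proceeds.
	t.clear(k)       // W1 is done and clears the entry the first time.
	t.getOrCreate(k) // A second send sees no queue and creates one for W2.
	if doubleClear {
		t.clear(k) // W1 clears the entry a second time, removing W2's queue.
	}
	return t.tracked(k) // W2's check: is the queue still tracked?
}

func main() {
	fmt.Println("double clear, W2 finds its queue:", replay(true))
	fmt.Println("single clear, W2 finds its queue:", replay(false))
}
```

With the second deletion in place, W2's lookup fails, which corresponds to the assertion described above. With a single deletion per worker, W2 finds the queue it was started for.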
It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR? 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.
erikgrinaker approved these changes Jun 10, 2023
bors r+
Build succeeded:
irfansharif added a commit to irfansharif/cockroach that referenced this pull request Jun 12, 2023
Enable kvadmission.flow_control.enabled by default. We didn't observe noticeable performance regressions while it was disabled (single weekly run, three nightly runs). There was some minimal fallout that was since fixed (cockroachdb#104699). We expect performance regressions now that this commit enables it by default, and expect more fallout. We'll handle these as part of cockroachdb#104154. Release note: None
craig bot pushed a commit that referenced this pull request Jun 12, 2023
104741: kvflowcontrol: enable by default r=irfansharif a=irfansharif Enable kvadmission.flow_control.enabled by default. We didn't observe noticeable performance regressions while it was disabled (single weekly run, three nightly runs). There was some minimal fallout that was since fixed (#104699). We expect performance regressions now that this commit enables it by default, and expect more fallout. We'll handle these as part of #104154. Release note: None Co-authored-by: irfan sharif <[email protected]>