kv: Don't automatically rerun replicate_queue #85219
Conversation
Force-pushed from 40731c7 to c19954a
Force-pushed from c19954a to e783fc4
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli)
pkg/kv/kvserver/replicate_queue.go
line 550 at r1 (raw file):
Previously, kvoli (Austen) wrote…
I think this should go within the if statement below.
Done
Force-pushed from 0215bd5 to 6d38e46
This changes the replicate_queue so that it does not automatically run the next action; instead, it queues the range to run again at the appropriate priority. Doing this created 2 unit test failures, which I dealt with in different ways, and I wanted to see if anyone has concerns about this approach.
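For illustration, here is a minimal sketch of that requeue-at-priority idea; the `queue`, `action`, and `priorityFor` names are hypothetical stand-ins, not the actual replicate queue code:

```go
// Sketch only: after processing one change, the range is re-added to
// the queue at the priority of its next action instead of having that
// action run immediately.
package sketch

type action int

const (
	noop action = iota
	addVoter
	removeDeadVoter
)

// priorityFor maps an action to a queue priority; higher runs sooner.
func priorityFor(a action) float64 {
	switch a {
	case removeDeadVoter:
		return 1000
	case addVoter:
		return 500
	default:
		return 0
	}
}

type item struct {
	rangeID  int
	priority float64
}

// queue stands in for the real priority heap.
type queue struct{ items []item }

func (q *queue) add(rangeID int, prio float64) {
	q.items = append(q.items, item{rangeID, prio})
}

// processOne handles a single change for a range. Rather than looping
// and immediately running the range's next action, it asks what is
// left and requeues the range at that action's priority, so
// higher-priority work on other ranges is not starved.
func (q *queue) processOne(rangeID int, nextAction func() action) {
	if a := nextAction(); a != noop {
		q.add(rangeID, priorityFor(a))
	}
}
```

The follow-up action then competes with every other queued range at its own priority instead of jumping the line.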
Reviewed 2 of 2 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andrewbaptist and @kvoli)
-- commits
line 6 at r2:
nit: done to a range, rather than replica.
-- commits
line 7 at r2:
nit: replic -> replica.
pkg/kv/kvserver/replicate_queue.go
line 925 at r2 (raw file):
// We require the lease in order to process replicas, so
// repl.store.StoreID() corresponds to the lease-holder's store ID.
_, err := rq.shedLease(
If we did shed the lease here, then there shouldn't be a requeue for this replica because it won't have a lease. Can we set `!lhBeingRemoved=true` if `rq.shedLease` returns true? The replica in the queue would try to acquire the lease and (likely) fail the raft leader check when it's processed again, so we can avoid the added processing.
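A hedged sketch of that suggestion, with `shedLease` and `lhBeingRemoved` as stand-ins for the identifiers above rather than the real signatures:

```go
// Sketch only: when processing sheds the lease, report the leaseholder
// as being removed so the replica is not requeued on this store.
package sketch

func processShedLease(shedLease func() (transferred bool, err error)) (lhBeingRemoved bool, err error) {
	transferred, err := shedLease()
	if err != nil {
		return false, err
	}
	if transferred {
		// The lease now lives on another store; a requeue here would
		// only fail the lease/raft leadership checks on the next pass,
		// so skip it by reporting the leaseholder as being removed.
		lhBeingRemoved = true
	}
	return lhBeingRemoved, nil
}
```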
pkg/kv/kvserver/replica_learner_test.go
line 751 at r2 (raw file):
tc := testcluster.StartTestCluster(t, 3, base.TestClusterArgs{
	ServerArgs:      base.TestServerArgs{Knobs: knobs},
	ReplicationMode: base.ReplicationManual,
Since it's in replication manual mode, the requeues in the process loop are being ignored, so it won't be testing your patch completely here.
cockroach/pkg/kv/kvserver/queue.go
Lines 639 to 641 in 9b5a2b7
bq.mu.Lock()
stopped := bq.mu.stopped || bq.mu.disabled
bq.mu.Unlock()
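The quoted check is why those requeues go nowhere under ReplicationManual. As a rough illustration (illustrative field and method names, not the real queue.go API), a stopped or disabled base queue simply drops whatever is added to it:

```go
// Sketch only: a disabled base queue drops incoming work, so the
// post-process requeue becomes a no-op in a manual-replication test.
package sketch

import "sync"

type baseQueue struct {
	mu struct {
		sync.Mutex
		stopped  bool
		disabled bool
	}
	pending []int // range IDs; stand-in for the real priority heap
}

// maybeAdd refuses the range if the queue is stopped or disabled.
func (bq *baseQueue) maybeAdd(rangeID int) bool {
	bq.mu.Lock()
	stopped := bq.mu.stopped || bq.mu.disabled
	bq.mu.Unlock()
	if stopped {
		return false
	}
	bq.pending = append(bq.pending, rangeID)
	return true
}
```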
pkg/kv/kvserver/replica_learner_test.go
line 773 at r2 (raw file):
}
for _, message := range expectedMessages {
	trace, processErr, err := store.Enqueue(
See below on manually requeuing.
pkg/kv/kvserver/replica_learner_test.go
line 797 at r2 (raw file):
allocatorimpl.AllocatorFinalizeAtomicReplicationChange,
allocatorimpl.AllocatorAddVoter,
allocatorimpl.AllocatorFinalizeAtomicReplicationChange,
This is in the scope of the second stop, ltk.withStopAfterJointConfig; it will never finalize the atomic change itself until another replicate queue run does it. So there are two AllocatorFinalizeAtomicReplicationChange entries here.
pkg/kv/kvserver/replica_learner_test.go
line 801 at r2 (raw file):
}
for _, message := range expectedMessages {
	trace, processErr, err := store.Enqueue(
It would be good to test that the requeuing functionality does what is expected for this test, rather than manually requeuing.
You could try disabling the replica scanner:
Then, after ltk.withStopAfterLearnerAtomic, enable the replicate queue.
Then assert on the action in processOneChange:
cockroach/pkg/kv/kvserver/replicate_queue.go
Line 602 in 9b5a2b7
log.VEventf(ctx, 1, "next replica action: %s", action)
Likewise for the other stop.
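A hedged sketch of the assertion shape being suggested; the helper and the way the trace is collected are placeholders (not the real testcluster or tracing API), and it relies only on the "next replica action: %s" log line quoted above:

```go
// Sketch only: after re-enabling the replicate queue, check that the
// trace of the next processing pass names the action the requeue
// should have scheduled.
package sketch

import (
	"strings"
	"testing"
)

func assertNextReplicaAction(t *testing.T, recordedTrace, wantAction string) {
	t.Helper()
	if !strings.Contains(recordedTrace, "next replica action: "+wantAction) {
		t.Fatalf("expected action %q in trace:\n%s", wantAction, recordedTrace)
	}
}
```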
Force-pushed from c7c6bc5 to 8713e58
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kvoli)
pkg/kv/kvserver/replicate_queue.go
line 925 at r2 (raw file):
Previously, kvoli (Austen) wrote…
If we did shed the lease here, then there shouldn't be a requeue for this replica because it won't have a lease. Can we set `!lhBeingRemoved=true` if `rq.shedLease` returns true? The replica in the queue would try to acquire the lease and (likely) fail the raft leader check when it's processed again, so we can avoid the added processing.
OK, good call on this.
pkg/kv/kvserver/replica_learner_test.go
line 751 at r2 (raw file):
Previously, kvoli (Austen) wrote…
Since it's in replication manual mode, the requeues in the process loop are being ignored, so it won't be testing your patch completely here.
cockroach/pkg/kv/kvserver/queue.go
Lines 639 to 641 in 9b5a2b7
bq.mu.Lock()
stopped := bq.mu.stopped || bq.mu.disabled
bq.mu.Unlock()
OK, changed the test
pkg/kv/kvserver/replica_learner_test.go
line 773 at r2 (raw file):
Previously, kvoli (Austen) wrote…
See below on manually requeuing.
OK, done
pkg/kv/kvserver/replica_learner_test.go
line 797 at r2 (raw file):
Previously, kvoli (Austen) wrote…
This is in the scope of the second stop, ltk.withStopAfterJointConfig; it will never finalize the atomic change itself until another replicate queue run does it. So there are two AllocatorFinalizeAtomicReplicationChange entries here.
OK, makes sense
pkg/kv/kvserver/replica_learner_test.go
line 801 at r2 (raw file):
Previously, kvoli (Austen) wrote…
It would be good to test that the requeuing functionality does what is expected for this test, rather than manually requeuing.
You could try disabling the replica scanner:
Then, after ltk.withStopAfterLearnerAtomic, enable the replicate queue.
Then assert on the action in processOneChange:
cockroach/pkg/kv/kvserver/replicate_queue.go
Line 602 in 9b5a2b7
log.VEventf(ctx, 1, "next replica action: %s", action)
Likewise for the other stop.
Done, this idea worked well
Force-pushed from 13d1e6c to 6fe2a3c
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @andrewbaptist, and @kvoli)
pkg/kv/kvserver/allocator/allocatorimpl/allocator.go
line 628 at r3 (raw file):
}
return a.computeAction(ctx, conf, desc.Replicas().VoterDescriptors(),
Why did you drop the internal method computeAction here?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @kvoli)
pkg/kv/kvserver/allocator/allocatorimpl/allocator.go
line 628 at r3 (raw file):
Previously, kvoli (Austen) wrote…
Why did you drop the internal method computeAction here?
The method was only called in one place, by the method right above it, and the checks were artificially split between the first and the second. I probably should not have included this change in this PR (or should have made it a separate commit), as it is unrelated to some of the other changes. It just threw me off at first when I was looking at ComputeAction vs computeAction and trying to understand why we needed some checks in the first method and some in the second. When I realized there was no technical reason for the split, merging them seemed cleaner.
This change seems significant enough to warrant some small roachprod testing to see the effects "overall" and make sure everything works as expected.
You mentioned @AlexTalks has a good test. You could run the largest existing roachtest (compare to a control).
./build/builder.sh mkrelease
dev build roachtest
roachtest run decommissionBench/nodes=8/cpu=16/warehouses=3000
roachprod list
# find your cluster to get the $cluster_name - it will be your username-xxxx
roachprod adminui $cluster_name
# monitor the test.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @andrewbaptist, and @kvoli)
pkg/kv/kvserver/allocator/allocatorimpl/allocator.go
line 628 at r3 (raw file):
Previously, andrewbaptist (Andrew Baptist) wrote…
The method was only called in one place, by the method right above it, and the checks were artificially split between the first and the second. I probably should not have included this change in this PR (or should have made it a separate commit), as it is unrelated to some of the other changes. It just threw me off at first when I was looking at ComputeAction vs computeAction and trying to understand why we needed some checks in the first method and some in the second. When I realized there was no technical reason for the split, merging them seemed cleaner.
Given that this pattern is used for most of the allocator methods, where a mostly empty exported method calls an unexported method, it would be better to leave this out of this patch and reconsider all of them at once rather than ad hoc.
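For reference, the wrapper pattern being discussed looks roughly like this, sketched with simplified placeholder types rather than the allocator's real arguments:

```go
// Sketch only: a thin exported method unpacks the descriptor and
// delegates to an unexported method that holds the core logic.
package sketch

type conf struct{}
type replicaDescriptor struct{}
type rangeDescriptor struct{ voters, nonVoters []replicaDescriptor }
type allocatorAction int

type allocator struct{}

// ComputeAction is the exported entry point.
func (a *allocator) ComputeAction(c conf, desc *rangeDescriptor) (allocatorAction, float64) {
	return a.computeAction(c, desc.voters, desc.nonVoters)
}

// computeAction makes the actual decision.
func (a *allocator) computeAction(c conf, voters, nonVoters []replicaDescriptor) (allocatorAction, float64) {
	// ... decide the next allocator action and its priority ...
	return 0, 0
}
```

The question is whether to keep or collapse that split; the review settles on keeping it for now and revisiting all of the allocator methods together.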
Force-pushed from 6fe2a3c to 3833958
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @andrewbaptist, and @kvoli)
pkg/kv/kvserver/replicate_queue.go
line 925 at r2 (raw file):
Previously, andrewbaptist (Andrew Baptist) wrote…
OK, good call on this.
Discussed in person - want to take a look to see if this is the correct place to do this (or what else needs to/should change).
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks, @andrewbaptist, and @kvoli)
pkg/kv/kvserver/replicate_queue.go
line 532 at r4 (raw file):
// few seconds. By retrying immediately we will choose the "second best"
// node to send to.
// TODO: This is probably suboptimal behavior. In the case where there is
nit: TODO(andrewbaptist) (or, for that matter, put TODO(sarkesian) if needed). I think the linter would catch this in any case.
Force-pushed from 3833958 to 7b042bc
Tested with Alex; still not "ideal" behavior because stores finish at different times, but better. Alex is analyzing the results.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @AlexTalks and @kvoli)
pkg/kv/kvserver/replicate_queue.go
line 925 at r2 (raw file):
Previously, AlexTalks (Alex Sarkesian) wrote…
Discussed in person - want to take a look to see if this is the correct place to do this (or what else needs to/should change).
OK. The test TestLeasePreferencesDuringOutage fails without this change. Specifically, it expects a lease change to happen after addOrReplaceVoters is called. This test passed before because, after we re-ran the replicate queue, we would end up calling considerRebalance, which would then rebalance and, as a fortunate side effect, also move the lease. The test was a little broken in doing what it was trying to do, since if there hadn't been a rebalance, this wouldn't have worked. Now the test is a little more accurate. It might make sense long term to always call maybeTransferLeaseAway after any change is made, since normally it won't do anything if the lease preference is already correct, but that could be overkill.
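The "always check afterwards" idea would look roughly like this; the function names are placeholders, not the real replicate_queue API:

```go
// Sketch only: after any replication change, optionally check whether
// the lease still satisfies the configured preferences and transfer it
// if not. Normally this is a no-op.
package sketch

func processChangeAndMaybeTransfer(
	applyChange func() error,
	leaseViolatesPreferences func() bool,
	transferLease func() error,
) error {
	if err := applyChange(); err != nil {
		return err
	}
	if leaseViolatesPreferences() {
		return transferLease()
	}
	return nil
}
```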
pkg/kv/kvserver/replicate_queue.go
line 532 at r4 (raw file):
Previously, AlexTalks (Alex Sarkesian) wrote…
nit: TODO(andrewbaptist) (or, for that matter, put TODO(sarkesian) if needed). I think the linter would catch this in any case.
Done
pkg/kv/kvserver/allocator/allocatorimpl/allocator.go
line 628 at r3 (raw file):
Previously, kvoli (Austen) wrote…
Given that this pattern is used for most of the allocator methods, where a mostly empty exported method calls an unexported method, it would be better to leave this out of this patch and reconsider all of them at once rather than ad hoc.
Done, reverted in this PR
Force-pushed from 9c249ec to b1e2680
This naming was incorrect after #85219. The replicate queue will re-queue, rather than immediately re-process, a replica after a successful process. Epic: none Release note: None
Previously, `TestLeasePreferenceDuringOutage` would force replication queue processing of the test range, then assert that the range up-replicated and the lease transferred to a preferred locality. This test was skipped, and two of the assumptions it relied on to pass were no longer true. After cockroachdb#85219, the replicate queue no longer re-processes replicas. Instead, the queue requeues replicas after processing, at the appropriate priority. This broke the test due to the replicate queue being disabled, making the re-queue a no-op. After cockroachdb#94023, the replicate queue no longer looked for lease transfers after processing a replication action. Combined with cockroachdb#85219, the queue would now be guaranteed not to process both up-replication and lease transfers from a single enqueue. Update the test to not require a manual process, instead using a queue range filter, which allows tests that disable automatic replication to still process filtered ranges via the various replica queues. Also, ensure that the non-stopped stores are considered live targets after simulating an outage (bumping manual clocks, stopping servers), so that the expected up-replication, then lease transfer, can proceed. Fixes: cockroachdb#88769 Release note: None
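A rough sketch of the queue range filter idea mentioned in the commit message, with illustrative names rather than the real test knob API: a per-test predicate lets the queues keep processing chosen ranges even when automatic replication is otherwise disabled.

```go
// Sketch only: filtered ranges stay eligible for queue processing even
// when the queue is disabled for the test.
package sketch

type rangeFilter func(rangeID int) bool

type baseQueue struct {
	disabled bool
	filter   rangeFilter // nil means no filtering
}

// shouldQueue lets filtered ranges through even when the queue is
// otherwise disabled, so the replicate queue can still up-replicate
// and transfer the lease for the range under test.
func (bq *baseQueue) shouldQueue(rangeID int) bool {
	if bq.filter != nil && bq.filter(rangeID) {
		return true
	}
	return !bq.disabled
}
```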
108175: kvserver: unskip lease preferences during outage r=andrewbaptist a=kvoli

Previously, `TestLeasePreferenceDuringOutage` would force replication queue processing of the test range, then assert that the range up-replicated and the lease transferred to a preferred locality. This test was skipped, and two of the assumptions it relied on to pass were no longer true. After #85219, the replicate queue no longer re-processes replicas. Instead, the queue requeues replicas after processing, at the appropriate priority. This broke the test due to the replicate queue being disabled, making the re-queue a no-op. After #94023, the replicate queue no longer looked for lease transfers after processing a replication action. Combined with #85219, the queue would now be guaranteed not to process both up-replication and lease transfers from a single enqueue. Update the test to not require a manual process, instead using a queue range filter, which allows tests that disable automatic replication to still process filtered ranges via the various replica queues. Also, ensure that the non-stopped stores are considered live targets after simulating an outage (bumping manual clocks, stopping servers), so that the expected up-replication, then lease transfer, can proceed.

Fixes: #88769

Release note: None

109432: cluster-ui: handle partial response errors on the database details page r=THardy98 a=THardy98

Part of: #102386

**Demos** (Note: these demos show this same logic applied to both the databases and database table pages as well):
DB-Console
- https://www.loom.com/share/5108dd655ad342f28323e72eaf68219c
- https://www.loom.com/share/1973383dacd7494a84e10bf39e5b85a3

This change applies the same error handling ideas from #109245 to the database details page, enabling non-admin users to use the database details page and providing better transparency to data fetching issues. Errors encountered while fetching table details can be viewed via the tooltip provided by the `Caution` icon at the table's name. `unavailable` cells also provide a tooltip that displays the error impacting that exact cell.

Release note (ui change): Non-admin users are able to use the database details page.

110292: c2c: use separate spanConfigEventStreamSpec in the span config event stream r=stevendanna a=msbutler

Previously, the spanConfigEventStream used a streamPartitionSpec, which contained a bunch of fields unnecessary for span config streaming. This patch creates a new spanConfigEventStreamSpec which contains the fields only necessary for span config event streaming.

Informs #109059

Release note: None

110309: teamcity-trigger: ensure that `race` tag is only passed once r=healthy-pod a=healthy-pod

By running under `-race`, the go command defines the `race` build tag for us [1]. Previously, we defined it under `TAGS` to let the issue poster know that this is a failure under `race` and indicate that in the issue. At the time, defining the tag twice didn't cause issues, but after #109773, it led to build failures [2].

To reproduce locally:
```
bazel test -s --config=race pkg/util/ctxgroup:all --test_env=GOTRACEBACK=all --define gotags=bazel,gss,race
```

As a follow-up, we should find another way to let the issue poster know that a failure was running under `race`.

[1] https://go.dev/doc/articles/race_detector#Excluding_Tests
[2] #109994 (comment)

Epic: none

Release note: None

Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Thomas Hardy <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: healthy-pod <[email protected]>
Previously after the replicate queue completed processing a
change, it would immediately see if there were other changes
that could be done to a range. This bypassed the normal
priority order and meant that a replica might process a low-priority,
expensive job before a high-priority one. The intended
behavior is to check if there is additional work for the replica
and requeue at the correct priority.
Release note: None