kvserver: add shed lease target to repl queue #94023
Merged
kvoli force-pushed the 221220.rq-lhtransfer branch from 6c76e28 to 15e82a3 on December 21, 2022 13:53
andrewbaptist approved these changes on Dec 30, 2022
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks)
TYFTR

bors r=andrewbaptist

Build failed (retrying...):

bors r-

Canceled.
kvoli force-pushed the 221220.rq-lhtransfer branch from 15e82a3 to 18d462b on January 3, 2023 17:42
Previously, a call to `shedLease` was made within the process loop and outside of planning, in the replicate queue. This patch moves the shed lease into consider rebalance, where it originally was, and converts it into a plannable action.

Release note: None
kvoli force-pushed the 221220.rq-lhtransfer branch from 18d462b to f2c3e12 on January 3, 2023 20:22
bors r=andrewbaptist

TYFTR

Build succeeded:
kvoli added a commit to kvoli/cockroach that referenced this pull request on Aug 11, 2023
Previously, `TestLeasePreferenceDuringOutage` would force replication queue processing of the test range, then assert that the range up-replicated and the lease transferred to a preferred locality.

This test was skipped, and two of the assumptions it relied on to pass were no longer true.

After cockroachdb#85219, the replicate queue no longer re-processes replicas. Instead, the queue requeues replicas after processing, at the appropriate priority. This broke the test because the replicate queue was disabled, making the re-queue a no-op.

After cockroachdb#94023, the replicate queue no longer looked for lease transfers after processing a replication action. Combined with cockroachdb#85219, the queue would now be guaranteed to not process both up-replication and lease transfers from a single enqueue.

Update the test to not require a manual process, instead using a queue range filter, which allows tests that disable automatic replication to still process filtered ranges via the various replica queues. Also, ensure that the non-stopped stores are considered live targets after simulating an outage (bumping manual clocks, stopping servers), so that the expected up-replication, then lease transfer, can proceed.

Fixes: cockroachdb#88769

Release note: None
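The interaction described in this commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: `toyQueue`, `maybeAdd`, `processNext`, `newQueue`, and `drain` are hypothetical names invented for illustration and are not the kvserver API. It shows why a single forced process on a disabled queue applies only the first action, while a range filter lets the requeue go through so both actions run.

```go
package main

import "fmt"

// Hypothetical names throughout; this is a toy model, not the kvserver API.

type rangeID int

// toyQueue applies one pending action per dequeue and requeues the range
// if more work remains.
type toyQueue struct {
	disabled bool
	// filter, when set, admits specific ranges even while the queue is disabled.
	filter  func(rangeID) bool
	pending map[rangeID][]string // remaining actions per range, in order
	queued  []rangeID
	log     []string
}

// maybeAdd enqueues a range unless the queue is disabled and the filter
// does not admit it.
func (q *toyQueue) maybeAdd(id rangeID) bool {
	if q.disabled && (q.filter == nil || !q.filter(id)) {
		return false // a requeue becomes a no-op here
	}
	q.queued = append(q.queued, id)
	return true
}

// processNext applies exactly one action, then requeues the range instead of
// re-processing it in place. It reports whether anything was dequeued.
func (q *toyQueue) processNext() bool {
	if len(q.queued) == 0 {
		return false
	}
	id := q.queued[0]
	q.queued = q.queued[1:]
	actions := q.pending[id]
	if len(actions) == 0 {
		return true
	}
	q.log = append(q.log, actions[0])
	q.pending[id] = actions[1:]
	if len(q.pending[id]) > 0 {
		q.maybeAdd(id)
	}
	return true
}

// drain processes until the queue is empty and returns the applied actions.
func drain(q *toyQueue) []string {
	for q.processNext() {
	}
	return q.log
}

// newQueue builds a disabled queue in which range 1 needs two actions.
func newQueue(filter func(rangeID) bool) *toyQueue {
	return &toyQueue{
		disabled: true,
		filter:   filter,
		pending:  map[rangeID][]string{1: {"up-replicate", "transfer-lease"}},
	}
}

func main() {
	// Forced manual processing on a disabled queue: the requeue is dropped.
	manual := newQueue(nil)
	manual.queued = append(manual.queued, 1) // bypasses maybeAdd
	fmt.Println(drain(manual))

	// With a range filter, the requeue is admitted and both actions run.
	filtered := newQueue(func(id rangeID) bool { return id == 1 })
	filtered.maybeAdd(1)
	fmt.Println(drain(filtered))
}
```

In this model the first case stops after `up-replicate`, which mirrors the broken assumption in the old test; the second case reaches `transfer-lease` because the filter admits the requeued range.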
kvoli added a commit to kvoli/cockroach that referenced this pull request on Aug 29, 2023
kvoli added a commit to kvoli/cockroach that referenced this pull request on Sep 7, 2023
kvoli added a commit to kvoli/cockroach that referenced this pull request on Sep 11, 2023
craig bot pushed a commit that referenced this pull request on Sep 11, 2023
108175: kvserver: unskip lease preferences during outage r=andrewbaptist a=kvoli

Previously, `TestLeasePreferenceDuringOutage` would force replication queue processing of the test range, then assert that the range up-replicated and the lease transferred to a preferred locality.

This test was skipped, and two of the assumptions it relied on to pass were no longer true.

After #85219, the replicate queue no longer re-processes replicas. Instead, the queue requeues replicas after processing, at the appropriate priority. This broke the test because the replicate queue was disabled, making the re-queue a no-op.

After #94023, the replicate queue no longer looked for lease transfers after processing a replication action. Combined with #85219, the queue would now be guaranteed to not process both up-replication and lease transfers from a single enqueue.

Update the test to not require a manual process, instead using a queue range filter, which allows tests that disable automatic replication to still process filtered ranges via the various replica queues. Also, ensure that the non-stopped stores are considered live targets after simulating an outage (bumping manual clocks, stopping servers), so that the expected up-replication, then lease transfer, can proceed.

Fixes: #88769

Release note: None

109432: cluster-ui: handle partial response errors on the database details page r=THardy98 a=THardy98

Part of: #102386

**Demos** (Note: these demos show this same logic applied to both the databases and database table pages as well):

DB-Console
- https://www.loom.com/share/5108dd655ad342f28323e72eaf68219c
- https://www.loom.com/share/1973383dacd7494a84e10bf39e5b85a3

This change applies the same error handling ideas from #109245 to the database details page, enabling non-admin users to use the database details page and providing better transparency to data fetching issues.

Errors encountered while fetching table details can be viewed via the tooltip provided by the `Caution` icon at the table's name. `unavailable` cells also provide a tooltip that displays the error impacting that exact cell.

Release note (ui change): Non-admin users are able to use the database details page.

110292: c2c: use seperate spanConfigEventStreamSpec in the span config event stream r=stevendanna a=msbutler

Previously, the spanConfigEventStream used a streamPartitionSpec, which contained a bunch of fields unnecessary for span config streaming. This patch creates a new spanConfigEventStreamSpec which contains only the fields necessary for span config event streaming.

Informs #109059

Release note: None

110309: teamcity-trigger: ensure that `race` tag is only passed once r=healthy-pod a=healthy-pod

By running under `-race`, the go command defines the `race` build tag for us [1].

Previously, we defined it under `TAGS` to let the issue poster know that this is a failure under `race` and indicate that in the issue.

At the time, defining the tag twice didn't cause issues but after #109773, it led to build failures [2].

To reproduce locally:
```
bazel test -s --config=race pkg/util/ctxgroup:all --test_env=GOTRACEBACK=all --define gotags=bazel,gss,race
```

As a follow-up, we should find another way to let the issue poster know that a failure was running under `race`.

[1] https://go.dev/doc/articles/race_detector#Excluding_Tests
[2] #109994 (comment)

Epic: none

Release note: None

Co-authored-by: Austen McClernon <[email protected]>
Co-authored-by: Thomas Hardy <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: healthy-pod <[email protected]>
Previously, a call to `shedLease` was made within the process loop and outside of planning, in the replicate queue. This patch moves the shed lease into consider rebalance, where it originally was, and converts it into a plannable action.

Part of #90141

Release note: None
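To make the idea of a "plannable action" concrete, here is a minimal sketch, not the actual patch. Every name in it (`Action`, `RangeState`, `Plan`, `planConsiderRebalance`, `apply`) is hypothetical, and the ordering of checks is an assumption made for illustration. The point it tries to show is that lease shedding is chosen during planning, as one of the possible outcomes, instead of being attempted as a side step inside the process loop.

```go
package main

import "fmt"

// All names below are hypothetical and only illustrate the plan/apply split.

// Action is something the planner may decide to do for a range.
type Action int

const (
	NoOp Action = iota
	RebalanceReplica
	ShedLease
)

func (a Action) String() string {
	switch a {
	case RebalanceReplica:
		return "rebalance-replica"
	case ShedLease:
		return "shed-lease"
	default:
		return "no-op"
	}
}

// RangeState is a toy stand-in for what a planner would inspect.
type RangeState struct {
	NeedsRebalance   bool
	BetterLeaseStore int // 0 means no better leaseholder is known
}

// Plan describes a single change; producing it has no side effects.
type Plan struct {
	Action Action
	Target int
}

// planConsiderRebalance returns at most one action per call. In this sketch
// (an assumption), lease shedding is planned only when no replica rebalance
// is needed, so it is never tacked on after another change was applied.
func planConsiderRebalance(s RangeState) Plan {
	if s.NeedsRebalance {
		return Plan{Action: RebalanceReplica}
	}
	if s.BetterLeaseStore != 0 {
		return Plan{Action: ShedLease, Target: s.BetterLeaseStore}
	}
	return Plan{Action: NoOp}
}

// apply is the only place where a plan would take effect; here it just
// renders a description of what would be done.
func apply(p Plan) string {
	if p.Action == ShedLease {
		return fmt.Sprintf("%s -> s%d", p.Action, p.Target)
	}
	return p.Action.String()
}

func main() {
	fmt.Println(apply(planConsiderRebalance(RangeState{BetterLeaseStore: 2})))
	fmt.Println(apply(planConsiderRebalance(RangeState{NeedsRebalance: true, BetterLeaseStore: 2})))
}
```

Keeping the decision in a side-effect-free planning function means a caller can inspect or test the chosen action without executing it, which is the property the description refers to when it calls the shed lease a plannable action.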