
Fixed Raft voter priority override with single replica topics #10800

Merged

Conversation


@mmaslankaprv mmaslankaprv commented May 16, 2023

The Redpanda Raft implementation exposes an API that allows overriding a voter's
priority. This is used by the drain manager when a node is in
maintenance mode. In the current implementation, when the only voter is in
maintenance mode the Raft group is unable to elect a leader because the
reported priority is too low (the priority override in maintenance is set to 0).

Fixed the Raft implementation to make sure it prioritizes
availability over the user's priority preference: if a node is the only
voter, the priority override is ignored.
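
Roughly, the idea of the fix can be sketched as follows (illustrative Python only; the `effective_priority` helper and its names are hypothetical, not Redpanda's actual C++ Raft code):

```python
def effective_priority(self_id, voters, base_priority, override=None):
    """Priority this node advertises when requesting votes.

    voters:   ids of the voting replicas in the group configuration
    override: priority set by the drain manager (0 in maintenance mode)
    """
    only_voter = len(voters) == 1 and voters[0] == self_id
    if override is not None and not only_voter:
        return override
    # Sole voter: honouring an override of 0 would leave the group without a
    # leader, so availability wins over the user priority preference.
    return base_priority


assert effective_priority(0, [0], 100, override=0) == 100       # single replica: override ignored
assert effective_priority(0, [0, 1, 2], 100, override=0) == 0   # other voters exist: override honoured
```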

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

Bug Fixes

  • Fixed the inability to elect a leader when the only voter is in maintenance mode

ztlpn previously approved these changes May 16, 2023
tests/rptest/tests/maintenance_test.py (outdated)

target = random.choice(self.redpanda.nodes)

self._enable_maintenance(target)
Contributor

I wonder what will happen to the maintenance status in this case? Presumably the operator will wait for the node to become fully drained before rebooting it, and this will fail because leaders of single-replica topics have nowhere to move.

Member Author

The drain manager doesn't care about leaders of single-replica partitions. It is the same as the case where maintenance mode is enabled and the node wasn't restarted: single-replica partition leaders stay in place.

@bharathv
Contributor

Patch looks fine, but I'm wondering how the only replica ended up leaderless in the first place; don't we have checks against it? The sequence of actions in drain_manager is:

  1. block_new_leadership -- override voter priority to 0
  2. transfer_leadership -- find another replica to be the leader.

(2) always returns an error, no?

@mmaslankaprv
Member Author

> Patch looks fine, but I'm wondering how the only replica ended up leaderless in the first place; don't we have checks against it? The sequence of actions in drain_manager is:
>
>   1. block_new_leadership -- override voter priority to 0
>   2. transfer_leadership -- find another replica to be the leader.
>
> (2) always returns an error, no?

Node restart is critical in this case
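
A rough sketch of the sequence described in this exchange (hypothetical names, not the real `drain_manager` API): leadership is blocked first, then a transfer is attempted, and a single-replica partition simply has no other replica to hand leadership to:

```python
class NoOtherReplicaError(Exception):
    pass


class PartitionStub:
    """Toy stand-in for a partition being drained (not the real API)."""

    def __init__(self, self_id, replicas):
        self.self_id = self_id
        self.replicas = replicas
        self.priority_override = None

    def block_new_leadership(self):
        # step 1: override the local voter priority to 0
        self.priority_override = 0

    def transfer_leadership(self):
        # step 2: find another replica to take over leadership
        others = [r for r in self.replicas if r != self.self_id]
        if not others:
            raise NoOtherReplicaError("no other replica to transfer leadership to")
        return others[0]


p = PartitionStub(self_id=0, replicas=[0])
p.block_new_leadership()
try:
    p.transfer_leadership()
except NoOtherReplicaError:
    # Single-replica partition: leadership stays put. Before this fix, a
    # subsequent node restart left the group leaderless, because the sole
    # voter kept reporting the overridden priority of 0.
    pass
```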

@mmaslankaprv mmaslankaprv force-pushed the fix-single-replica-maintence-mode branch from 0bb9778 to ce6d675 on May 22, 2023 17:22
bharathv previously approved these changes May 22, 2023
The Redpanda Raft implementation exposes an API that allows overriding a voter's
priority. This is used by the drain manager when a node is in
maintenance mode. In the current implementation, when the only voter is in
maintenance mode the Raft group is unable to elect a leader because the
reported priority is too low (the priority override in maintenance is set to 0).

Fixed the Raft implementation to make sure it prioritizes
availability over the user's priority preference: if a node is the only
voter, the priority override is ignored.

Fixes: redpanda-data/cloudv2#6174

Signed-off-by: Michal Maslanka <[email protected]>
Since the leader is now elected earlier, there is a race condition in
updating the health report when a single node starts. Made the timeout
longer to allow the `feature_manager` to retry activating the cluster
version.

Signed-off-by: Michal Maslanka <[email protected]>
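
The test-side change amounts to waiting longer; a hedged sketch using ducktape's `wait_until` (the helper and predicate below are illustrative, not the actual test code):

```python
from ducktape.utils.util import wait_until


def wait_for_cluster_version(get_active_version, expected, timeout_sec=60):
    # Poll until feature_manager has retried and activated the expected
    # cluster version; the longer timeout tolerates the race with the first
    # health report on single-node startup.
    wait_until(lambda: get_active_version() == expected,
               timeout_sec=timeout_sec,
               backoff_sec=1,
               err_msg=f"cluster version {expected} was not activated in time")
```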
@mmaslankaprv mmaslankaprv force-pushed the fix-single-replica-maintence-mode branch from 241108f to 5ce6ef8 on June 16, 2023 14:11
@mmaslankaprv
Member Author

CI failure: #11454

@mmaslankaprv mmaslankaprv merged commit 432cee6 into redpanda-data:dev Jun 20, 2023
@mmaslankaprv mmaslankaprv deleted the fix-single-replica-maintence-mode branch June 20, 2023 08:07
@vbotbuildovich
Collaborator

/backport v23.1.x

@vbotbuildovich
Collaborator

/backport v22.3.x

@vbotbuildovich
Collaborator

/backport v22.2.x

@vbotbuildovich
Collaborator

Failed to run cherry-pick command. I executed the commands below:

git checkout -b backport-pr-10800-v22.2.x-206 remotes/upstream/v22.2.x
git cherry-pick -x 1b21f8372646133d1e0bf1015a5fd5ec255e4abe 04191fb403b35121ded7ac3ecc75b10df96c1813 5ce6ef842fed8c207a901c9b76d21668f4b5d15b

Workflow run logs.

@vbotbuildovich
Collaborator

Failed to run cherry-pick command. I executed the commands below:

git checkout -b backport-pr-10800-v22.3.x-205 remotes/upstream/v22.3.x
git cherry-pick -x 1b21f8372646133d1e0bf1015a5fd5ec255e4abe 04191fb403b35121ded7ac3ecc75b10df96c1813 5ce6ef842fed8c207a901c9b76d21668f4b5d15b

Workflow run logs.
