
rpk: add unsafe partition recovery #14744

Closed
r-vasquez opened this issue Nov 3, 2023 · 0 comments · Fixed by #15300
Labels
area/rpk kind/enhance New feature or request

Comments

@r-vasquez
Contributor

Who is this for and what problem do they have today?

This command lets the user remove a set of permanently lost nodes from a cluster as cleanly as possible, so that the cluster can recover.

Relevant Core PR: #13943

Proposed Command

rpk cluster partitions unsafe-recover --from-nodes 1,2,3 [--no-confirm] [--dry] [--format]
  • --from-nodes is a required flag: the nodes from which the user is trying to unsafely recover partitions.
  • --no-confirm lets the user skip the plan confirmation and perform the unsafe recovery immediately.
  • --dry only prints the plan without applying it.
  • --format prints the output in JSON or YAML.
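As a minimal Go sketch of the kind of input validation the `--from-nodes` flag implies (`parseFromNodes` is a hypothetical helper for illustration, not rpk's actual code):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFromNodes is a hypothetical helper that turns the --from-nodes
// flag value (e.g. "1,2,3") into a slice of node IDs, rejecting
// malformed input before any Admin API call is made.
func parseFromNodes(v string) ([]int, error) {
	var ids []int
	for _, s := range strings.Split(v, ",") {
		id, err := strconv.Atoi(strings.TrimSpace(s))
		if err != nil {
			return nil, fmt.Errorf("invalid node ID %q: %v", s, err)
		}
		ids = append(ids, id)
	}
	return ids, nil
}

func main() {
	ids, err := parseFromNodes("1,2,3")
	if err != nil {
		panic(err)
	}
	fmt.Println(ids) // [1 2 3]
}
```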

The idea is to use the Admin API to perform the operation:

We can use GET /v1/debug/force_replicas_from_nodes to get the move plan and display it in our standard table format.

$ rpk cluster partitions unsafe-recover --from-nodes 1,2 

NAMESPACE  TOPIC  PARTITION  CURRENT-NODE:REPLICA  TARGET-NODE:REPLICAS  INFO
kafka      foo    231        [1:22, 2:11]          [3:1, 3:2]

And then prompt the user to confirm and apply the plan.

Additional notes

The GET /v1/debug/force_replicas_from_nodes and POST /v1/debug/force_replicas_from_nodes endpoints are being added in #13943.
