
kv: ranges don't get upreplicated, despite other nodes being around #47620

Closed
irfansharif opened this issue Apr 17, 2020 · 2 comments
Labels
A-kv Anything in KV that doesn't belong in a more specific category. A-kv-distribution Relating to rebalancing and leasing. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@irfansharif
Contributor

Saw the following take place out in the wild.

[image]

The cluster had lost a node, and the operator had added a different one in its place. We had then re-replicated all ranges except for about four of them; three of those were still under-replicated when checking in about two weeks later. What I suspect is happening here is that we're not upreplicating quiesced ranges (the fourth range may have seen some activity in the interim, causing it to upreplicate).

Somewhat surprisingly, and perhaps orthogonal to this issue, running the under-replicated ranges through our enqueue-range queues didn't actually do anything. Are we simply ignoring quiesced ranges there? I don't think we should be.

We should sanity check what our behavior is here; we end up in a pretty fragile state running through the scenario above.

(The operator was running v19.2.3)

@irfansharif
Contributor Author

+cc #44206.

@irfansharif irfansharif added A-kv Anything in KV that doesn't belong in a more specific category. A-kv-distribution Relating to rebalancing and leasing. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. labels Apr 17, 2020
@irfansharif irfansharif self-assigned this Apr 17, 2020
@irfansharif
Contributor Author

irfansharif commented Apr 17, 2020

Never mind here, we're already doing the right thing. It was an operator error: they were able to specify zone configs that weren't solvable given their topology. It's still interesting how this played out from an observability standpoint, and that they weren't able to deduce what was happening. This is also what affected their manual enqueueing of ranges, because there weren't any allowable nodes to upreplicate to. Their current state of things still doesn't satisfy their zone configs; notes below.


This user had four nodes, but was "allowed" to set up replication constraints like the below (see the sketch after this list):

  • num_replicas = 3, constraints = '[+region=US]'
  • num_replicas = 3, constraints = '[+region=EUR]'
  • num_replicas = 3, constraints = '[+region=ASIA]'
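
For reference, constraints like these are set via CONFIGURE ZONE statements. A minimal sketch, assuming one zone config per database (the database names here are hypothetical; only the num_replicas and constraints values are taken from this issue):

  ALTER DATABASE us_data CONFIGURE ZONE USING num_replicas = 3, constraints = '[+region=US]';
  ALTER DATABASE eur_data CONFIGURE ZONE USING num_replicas = 3, constraints = '[+region=EUR]';
  ALTER DATABASE asia_data CONFIGURE ZONE USING num_replicas = 3, constraints = '[+region=ASIA]';

With only one US node and one ASIA node present, the first and third configs can't be fully satisfied: there aren't three distinct constraint-conforming nodes to place replicas on.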

They had two nodes assigned to EUR (n3, n4), one node assigned to US (n2), and one node assigned to ASIA (n1).
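
Each node's region attribute would have come from the --locality flag at node startup; a sketch, with the remaining flags elided since they're not known from this issue:

  cockroach start --locality=region=ASIA ...   # n1
  cockroach start --locality=region=US ...     # n2
  cockroach start --locality=region=EUR ...    # n3
  cockroach start --locality=region=EUR ...    # n4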

The reason they were observing only a few under-replicated ranges, as opposed to, well, all of them, is I think due to the fact that the constraints were added to a previously unconstrained cluster with existing user data. So when, for example, the user added the EUR constraint, two of a given range's replicas were allowed to move towards n3 and n4. But the "remaining" replica currently in violation of the zone constraints (and sitting on either n1 or n2) is effectively stuck there, with nowhere to go (there are no constraint-conforming nodes it can rebalance onto).

In this particular cluster, there were no ASIA ranges, AFAICT. The under-replicated ranges were however US ones, and similar to the above, one replica of a US-designated range would find itself on n2, but the remaining ones would effectively be "stuck" in perpetual violation of the zone configs set up. Given that the user in this issue previously lost a node, what I suspect happened is that the lost node had replicas for US ranges. Now, despite newer nodes being added to the system, we're unable to upreplicate because there's no valid site to replicate to. The user is now in a fragile state, and it's not clear we're surfacing why in the UI. As far as KV is concerned, I think we're "doing the right thing" in refusing to upreplicate. But as far as multi-region UX is concerned, this seems like a pretty sharp edge for us to expose and have users trip over. Closing this out, but sharing internally; perhaps follow-up issues can be made here.

+cc @piyush-singh.

@irfansharif irfansharif changed the title kv: quiesced ranges don't get upreplicated kv: ranges don't get upreplicated, despite other nodes being around Apr 17, 2020