storage: Lease transfer throttling can lead to bad leaseholder balance after restart #19355
Labels: C-performance (perf of queries or internals; solution not expected to change functional behavior), S-1-stability (severe stability issues that can be fixed by upgrading, but usually don't resolve by restarting)
Repro steps:
The reason all the leases get acquired by the node that the load is hitting is that we don't eagerly acquire leases for ranges; we only take them as required by incoming requests. Since all the incoming requests go to the same node, that node ends up taking all the leases.
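To illustrate that on-demand behavior, here's a minimal sketch (not CockroachDB's actual code; the types and method names are hypothetical) of a replica that only takes a lease when a request forces it to, which is why the node receiving the traffic accumulates the leases:

```go
package main

import "fmt"

// replica is a hypothetical stand-in for a range replica.
type replica struct {
	rangeID  int64
	hasLease bool
}

// requestLease stands in for the work that actually acquires a lease.
func (r *replica) requestLease() error {
	fmt.Printf("range %d: acquiring lease on demand\n", r.rangeID)
	return nil
}

// serve only acquires the lease when an incoming request needs it; there
// is no eager acquisition path, so lease placement follows client traffic.
func (r *replica) serve(req string) error {
	if !r.hasLease {
		if err := r.requestLease(); err != nil {
			return err
		}
		r.hasLease = true
	}
	fmt.Printf("range %d: serving %q\n", r.rangeID, req)
	return nil
}

func main() {
	r := &replica{rangeID: 42}
	_ = r.serve("scan /table/1")
}
```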
We throttle lease transfers to one per second, which means that transferring thousands of leases takes hours; transferring 100k leases would take more than a day. In practice, I've been seeing a steady stream of about 0.9 lease transfers per second on my cluster, with the overloaded node going from 16k leases down to 10.75k in about 95 minutes.
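As a rough sanity check on those numbers (assuming a flat throttle; the constants below are just the figures quoted above), the drain time scales linearly with the number of leases to move:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	const transfersPerSec = 0.9 // observed effective rate
	excess := []float64{
		16000 - 10750, // leases moved in the ~95 minute window above
		100000,        // the 100k-lease example
	}
	for _, n := range excess {
		d := time.Duration(n/transfersPerSec) * time.Second
		fmt.Printf("%6.0f leases at %.1f/s -> %s\n", n, transfersPerSec, d)
	}
	// Prints roughly 1h37m for the first case and ~30.9h for the second.
}
```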
Overloading all the leaseholder work onto this one node has real repercussions for the performance of the cluster -- my cluster has gained back a couple hundred qps (~5%) as the leases have spread out.
The right approach here may be to go back to one of @petermattis's recurring suggestions: change our rate limiting based on how far we are from the desired state. When one node has significantly more leases than another, don't throttle lease transfers nearly as much as when the nodes are relatively well balanced.
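A minimal sketch of what that adaptive throttle could look like (this is not an existing CockroachDB API; the names and the shape of the scaling function are assumptions for illustration only), where the allowed transfer rate grows with how far the node is above the mean lease count:

```go
package main

import (
	"fmt"
	"time"
)

// transferInterval returns how long to wait between lease transfers off a
// node, based on how far its lease count is above the cluster mean. The 1s
// baseline matches the current fixed throttle; the scaling factor is a guess.
func transferInterval(nodeLeases, meanLeases int) time.Duration {
	const baseline = time.Second
	if meanLeases == 0 || nodeLeases <= meanLeases {
		return baseline
	}
	// A node holding 4x the mean gets to transfer 4x as fast, capped so we
	// never go completely unthrottled.
	factor := float64(nodeLeases) / float64(meanLeases)
	if factor > 100 {
		factor = 100
	}
	return time.Duration(float64(baseline) / factor)
}

func main() {
	fmt.Println(transferInterval(16000, 4000)) // heavily imbalanced: 250ms
	fmt.Println(transferInterval(4200, 4000))  // nearly balanced: ~952ms
	fmt.Println(transferInterval(3900, 4000))  // below the mean: 1s
}
```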
We could also take a simpler approach and just not rate limit lease transfers for the first N seconds that a node is running. I think the adaptive approach above will be beneficial in other scenarios as well, though, such as when a single node in the cluster restarts and we'd like to get leases back onto it more quickly.
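And a sketch of the simpler alternative (again with hypothetical names): skip the throttle entirely for the first N seconds after the store starts, then fall back to the existing fixed rate:

```go
package main

import (
	"fmt"
	"time"

	"golang.org/x/time/rate"
)

// allowTransfer is a hypothetical gate in front of the lease-transfer
// throttle: for the first gracePeriod after the store starts, transfers go
// through unthrottled so a freshly restarted cluster can rebalance quickly;
// afterwards the normal fixed rate limit applies.
func allowTransfer(storeStart time.Time, limiter *rate.Limiter) bool {
	const gracePeriod = 5 * time.Minute // the "first N seconds" knob
	if time.Since(storeStart) < gracePeriod {
		return true
	}
	return limiter.Allow()
}

func main() {
	limiter := rate.NewLimiter(rate.Limit(1), 1) // the existing 1/sec throttle
	fmt.Println(allowTransfer(time.Now(), limiter))
}
```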