storage: Test framework for simulating allocator in different cluster configurations #19131
Labels
A-kv-distribution
Relating to rebalancing and leasing.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
E-intermediate
Intermediate complexity, needs a contributor with 3-6 months of past contribution experience.
We currently test the allocator's decisions in a few manually specified configurations in unit tests, and very few of these do anything interesting with large clusters or multi-locality clusters. More complicated testing has to be done manually with `allocsim`, which means it doesn't get done often enough or on diverse enough configurations, leading to the possibility of user bug reports like #19013.

We would likely benefit from a simulator, like the one we have for gossip, that ensures convergence, i.e. no thrashing. To be of the most benefit, the simulator could randomly generate a configuration of x nodes, spread throughout y localities, with randomly assigned numbers (and sizes) of replicas to start out. The cluster would then be expected to converge and stop rebalancing after some time.
This is not required for #17979, but it would help validate it.