storage: replace the rule solver #12165
7ca73d8 to 324f16f
Review status: 0 of 4 files reviewed at latest revision, 13 unresolved discussions, all commit checks successful.

pkg/storage/allocator.go, line 362 at r8 (raw file):
You probably have a reason in mind. Add it to the comment.

pkg/storage/allocator.go, line 392 at r8 (raw file):
This will crash if

pkg/storage/rule_solver.go, line 33 at r8 (raw file):

pkg/storage/rule_solver.go, line 37 at r8 (raw file):

pkg/storage/rule_solver.go, line 96 at r8 (raw file):
You should add a comment to all of these

pkg/storage/rule_solver.go, line 115 at r8 (raw file):

pkg/storage/rule_solver.go, line 127 at r8 (raw file):

pkg/storage/rule_solver.go, line 156 at r8 (raw file):
Return a

pkg/storage/rule_solver.go, line 164 at r8 (raw file):
This will crash if

pkg/storage/rule_solver.go, line 183 at r8 (raw file):
Ditto crash mentioned above.

pkg/storage/rule_solver.go, line 234 at r8 (raw file):

pkg/storage/rule_solver.go, line 254 at r8 (raw file):
What does it mean for

pkg/storage/rule_solver.go, line 439 at r8 (raw file):
Does this need to return a value between 0 and 1 now? You're randomly selecting from the candidates with the same constraint score and then only comparing this value against itself.

Comments from Reviewable
324f16f to a0c773d
Review status: 0 of 4 files reviewed at latest revision, 13 unresolved discussions, some commit checks failed.

pkg/storage/allocator.go, line 362 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
Done.

pkg/storage/allocator.go, line 392 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
Fixed.

pkg/storage/rule_solver.go, line 33 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
Done.

pkg/storage/rule_solver.go, line 37 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
Done.

pkg/storage/rule_solver.go, line 96 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
They already said "sorted", but I've added a "(by score reversed)" to them.

pkg/storage/rule_solver.go, line 115 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
you meant

pkg/storage/rule_solver.go, line 127 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
Done.

pkg/storage/rule_solver.go, line 156 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
Done. Actually, changed this and selectBad to return a pointer to a storeDescriptor.

pkg/storage/rule_solver.go, line 164 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
Fixed.

pkg/storage/rule_solver.go, line 183 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
Fixed.

pkg/storage/rule_solver.go, line 234 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
Done.

pkg/storage/rule_solver.go, line 254 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
If we consider that we want to converge toward the mean when making allocation decisions, then it is a constraint (just not a formal constraint from our zone configs). This allows us to use the criteria from shouldRebalance as part of ordering the candidates, so the decision is made in one place with full knowledge.

pkg/storage/rule_solver.go, line 439 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
It doesn't have to be between 0 and 1, but there's no reason not to. We want it to be higher for more empty replicas. It would be strange to order by constraint score descending and capacity score ascending.
Review status: 0 of 4 files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.

pkg/storage/allocator.go, line 392 at r8 (raw file): Previously, BramGruneir (Bram Gruneir) wrote…
I might be missing the fix, but it still looks like this will crash if

pkg/storage/rule_solver.go, line 254 at r8 (raw file): Previously, BramGruneir (Bram Gruneir) wrote…
Should this be

pkg/storage/rule_solver.go, line 439 at r8 (raw file): Previously, BramGruneir (Bram Gruneir) wrote…
It's also a little strange to be forcing the score into the range 0 to 1. And in order to do that you need to use

pkg/storage/rule_solver.go, line 169 at r11 (raw file):

pkg/storage/rule_solver.go, line 191 at r11 (raw file):
See above comment about using a pointer for
a0c773d to b51aaef
Review status: 0 of 4 files reviewed at latest revision, 5 unresolved discussions, some commit checks pending.

pkg/storage/allocator.go, line 392 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
Ah, I fixed a different one.

pkg/storage/rule_solver.go, line 254 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
We actually want the opposite. SelectBad in the balancer calls that and uses it to identify preferred candidates for removal. In this case, we want the worse candidates to have a lower score. I considered removing 1 from the constraint score, but I was worried about the interaction it might have with positive zone config constraints or diversity scores, both of which are always positive.

pkg/storage/rule_solver.go, line 439 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
I'd be down for changing this up (and the already strange sort reverseness), but in a follow-up PR. If we do come up with a collection of metrics to determine sizes, I'd rather make them range from -1 to 1, which more closely matches ML-style scores.

pkg/storage/rule_solver.go, line 169 at r11 (raw file): Previously, petermattis (Peter Mattis) wrote…
Done.

pkg/storage/rule_solver.go, line 191 at r11 (raw file): Previously, petermattis (Peter Mattis) wrote…
Done.
Review status: 0 of 4 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.

pkg/storage/rule_solver.go, line 439 at r8 (raw file):
Why is that important?

pkg/storage/rule_solver.go, line 260 at r14 (raw file):
Ah, so should this be
- Splits the scores returned by the rule solver into constraint and balance scores.
- Adds a valid field to constraints and adds it to all rules.
- Solve now returns all candidates instead of just the valid ones. To get only the valid candidates, the new function onlyValid and the new type candidateList have also been added.
- This allows us to use solve for removeTarget. It also cleans up the logic in removeTarget to more closely match the non-rule solver version.
- Splits the capacity rules into two rules. They were performing two different operations and didn't make sense being combined. This will also ease the change of converting the rules to basic functions.

Part of cockroachdb#10275
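The shape described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `candidate` struct, its field names, and `sortBestFirst` are assumptions made for the example; only the names `candidateList` and `onlyValid` come from the commit message. The ordering (constraint score descending, then balance score descending) follows the review discussion.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical, simplified stand-in for a scored store.
// The real type in the PR may carry different fields.
type candidate struct {
	storeID    int
	valid      bool    // false if the store violates a required constraint
	constraint float64 // score from constraint-style rules
	balance    float64 // score from balance/capacity-style rules
}

// candidateList mirrors the type name mentioned in the commit message.
type candidateList []candidate

// onlyValid returns a new list holding only the valid candidates,
// leaving the receiver untouched.
func (cl candidateList) onlyValid() candidateList {
	var out candidateList
	for _, c := range cl {
		if c.valid {
			out = append(out, c)
		}
	}
	return out
}

// sortBestFirst (hypothetical helper) orders candidates by constraint
// score descending, breaking ties by balance score descending.
func (cl candidateList) sortBestFirst() {
	sort.SliceStable(cl, func(i, j int) bool {
		if cl[i].constraint != cl[j].constraint {
			return cl[i].constraint > cl[j].constraint
		}
		return cl[i].balance > cl[j].balance
	})
}

func main() {
	// Pretend this is what a solve call returned: all candidates, valid or not.
	all := candidateList{
		{storeID: 1, valid: true, constraint: 1, balance: 0.2},
		{storeID: 2, valid: false, constraint: 2, balance: 0.9},
		{storeID: 3, valid: true, constraint: 1, balance: 0.7},
	}
	valid := all.onlyValid()
	valid.sortBestFirst()
	for _, c := range valid {
		fmt.Printf("store %d: constraint=%.1f balance=%.1f\n", c.storeID, c.constraint, c.balance)
	}
}
```

Keeping invalid candidates in the full list is what would let a removal path reuse the same scoring: it can look at everything, while allocation filters through `onlyValid` first.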
This commit adds the equivalent of the current selectGood and selectBad so that the rule solver will also use the "power of two random choices" method that is currently used by the balancer.
b51aaef to b19ec35
Review status: 0 of 4 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.

pkg/storage/rule_solver.go, line 439 at r8 (raw file): Previously, petermattis (Peter Mattis) wrote…
It's not, but since it's more or less a standard, we might be able to leverage some basic ML techniques if we have a number of criteria in need of weighting. I'll send a follow-up PR that cleans this up.

pkg/storage/rule_solver.go, line 260 at r14 (raw file): Previously, petermattis (Peter Mattis) wrote…
Ok, so I had to step away from this and re-examine exactly how this should work. I think I've got it. Please let me know if my logic is flawed here.
First of all,
With that in mind:
Removals: What we actually want is
Rebalance: For the existing candidates, we want
As you probably realize, this PR is a beast, which makes it very likely something has slipped through. I don't see a good way to mitigate that for this PR, but in the future smaller PRs are better: easier on the reviewer and better-reviewed code for the author.

Review status: 0 of 4 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.

pkg/storage/rule_solver.go, line 439 at r8 (raw file):
That's way too speculative to use as a basis for a change. When we do have a need to weight a number of criteria we should revisit, but not before.

pkg/storage/rule_solver.go, line 260 at r17 (raw file):
This is somewhat confusing and deserves a comment:

pkg/storage/rule_solver.go, line 312 at r17 (raw file):
Ditto my other comment. This one can be a reference to the comment in

pkg/storage/rule_solver.go, line 325 at r17 (raw file):
Adding a constant
Remove the concept of rules and replace it with clean direct functions.
b19ec35 to bbe17fa
Review status: 0 of 4 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.

pkg/storage/rule_solver.go, line 260 at r17 (raw file): Previously, petermattis (Peter Mattis) wrote…
Done.

pkg/storage/rule_solver.go, line 312 at r17 (raw file): Previously, petermattis (Peter Mattis) wrote…
Done.

pkg/storage/rule_solver.go, line 325 at r17 (raw file): Previously, petermattis (Peter Mattis) wrote…
Done.
The 1st commit is a collection of fixes and updates to the rule solver.
The 2nd commit adds back in the randomness that was missing from the rule solver.
The 3rd commit removes the rule solver altogether and replaces it with a collection of functions for the rules and 3 functions for allocation, removal and rebalancing.
There's more work to do here after merging, but this will bring us close enough to start testing rebalancing using this more expressive system.
With this PR, #11702 and #11721 can be closed.