*: remove the remaining uses of ReplicaDescriptors.Unwrap #39157
Conversation
First commit is #39151. No hurry on this review!
Reviewed 5 of 5 files at r1, 22 of 22 files at r2.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @danhhz)
pkg/storage/allocator_test.go, line 4774 at r2 (raw file):
}

func TestAllocatorRemoveLearner(t *testing.T) {
Where'd this go?
pkg/storage/replica_raft_quiesce.go, line 79 at r2 (raw file):
// NB: we know there's a non-nil RaftStatus because internalRaftGroup isn't nil.
r.mu.lastUpdateTimes.updateOnUnquiesce( // WIP this All is a guess
It's correct, when the group unquiesces we want to treat all replicas as active for the moment (since they may not have contacted us recently, and we don't want to assume they're all dead).
pkg/storage/replicate_queue.go, line 197 at r2 (raw file):
}

func (rq *replicateQueue) computeAction(
Why did you hoist the learner stuff out of the allocator? Not saying this is necessarily bad, but leave a comment somewhere.
pkg/storage/replicate_queue.go, line 275 at r2 (raw file):
if rq.canTransferLease() && rq.allocator.ShouldTransferLease(
	ctx, zone, rangeInfo.Replicas, lease.Replica.StoreID, desc.RangeID, repl.leaseholderStats) {
Is this right? Add a comment.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @danhhz and @tbg)
pkg/storage/replicate_queue.go, line 197 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Why did you hoist the learner stuff out of the allocator? Not saying this is necessarily bad, but leave a comment somewhere.
I was trying to make it obvious within the allocator code that it was only dealing with voters, without littering it with .Replicas().All() or .Replicas().Voters() calls, each accompanied by a comment. I don't super-love this refactor, so I'll poke at this again. It's tough because I think this code wants a bit larger of a refactor, but I also don't think it's the right time for me to do that right now.
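For readers following along, the `.Replicas().All()` / `.Replicas().Voters()` distinction being discussed is roughly this filtering pattern (a hypothetical sketch; the real `ReplicaDescriptors` type in CockroachDB differs):

```go
package main

import "fmt"

// Illustrative replica types; the real descriptors use protobuf enums.
type replicaType int

const (
	voterFull replicaType = iota
	learner
)

type replicaDescriptor struct {
	ID   int
	Type replicaType
}

type replicaDescriptors struct{ wrapped []replicaDescriptor }

// All returns every replica, learners included.
func (d replicaDescriptors) All() []replicaDescriptor { return d.wrapped }

// Voters returns only the voting replicas, filtering out learners.
func (d replicaDescriptors) Voters() []replicaDescriptor {
	var out []replicaDescriptor
	for _, r := range d.wrapped {
		if r.Type == voterFull {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	d := replicaDescriptors{wrapped: []replicaDescriptor{
		{ID: 1, Type: voterFull}, {ID: 2, Type: learner}, {ID: 3, Type: voterFull},
	}}
	fmt.Println(len(d.All()), len(d.Voters())) // learners counted only by All
}
```

The review's tension is that allocator code which accidentally uses `All()` where it means `Voters()` (or vice versa) fails silently, which is why each call site would otherwise need a comment.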
pkg/storage/replicate_queue.go, line 275 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Is this right? Add a comment.
This is pretty good evidence that I failed in my goals above. I was hoping to make rangeInfo.Replicas the go-to place for replicas in all allocator/replicateQ code, and to have it produced in one place that did the learners check. But you having to ask this question tips me off that it's not as obvious as I was hoping.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
pkg/storage/allocator_test.go, line 4774 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Where'd this go?
It's back! :-D
pkg/storage/replica_raft_quiesce.go, line 79 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
It's correct, when the group unquiesces we want to treat all replicas as active for the moment (since they may not have contacted us recently, and we don't want to assume they're all dead).
👍
pkg/storage/replicate_queue.go, line 197 at r2 (raw file):
Previously, danhhz (Daniel Harrison) wrote…
I was trying to make it obvious within the allocator code that it was only dealing with voters, without littering it with .Replicas().All() or .Replicas().Voters() calls, each accompanied by a comment. I don't super-love this refactor, so I'll poke at this again. It's tough because I think this code wants a bit larger of a refactor, but I also don't think it's the right time for me to do that right now.
Thanks for pushing back on this. I think this new version is better.
pkg/storage/replicate_queue.go, line 320 at r3 (raw file):
voterReplicas := desc.Replicas().Voters() // WIP there's all sorts of `return true, nil` below, which is what it was
It feels like some of these should be false, nil (Noop, RangeUnavailable) and some should get comments on why we're retrying ("range of replica %s was identified as having dead replicas, but no dead replicas were found").
Reviewed 11 of 11 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @danhhz)
pkg/storage/replicate_queue.go, line 320 at r3 (raw file):
Previously, danhhz (Daniel Harrison) wrote…
It feels like some of these should be false, nil (Noop, RangeUnavailable) and some should get comments on why we're retrying ("range of replica %s was identified as having dead replicas, but no dead replicas were found").
You're talking about Noop and RangeUnavailable below, right? Agreed, either in this PR or a separate one.
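The `false, nil` vs `true, nil` return convention under discussion, requeue only when a change was actually made, might look like this in sketch form (all names here are illustrative, not the actual replicateQueue code):

```go
package main

import "fmt"

// Illustrative allocator actions mirroring the ones named in the review.
type allocatorAction int

const (
	actionNoop allocatorAction = iota
	actionRangeUnavailable
	actionAdd
)

// processOneChange reports whether the replicate queue should requeue the
// range. Per the review suggestion, Noop and RangeUnavailable return
// (false, nil): no change was made, so there is nothing to follow up on.
func processOneChange(action allocatorAction) (requeue bool, err error) {
	switch action {
	case actionNoop, actionRangeUnavailable:
		return false, nil
	default:
		// A change was made; requeue to see if more work remains.
		return true, nil
	}
}

func main() {
	requeue, _ := processOneChange(actionNoop)
	fmt.Println(requeue)
	requeue, _ = processOneChange(actionAdd)
	fmt.Println(requeue)
}
```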
Closes cockroachdb#37916

Release note: None
TFTR! I started cleaning the allocator reason stuff up some more, but decided it was better left for a separate PR.

bors r=tbg
39157: *: remove the remaining uses of ReplicaDescriptors.Unwrap r=tbg a=danhhz

Closes #37916

Release note: None

Co-authored-by: Daniel Harrison <[email protected]>
Build succeeded
43041: allocator: fix incorrect rebalance signal and target r=darinpp a=darinpp

This change fixes two bugs:

- The first bug is related to the comparison of floating-point numbers when considering whether to rebalance a range. The diversity scores of two identical ranges (even the same range) may differ by a small amount. The fix is to consider candidates equal if their diversity scores have very close values (within 1e-10).
- The second bug was introduced with #39157 and resulted in ignoring the newly added replica when determining what existing replica it should replace in Allocator.simulateRemoveTarget. As a result, when the first bug was triggering and unnecessary rebalancing was happening, the replica to be dropped was chosen at random (in the case where all existing replicas are necessary). For example, say a range is constrained to have a replica in each of the regions A, B, C, and we are trying to add a new replica in C. Evaluating what should be removed from A,B,C may choose any one of them; if the choice is A or B, the constraint will be violated. If we instead ask the same question for A,B,C(old),C(new), the result will be C(old), as this preserves the constraint (one replica in each region).

Fixes #42715
Fixes https://github.com/cockroachlabs/support/issues/348

Release note (bug fix): the allocator now considers stores with very close diversity scores equal (all other things being the same) and doesn't attempt to rebalance.
Release note (bug fix): the allocator considers the new store being added when looking for a target in case of rebalance.

Co-authored-by: Darin <[email protected]>
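The floating-point fix described in the first bullet, treating diversity scores within 1e-10 of each other as equal so rounding noise can't trigger a rebalance, can be illustrated with a small sketch (the 1e-10 threshold comes from the commit message; the function name is made up):

```go
package main

import (
	"fmt"
	"math"
)

// Epsilon under which two diversity scores are considered equal;
// the 1e-10 figure is taken from the commit message above.
const diversityScoreEpsilon = 1e-10

// scoresAlmostEqual treats near-identical diversity scores as equal so
// that floating-point noise between two evaluations of the same range
// doesn't look like a rebalancing opportunity.
func scoresAlmostEqual(a, b float64) bool {
	return math.Abs(a-b) < diversityScoreEpsilon
}

func main() {
	// The same range scored twice can differ by tiny rounding error.
	a := 0.6666666666666666
	b := 0.6666666666666667
	fmt.Println(scoresAlmostEqual(a, b)) // true: no spurious rebalance
	fmt.Println(scoresAlmostEqual(0.5, 0.75))
}
```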
67450: colexec: implement vectorized index join r=DrewKimball a=DrewKimball

This patch provides a vectorized implementation of the index join operator. Span generation is accomplished using two utility operators. `spanEncoder` operates on a single index key column and fills a `Bytes` column with the encoding of that column for each row. `spanAssembler` takes the output of each `spanEncoder` and generates spans, accounting for table/index prefixes and possibly splitting the spans over column families. Finally, the `ColIndexJoin` operator uses the generated spans to perform a lookup on the table's primary index, returns all batches resulting from the lookup, and repeats until the input is fully consumed. The `ColIndexJoin` operator queues up input rows until the memory footprint of the rows reaches a preset limit (default 4MB for parity with the row engine). This allows the cost of starting a scan to be amortized.

Fixes #65905

Release note (sql change): The vectorized execution engine can now perform a scan over an index, and then join on the primary index to retrieve the required columns.

68620: sql: reimplement sqlstats API using iterator r=Azhng a=Azhng

As we move to implement an aggregated virtual table of in-memory sqlstats and persisted sqlstats, the current SQL Stats API makes it difficult to implement such behaviour. This commit introduces lower-level iterator APIs that can be used later for the aggregated virtual table.

Release note: None

68700: kv: remove special handling of learners in replicateQueue.shouldQueue r=nvanbenschoten a=nvanbenschoten

This was unnecessary due to the `action != AllocatorConsiderRebalance` branch immediately below it. It was introduced in #39157 during a refactor (see "I was trying to make it obvious" in comments) that was subsequently reverted.

Co-authored-by: Drew Kimball <[email protected]>
Co-authored-by: Azhng <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
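The input-batching idea in the `ColIndexJoin` description, queue rows until a memory budget is hit and then do one lookup, can be sketched as follows (the 4MB default is from the commit message; the types and row representation here are invented for illustration):

```go
package main

import "fmt"

// inputBatcher queues rows until their estimated memory footprint reaches
// a budget, amortizing the fixed cost of starting each index lookup.
// The real ColIndexJoin defaults this budget to 4MB (4 << 20 bytes);
// everything else here is an illustrative sketch.
type inputBatcher struct {
	budget  int
	pending []string
	size    int
}

// add queues a row and reports whether the batch has reached the budget
// and should be flushed into a single multi-span lookup.
func (b *inputBatcher) add(row string) bool {
	b.pending = append(b.pending, row)
	b.size += len(row)
	return b.size >= b.budget
}

// flush returns the queued rows and resets the batcher.
func (b *inputBatcher) flush() []string {
	rows := b.pending
	b.pending, b.size = nil, 0
	return rows
}

func main() {
	b := &inputBatcher{budget: 10} // tiny budget for demonstration
	for _, row := range []string{"abcd", "efgh", "ijkl"} {
		if b.add(row) {
			fmt.Println(len(b.flush())) // one lookup covers all queued rows
		}
	}
}
```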