kvserver: pro-actively enqueue less preferred leases into lease queue #120441
Conversation
Force-pushed from 02bea05 to 54dd22d.
Only the one question about the excludeReplsInNeedOfSnap change.
Reviewed all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @kvoli)
pkg/cmd/roachtest/tests/lease_preferences.go
line 115 at r1 (raw file):
checkNodes: []int{1, 3, 4, 5},
eventFn: makeStopNodesEventFn(2 /* targets */),
waitForLessPreferred: true,
nit: I think you should just remove this parameter now. I don't think the only remaining test that uses it matters anymore. Also can we decrease the postEventWaitDuration on all these tests now that we expect them to move eagerly? Otherwise the 10m is long enough for the regular queue to process them so it doesn't feel like it is testing the eager enqueueing.
pkg/cmd/roachtest/tests/lease_preferences.go
line 121 at r1 (raw file):
})
r.Add(registry.TestSpec{
	// NB: This test takes down 2(/2) nodes in the most preferred locality. Th
nit: Th -> The
pkg/kv/kvserver/replica_proposal.go
line 570 at r1 (raw file):
"acquired lease is less preferred, enqueuing for transfer [lease=%v preferences=%v]", newLease, r.mu.conf.LeasePreferences) r.store.leaseQueue.AddAsync(ctx, r, allocatorimpl.TransferLeaseForPreferences.Priority()-1)
nit: Can you create two different preference enum values rather than just subtracting 1 (even if you just use the value 299 in the code). I think it will be more clear and also potentially we will separate these more in the future.
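For concreteness, a minimal sketch of this suggestion, assuming the allocator's priorities are plain float64 values; the constant names and the 300 base are illustrative, only the 299 figure comes from the comment above:

```go
// Illustrative only: name the two priorities instead of deriving one from
// the other, so the two cases stay distinct and can diverge later.
const (
	transferLeaseForPreferencesViolatingPriority     float64 = 300
	transferLeaseForPreferencesLessPreferredPriority float64 = 299
)
```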
pkg/kv/kvserver/allocator/plan/lease.go
line 170 at r1 (raw file):
repl, existingVoters, false, /* excludeReplsInNeedOfSnap */
I'm confused by this change. Can you clarify why we want to send this to replicas needing a snapshot? I think that is almost certainly going to fail later on.
pkg/kv/kvserver/lease_queue_test.go
line 306 at r1 (raw file):
const numNodes = 3
const preferredNode = 1
zcfg := zonepb.DefaultZoneConfig()
nit: :( - I was trying to get rid of using the DefaultZoneConfigOverride, but I will just add this to my list when I clean these up. I don't think there is a good alternative today other than setting this through SQL, so this is fine for now. I'll prioritize this cleanup soon.
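As an aside for readers, a rough sketch of the override pattern under discussion, assuming the in-tree zonepb and server testing-knob APIs (exact fields may differ across versions):

```go
// Override the default zone config for the whole test cluster so that
// leases prefer rack 1; DefaultZoneConfigOverride is the knob being
// discussed, and the rack constraint mirrors the test's localities.
zcfg := zonepb.DefaultZoneConfig()
zcfg.LeasePreferences = []zonepb.LeasePreference{{
	Constraints: []zonepb.Constraint{{
		Type: zonepb.Constraint_REQUIRED, Key: "rack", Value: "1",
	}},
}}
knobs := base.TestingKnobs{
	Server: &server.TestingKnobs{DefaultZoneConfigOverride: &zcfg},
}
```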
pkg/kv/kvserver/lease_queue_test.go
line 341 at r1 (raw file):
Knobs: knobs,
Locality: roachpb.Locality{
	Tiers: []roachpb.Tier{{Key: "rack", Value: fmt.Sprintf("%d", i+1)}},
nit: You didn't need to do all this. We create default localities, with each node having a locality of region:test, dc:dc. Then you won't need any ServerArgsPerNode.
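A sketch of the suggested simplification, assuming the standard testcluster wiring (only numNodes and knobs come from the quoted test; the rest is illustrative):

```go
// With the default per-node test localities, ServerArgsPerNode is not
// needed; the knobs are passed once for every server.
tc := testcluster.StartTestCluster(t, numNodes, base.TestClusterArgs{
	ServerArgs: base.TestServerArgs{Knobs: knobs},
})
defer tc.Stopper().Stop(ctx)
```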
Force-pushed from 54dd22d to d1ebc12.
TYFTR!
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @andrewbaptist and @kvoli)
pkg/cmd/roachtest/tests/lease_preferences.go
line 115 at r1 (raw file):
The parameter is still used and it is required -- the entire first region is being removed, so the best option is the less preferred lease.
Previously, andrewbaptist (Andrew Baptist) wrote…
Also can we decrease the postEventWaitDuration on all these tests now that we expect them to move eagerly? Otherwise the 10m is long enough for the regular queue to process them so it doesn't feel like it is testing the eager enqueueing.
Good idea, lowered it to 5 minutes.
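Sketched for concreteness, reusing the field names from the quoted test code; the struct name leasePreferencesSpec is hypothetical and the surrounding test registration is omitted:

```go
// The one case that still needs waitForLessPreferred: the entire most
// preferred locality is stopped, so a less preferred lease is the best
// available outcome.
spec := leasePreferencesSpec{ // hypothetical struct name
	checkNodes:            []int{1, 3, 4, 5},
	eventFn:               makeStopNodesEventFn(2 /* targets */),
	waitForLessPreferred:  true,
	postEventWaitDuration: 5 * time.Minute, // was 10m; see discussion above
}
```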
pkg/cmd/roachtest/tests/lease_preferences.go
line 121 at r1 (raw file):
Previously, andrewbaptist (Andrew Baptist) wrote…
nit: Th -> The
Updated.
pkg/kv/kvserver/replica_proposal.go
line 570 at r1 (raw file):
Previously, andrewbaptist (Andrew Baptist) wrote…
nit: Can you create two different preference enum values rather than just subtracting 1 (even if you just use the value 299 in the code). I think it will be more clear and also potentially we will separate these more in the future.
The issue is that we don't differentiate between less preferred and violating when recommending a lease transfer in the allocator: if there is an improvement in preference rank we can satisfy, we recommend TransferLeaseForPreferences. Adding a new enum would also require enumerating that difference in the allocator. I'll leave it as a TODO.
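In code form, the resolution reads roughly as follows; the identifiers come from the quoted replica_proposal.go snippet, while the comment is a paraphrase of this exchange rather than the actual source:

```go
// Less preferred leases reuse the TransferLeaseForPreferences priority,
// minus one, so they sort just below outright violations in the queue.
// TODO: split this into a dedicated allocator action once the allocator
// distinguishes "less preferred" from "violating".
priority := allocatorimpl.TransferLeaseForPreferences.Priority() - 1
r.store.leaseQueue.AddAsync(ctx, r, priority)
```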
pkg/kv/kvserver/allocator/plan/lease.go
line 170 at r1 (raw file):
Previously, andrewbaptist (Andrew Baptist) wrote…
I'm confused by this change. Can you clarify why we want to send this to replicas needing a snapshot? I think that is almost certainly going to fail later on.
We don't. It's the case described in the comment above:
// If we don't find a suitable target, but we own a lease violating the
// lease preferences, and there is a more suitable target, return an error
// to place the replica in purgatory and retry sooner. This typically
// happens when we've just acquired a violating lease and we eagerly
// enqueue the replica before we've received Raft leadership, which
// prevents us from finding appropriate lease targets since we can't
// determine if any are behind.
This will place the replica into purgatory so the lease transfer may be retried. Not excluding replicas in need of a snapshot is what lets us determine this; there's no change in logic w.r.t. this property. The change is to include all replicas for which a better leaseholder target exists according to preferences (less preferred), as opposed to just violating ones.
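A self-contained paraphrase of that flow; every name in this sketch is an illustrative stand-in, not the real kvserver API:

```go
package main

import (
	"errors"
	"fmt"
)

// Minimal stand-ins for the kvserver state discussed above.
type replica struct {
	leaseViolatesPreferences bool // lease satisfies no preference
	leaseIsLessPreferred     bool // satisfies a preference, but not the first
	betterTargetExists       bool // some replica ranks higher per the preferences
}

var errPurgatory = errors.New("no usable transfer target yet; retry via purgatory")

// planLeaseTransfer paraphrases the flow above: when no target is usable
// right now (e.g. Raft leadership hasn't arrived, so follower progress is
// unknown) but a better-ranked target exists, return an error so the
// replica lands in purgatory and is retried soon. The check covers both
// violating and less preferred leases, which is the change under review.
func planLeaseTransfer(r replica, targetFound bool) error {
	if targetFound {
		return nil // a transfer was planned normally
	}
	if (r.leaseViolatesPreferences || r.leaseIsLessPreferred) && r.betterTargetExists {
		return errPurgatory
	}
	return nil // nothing to do
}

func main() {
	r := replica{leaseIsLessPreferred: true, betterTargetExists: true}
	fmt.Println(planLeaseTransfer(r, false)) // prints the purgatory error
}
```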
pkg/kv/kvserver/lease_queue_test.go
line 306 at r1 (raw file):
Previously, andrewbaptist (Andrew Baptist) wrote…
nit: :( - I was trying to get rid of using the DefaultZoneConfigOverride, but I will just add this to my list when I clean these up. I don't think there is a good alternative today other than setting this through SQL, so this is fine for now. I'll prioritize this cleanup soon.
Sounds good.
pkg/kv/kvserver/lease_queue_test.go
line 341 at r1 (raw file):
Previously, andrewbaptist (Andrew Baptist) wrote…
nit: You didn't need to do all this. We create default localities, with each node having a locality of region:test, dc:dc. Then you won't need any ServerArgsPerNode.
Ah nice, updated to use these instead.
Force-pushed from e1fc5b9 to 9c42da7.
Force-pushed from 9c42da7 to ba13d19.
bors r=andrewbaptist
Leases are checked against lease preferences after application; when a
lease violates the applied preferences, it is enqueued into the lease
queue. Leases may also satisfy some preference but not the first one, in
which case they are considered less preferred.

Less preferred leases previously had to wait for the replica scanner to
enqueue them into the lease queue before the lease would be considered
for transfer to the first preference.

Enqueue less preferred leases after application, similar to leases that
violate the applied preferences.
Resolves: #116081
Release note: None
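To summarize the behavioral change in the commit message as a toy model (none of these names are the real kvserver API):

```go
package main

import "fmt"

// prefStatus models how a just-applied lease ranks against the range's
// lease preferences.
type prefStatus int

const (
	firstPreference prefStatus = iota // lease satisfies the first preference
	lessPreferred                     // satisfies some preference, but not the first
	violating                         // satisfies no preference
)

// shouldEagerlyEnqueue reports whether a just-applied lease goes straight
// into the lease queue. Before this PR only violating leases did; less
// preferred leases waited for the replica scanner.
func shouldEagerlyEnqueue(s prefStatus) bool {
	return s == violating || s == lessPreferred
}

func main() {
	for _, s := range []prefStatus{firstPreference, lessPreferred, violating} {
		fmt.Printf("status=%d eagerlyEnqueued=%t\n", s, shouldEagerlyEnqueue(s))
	}
}
```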