Skip to content

Commit

Permalink
kvserver: don't assign closed timestamps to lease requests
Browse files Browse the repository at this point in the history
Before this patch, RequestLeaseRequests was carrying the lease start
time as a closed timestamp. This matched the behavior we used to have
in 20.2 where the lease start time was considered to be closed by fiat.
On the surface, closing a lease's start time sounds reasonable since, if
the proposing replica gets the lease, it will not evaluate writes at
lower timestamps. Unfortunately, there's a problem: while it's true that
the replica, and the range in general, will not permit writes at
timestamps below this lease's start time, it might happen that the range
is in the process of merging with its left neighbor. If this range has
already been Subsumed as the RHS of a merge then, after merge, the joint
range will allow writes to the former RHS's key space at timestamps
above the RHS's freeze start (which is below the start time the lease in
question). Thus, if the lease were to close its start timestamp while
subsumed, then it'd be possible for follower reads to be served before
the merge finalizes at timestamps that would become un-closed after the
merge.

This patch fixes the bug by having lease requests no carry closed
timestamps any more. Since the hazard involves subsumed ranges, we could
make a distinction between brand new leases for subsumed ranges versus
other brand new leases, and let the former category close the lease
start time. But, for simplicity, we don't close timestamps on any lease
requests.

Release note: None
  • Loading branch information
andreimatei committed May 26, 2021
1 parent 128c1e6 commit 2d9072d
Show file tree
Hide file tree
Showing 4 changed files with 84 additions and 94 deletions.
8 changes: 7 additions & 1 deletion pkg/kv/kvserver/replica_closedts_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -416,7 +416,13 @@ func TestBumpSideTransportClosed(t *testing.T) {

// Test that a lease proposal that gets rejected doesn't erroneously dictate the
// closed timestamp of further requests. If it would, then writes could violate
// that closed timestamp. The tricky scenario tested is the following:
// that closed timestamp.
//
// NOTE: This test was written back when lease requests were closing the lease
// start time. This is no longer true; currently lease requests don't carry a
// closed timestamp. Still, we leave the test as a regression test.
//
// The tricky scenario tested is the following:
//
// 1. A lease held by rep1 is getting close to its expiration.
// 2. Rep1 begins the process of transferring its lease to rep2 with a start
Expand Down
9 changes: 0 additions & 9 deletions pkg/kv/kvserver/replica_follower_read.go
Original file line number Diff line number Diff line change
Expand Up @@ -166,15 +166,6 @@ func (r *Replica) maxClosedRLocked(
lease.Replica.NodeID, r.RangeID, ctpb.Epoch(lease.Epoch), appliedLAI)
maxClosed.Forward(initialMaxClosed)

// If the range has not upgraded to the new closed timestamp system,
// continue using the lease start time as an input to the range's closed
// timestamp. Otherwise, ignore it. We expect to delete this code soon, but
// we keep it around for now to avoid a regression in follower read
// availability in mixed v20.2/v21.1 clusters.
if raftClosed.IsEmpty() {
maxClosed.Forward(lease.Start.ToTimestamp())
}

// Look at the "new" closed timestamp propagation mechanism.
maxClosed.Forward(raftClosed)
maxClosed.Forward(sideTransportClosed)
Expand Down
155 changes: 74 additions & 81 deletions pkg/kv/kvserver/replica_proposal_buf.go
Original file line number Diff line number Diff line change
Expand Up @@ -716,88 +716,81 @@ func (b *propBuf) assignClosedTimestampToProposalLocked(
return nil
}

// For lease requests, we make a distinction between lease extensions and
// brand new leases. Brand new leases carry a closed timestamp equal to the
// lease start time. Lease extensions don't get a closed timestamp. This is
// because they're proposed without a MLAI, and so two lease extensions might
// commute and both apply which would result in a closed timestamp regression.
// The command application side doesn't bother protecting against such
// regressions.
// Lease transfers behave like regular proposals - they get a closed timestamp
// based on closedTSTarget. Note that transfers carry a summary of the
// timestamp cache, so the new leaseholder will be aware of all the reads
// performed by the previous leaseholder.
// Lease requests don't carry closed timestamps. The reason for this differ
// between lease extensions and brand new leases:
// - A lease extension cannot carry a closed timestamp assigned in the same
// way as we do for regular proposal because they're proposed without a MLAI,
// and so two lease extensions might commute and both apply, which would
// result in a closed timestamp regression when the reordered extension
// applies. The command application side doesn't bother protecting against
// such regressions. Besides, the considerations for brand new leases below
// also apply.
// - For a brand new lease, one might think that the lease start time can be
// considered a closed timestamp(*) since, if this replica gets the lease, it
// will not evaluate writes at lower timestamps. Unfortunately, there's a
// problem: while it's true that this replica, and this range in general, will
// not permit writes at timestamps below this lease's start time, it might
// happen that the range is in the process of merging with its left neighbor.
// If this range has already been Subsumed as the RHS of a merge then, after
// merge, the joint range will allow writes to the former RHS's key space at
// timestamps above the RHS's freeze start (which is below the start time of
// this lease). Thus, if this lease were to close its start timestamp while
// subsumed, then it'd be possible for follower reads to be served before the
// merge finalizes at timestamps that would become un-closed after the merge.
// Since this scenario is about subsumed ranges, we could make a distinction
// between brand new leases for subsumed ranges versus other brand new leases,
// and let the former category close the lease start time. But, for
// simplicity, we don't close timestamps on any lease requests.
//
// As opposed to lease requests, lease transfers behave like regular
// proposals: they get a closed timestamp based on closedTSTarget. Note that
// transfers carry a summary of the timestamp cache, so the new leaseholder
// will be aware of all the reads performed by the previous leaseholder.
//
// (*) If we were to close the lease start time, we'd have to forward the
// lease start time to b.assignedClosedTimestamp. We surprisingly might have
// previously closed a timestamp above the lease start time - when we close
// timestamps in the future, then attempt to transfer our lease away (and thus
// proscribe it) but the transfer fails and we're now acquiring a new lease to
// replace the proscribed one. Also, if we ever re-introduce closed
// timestamps carried by lease requests, make sure to resurrect the old
// TestRejectedLeaseDoesntDictateClosedTimestamp and protect against that
// scenario.
if p.Request.IsLeaseRequest() {
// We read the lease from the ReplicatedEvalResult, not from leaseReq, because the
// former is more up to date, having been modified by the evaluation.
newLease := p.command.ReplicatedEvalResult.State.Lease
oldLease := p.leaseStatus.Lease
leaseExtension := newLease.Sequence == oldLease.Sequence
if leaseExtension {
return nil
}
// For brand new leases, we close the lease start time. Since this proposing
// replica is not the leaseholder, the previous target is meaningless.
closedTSTarget = newLease.Start.ToTimestamp()
// We forward closedTSTarget to b.assignedClosedTimestamp. We surprisingly
// might have previously closed a timestamp above the lease start time -
// when we close timestamps in the future, then attempt to transfer our
// lease away (and thus proscribe it) but the transfer fails and we're now
// acquiring a new lease to replace the proscribed one.
//
// TODO(andrei,nvanbenschoten): add a test with scenario:
// - Acquire lease @ 1
// - Close future timestamp @ 3
// - Attempt to transfer lease @ 2
// - Reject
// - Reacquire lease @ 2
closedTSTarget.Forward(b.assignedClosedTimestamp)

// Note that we're not bumping b.assignedClosedTimestamp here (we're not
// calling forwardClosedTimestampLocked). Bumping it to the lease start time
// would (surprisingly?) be illegal: just because we're proposing a lease
// starting at timestamp 100, doesn't mean we're sure to not be in the
// process of evaluating requests writing below 100. This can happen if a
// lease transfer has already applied while we were evaluating this lease
// request, and if we've already started evaluating writes under the
// transferred lease. Such a transfer can give us the lease starting at
// timestamp 50. If such a transfer applied, then our lease request that
// we're proposing now is sure to not apply. But if we were to bump
// b.assignedClosedTimestamp, the damage would be done. See
// TestRejectedLeaseDoesntDictateClosedTimestamp.
} else {
// Sanity check that this command is not violating the closed timestamp. It
// must be writing at a timestamp above assignedClosedTimestamp
// (assignedClosedTimestamp represents the promise that this replica made
// through previous commands to not evaluate requests with lower
// timestamps); in other words, assignedClosedTimestamp was not supposed to
// have been incremented while requests with lower timestamps were
// evaluating (instead, assignedClosedTimestamp was supposed to have bumped
// the write timestamp of any request the began evaluating after it was
// set).
if p.Request.WriteTimestamp().Less(b.assignedClosedTimestamp) && p.Request.IsIntentWrite() {
return errors.AssertionFailedf("attempting to propose command writing below closed timestamp. "+
"wts: %s < assigned closed: %s; ba: %s",
p.Request.WriteTimestamp(), b.assignedClosedTimestamp, p.Request)
}
return nil
}

lb := b.evalTracker.LowerBound(ctx)
if !lb.IsEmpty() {
// If the tracker told us that requests are currently evaluating at
// timestamps >= lb, then we can close up to lb.Prev(). We use FloorPrev()
// to get rid of the logical ticks; we try to not publish closed ts with
// logical ticks when there's no good reason for them.
closedTSTarget.Backward(lb.FloorPrev())
}
// We can't close timestamps above the current lease's expiration.
closedTSTarget.Backward(p.leaseStatus.ClosedTimestampUpperBound())

// We're about to close closedTSTarget. The propBuf needs to remember that
// in order for incoming requests to be bumped above it (through
// TrackEvaluatingRequest).
if !b.forwardClosedTimestampLocked(closedTSTarget) {
closedTSTarget = b.assignedClosedTimestamp
}
// Sanity check that this command is not violating the closed timestamp. It
// must be writing at a timestamp above assignedClosedTimestamp
// (assignedClosedTimestamp represents the promise that this replica made
// through previous commands to not evaluate requests with lower
// timestamps); in other words, assignedClosedTimestamp was not supposed to
// have been incremented while requests with lower timestamps were
// evaluating (instead, assignedClosedTimestamp was supposed to have bumped
// the write timestamp of any request the began evaluating after it was
// set).
if p.Request.WriteTimestamp().Less(b.assignedClosedTimestamp) && p.Request.IsIntentWrite() {
return errors.AssertionFailedf("attempting to propose command writing below closed timestamp. "+
"wts: %s < assigned closed: %s; ba: %s",
p.Request.WriteTimestamp(), b.assignedClosedTimestamp, p.Request)
}

lb := b.evalTracker.LowerBound(ctx)
if !lb.IsEmpty() {
// If the tracker told us that requests are currently evaluating at
// timestamps >= lb, then we can close up to lb.Prev(). We use FloorPrev()
// to get rid of the logical ticks; we try to not publish closed ts with
// logical ticks when there's no good reason for them.
closedTSTarget.Backward(lb.FloorPrev())
}
// We can't close timestamps above the current lease's expiration.
closedTSTarget.Backward(p.leaseStatus.ClosedTimestampUpperBound())

// We're about to close closedTSTarget. The propBuf needs to remember that
// in order for incoming requests to be bumped above it (through
// TrackEvaluatingRequest).
if !b.forwardClosedTimestampLocked(closedTSTarget) {
closedTSTarget = b.assignedClosedTimestamp
}

// Fill in the closed ts in the proposal.
Expand Down Expand Up @@ -862,7 +855,7 @@ func (b *propBuf) FlushLockedWithoutProposing(ctx context.Context) {
}

// OnLeaseChangeLocked is called when a new lease is applied to this range.
// assignedClosedTimestamp is the range's closed timestamp after the new lease was applied. The
// closedTS is the range's closed timestamp after the new lease was applied. The
// closed timestamp tracked by the propBuf is updated accordingly.
func (b *propBuf) OnLeaseChangeLocked(leaseOwned bool, closedTS hlc.Timestamp) {
if leaseOwned {
Expand Down
6 changes: 3 additions & 3 deletions pkg/kv/kvserver/replica_proposal_buf_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -818,7 +818,8 @@ func TestProposalBufferClosedTimestamp(t *testing.T) {
// timestamp to this expiration.
leaseExp: expiredLeaseTimestamp,
rangePolicy: roachpb.LAG_BY_CLUSTER_SETTING,
expClosed: now.ToTimestamp(),
// Lease requests don't carry closed timestamps.
expClosed: hlc.Timestamp{},
// Check that the lease proposal does not bump b.assignedClosedTimestamp.
// The proposer cannot make promises about the write timestamps of further
// requests based on the start time of a proposed lease. See comments in
Expand All @@ -838,8 +839,7 @@ func TestProposalBufferClosedTimestamp(t *testing.T) {
// timestamp to this expiration.
leaseExp: expiredLeaseTimestamp,
rangePolicy: roachpb.LAG_BY_CLUSTER_SETTING,
// Lease extensions don't carry closed timestamps because they don't get
// MLAIs, and so they can be reordered.
// Lease extensions don't carry closed timestamps.
expClosed: hlc.Timestamp{},
expAssignedClosedBumped: false,
},
Expand Down

0 comments on commit 2d9072d

Please sign in to comment.