kvserver: reorder leasePostApply on snapshot application
The snapshot application code was weird: it called
leasePostApplyLocked() *before* doing r.mu.state = snapshot.state. This
is different from the way it's called when a lease applies regularly
through a raft command, where most of the command's effects on the
replica state have already been applied (hence the "Post" in the name).
This patch reorders things on snapshot application such that r.mu.state
is updated before the leasePostApplyLocked call.

Besides being saner, this reordering comes in anticipation of
leasePostApplyLocked using more elements of the replica state and
needing those to be up to date.

Release note: None
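
For illustration, here is a minimal sketch of the reordering in Go. All types (replica, replicaState, lease) are stand-ins, not the actual kvserver code; the point is only that the pre-patch ordering let the post-apply hook observe stale replica state, while the new ordering lets it see the snapshot's state.

package main

import "fmt"

type lease struct{ Sequence int }
type replicaState struct{ Lease *lease }
type replica struct{ state replicaState }

func (r *replica) leasePostApplyLocked(prev, next *lease) {
	// "Post"-apply: by the time this runs, r.state should already reflect
	// the change being reacted to.
	fmt.Printf("hook sees state seq=%d (prev=%d, new=%d)\n",
		r.state.Lease.Sequence, prev.Sequence, next.Sequence)
}

// applySnapshotOld mirrors the pre-patch ordering: the hook runs against
// stale replica state.
func (r *replica) applySnapshotOld(snap replicaState) {
	r.leasePostApplyLocked(r.state.Lease, snap.Lease)
	r.state = snap
}

// applySnapshotNew mirrors the post-patch ordering: state first, hook second.
func (r *replica) applySnapshotNew(snap replicaState) {
	prev := r.state.Lease
	r.state = snap
	r.leasePostApplyLocked(prev, snap.Lease)
}

func main() {
	r := &replica{state: replicaState{Lease: &lease{Sequence: 3}}}
	r.applySnapshotOld(replicaState{Lease: &lease{Sequence: 7}}) // hook sees seq=3 (stale)
	r.state = replicaState{Lease: &lease{Sequence: 3}}           // reset for contrast
	r.applySnapshotNew(replicaState{Lease: &lease{Sequence: 7}}) // hook sees seq=7
}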
andreimatei committed Feb 18, 2021
1 parent 9c39719 commit 4a22463
Showing 4 changed files with 38 additions and 26 deletions.
pkg/kv/kvserver/replica_application_result.go (5 changes: 4 additions & 1 deletion)
@@ -260,7 +260,10 @@ func (r *Replica) handleDescResult(ctx context.Context, desc *roachpb.RangeDescr
 func (r *Replica) handleLeaseResult(ctx context.Context, lease *roachpb.Lease) {
 	r.mu.Lock()
 	defer r.mu.Unlock()
-	r.leasePostApplyLocked(ctx, *lease, assertNoLeaseJump)
+	r.leasePostApplyLocked(ctx,
+		r.mu.state.Lease, /* prevLease */
+		lease, /* newLease */
+		assertNoLeaseJump)
 }
 
 func (r *Replica) handleTruncatedStateResult(
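
As a hedged illustration of the new calling convention in handleLeaseResult above, here is a self-contained sketch with stand-in types (the real method also takes a context and a leaseJumpOption): the previous lease is read from the replica's own state under r.mu and passed explicitly, instead of being fetched inside leasePostApplyLocked.

package main

import (
	"fmt"
	"sync"
)

type lease struct{ Sequence int }

type replica struct {
	mu    sync.Mutex
	state struct{ Lease *lease }
}

// leasePostApplyLocked is a stand-in: it installs the new lease and reports
// the transition it was told about.
func (r *replica) leasePostApplyLocked(prev, next *lease) {
	r.state.Lease = next
	fmt.Printf("lease seq %d -> %d\n", prev.Sequence, next.Sequence)
}

// handleLeaseResult mirrors the new calling convention: the previous lease is
// read from the replica's own state under r.mu and passed explicitly.
func (r *replica) handleLeaseResult(next *lease) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.leasePostApplyLocked(r.state.Lease /* prevLease */, next /* newLease */)
}

func main() {
	r := &replica{}
	r.state.Lease = &lease{Sequence: 1}
	r.handleLeaseResult(&lease{Sequence: 2})
}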
pkg/kv/kvserver/replica_proposal.go (35 changes: 20 additions & 15 deletions)
@@ -308,20 +308,25 @@ const (
 // leasePostApplyLocked updates the Replica's internal state to reflect the
 // application of a new Range lease. The method is idempotent, so it can be
 // called repeatedly for the same lease safely. However, the method will panic
-// if passed a lease with a lower sequence number than the current lease.
-// Depending on jumpOpt, we'll also panic if passed a lease that indicates a
-// forward sequence number jump (i.e. a skipped lease).
+// if newLease has a lower sequence number than the current lease. Depending on
+// jumpOpt, we'll also panic if newLease indicates a forward sequence number
+// jump compared to prevLease (i.e. a skipped lease).
+//
+// prevLease represents the most recent lease this replica was aware of before
+// newLease came along. This is usually (but not necessarily) the latest lease
+// ever applied to the range. However, there's also the case when the replica
+// found out about newLease through a snapshot; in this case the replica might
+// not be aware of other lease changes that happened before the snapshot was
+// generated. This method thus tolerates prevLease being "stale" when
+// allowLeaseJump is passed. prevLease can also be the same as newLease; see below.
+//
+// newLease represents the lease being applied. Can be the same as prevLease.
+// This allows leasePostApplyLocked to be called for some of its side-effects
+// even if the lease in question has otherwise already been applied to the
+// range.
 func (r *Replica) leasePostApplyLocked(
-	ctx context.Context, newLease roachpb.Lease, jumpOpt leaseJumpOption,
+	ctx context.Context, prevLease *roachpb.Lease, newLease *roachpb.Lease, jumpOpt leaseJumpOption,
 ) {
-	// Pull out the last lease known to this Replica. It's possible that this is
-	// not actually the last lease in the Range's lease sequence because the
-	// Replica may have missed the application of a lease between prevLease and
-	// newLease. However, this should only be possible if a snapshot includes a
-	// lease update. All other forms of lease updates should be continuous
-	// without jumps (see permitJump).
-	prevLease := *r.mu.state.Lease
-
 	// Sanity check to make sure that the lease sequence is moving in the right
 	// direction.
 	if s1, s2 := prevLease.Sequence, newLease.Sequence; s1 != 0 {
@@ -335,7 +340,7 @@ func (r *Replica) leasePostApplyLocked(
 			// the same lease. This can happen when callers are using
 			// leasePostApply for some of its side effects, like with
 			// splitPostApply. It can also happen during lease extensions.
-			if !prevLease.Equivalent(newLease) {
+			if !prevLease.Equivalent(*newLease) {
 				log.Fatalf(ctx, "sequence identical for different leases, prevLease=%s, newLease=%s",
 					log.Safe(prevLease), log.Safe(newLease))
 			}
@@ -412,7 +417,7 @@ func (r *Replica) leasePostApplyLocked(
 	// ordering were reversed, it would be possible for requests to see the new
 	// lease but not the updated merge or timestamp cache state, which can result
 	// in serializability violations.
-	r.mu.state.Lease = &newLease
+	r.mu.state.Lease = newLease
 	expirationBasedLease := r.requiresExpiringLeaseRLocked()
 
 	// Gossip the first range whenever its lease is acquired. We check to make
@@ -485,7 +490,7 @@ func (r *Replica) leasePostApplyLocked(
 
 	// Mark the new lease in the replica's lease history.
 	if r.leaseHistory != nil {
-		r.leaseHistory.add(newLease)
+		r.leaseHistory.add(*newLease)
 	}
 }

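
The doc comment and sanity checks above can be condensed into a simplified, self-contained sketch. The names leaseJumpOption, assertNoLeaseJump, and allowLeaseJump mirror the diff, but the logic is simplified: the real code also skips the check when the previous sequence is zero, and roachpb.Lease's Equivalent method is richer than the stand-in below.

package main

import "fmt"

type leaseJumpOption bool

const (
	assertNoLeaseJump leaseJumpOption = false // regular raft application: sequences are continuous
	allowLeaseJump    leaseJumpOption = true  // snapshot application: intermediate leases may be missing
)

type lease struct {
	Sequence int
	Holder   string
}

// equivalent is a stand-in for roachpb.Lease.Equivalent.
func equivalent(a, b lease) bool { return a == b }

func checkLeases(prev, next lease, jumpOpt leaseJumpOption) error {
	switch {
	case next.Sequence < prev.Sequence:
		return fmt.Errorf("lease sequence regression: %d -> %d", prev.Sequence, next.Sequence)
	case next.Sequence > prev.Sequence+1 && jumpOpt == assertNoLeaseJump:
		return fmt.Errorf("unexpected lease sequence jump: %d -> %d", prev.Sequence, next.Sequence)
	case next.Sequence == prev.Sequence && !equivalent(prev, next):
		return fmt.Errorf("sequence identical for different leases: %+v vs %+v", prev, next)
	}
	return nil
}

func main() {
	fmt.Println(checkLeases(lease{3, "n1"}, lease{4, "n2"}, assertNoLeaseJump)) // <nil>
	fmt.Println(checkLeases(lease{3, "n1"}, lease{7, "n2"}, assertNoLeaseJump)) // jump error
	fmt.Println(checkLeases(lease{3, "n1"}, lease{7, "n2"}, allowLeaseJump))    // <nil>: snapshots may skip leases
	fmt.Println(checkLeases(lease{4, "n1"}, lease{4, "n2"}, assertNoLeaseJump)) // same sequence, different lease
}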
pkg/kv/kvserver/replica_raftstorage.go (19 changes: 10 additions & 9 deletions)
@@ -983,15 +983,6 @@ func (r *Replica) applySnapshot(
 	// without risking a lock-ordering deadlock.
 	r.store.mu.Unlock()
 
-	// Invoke the leasePostApply method to ensure we properly initialize the
-	// replica according to whether it holds the lease. We allow jumps in the
-	// lease sequence because there may be multiple lease changes accounted for
-	// in the snapshot.
-	r.leasePostApplyLocked(ctx, *s.Lease, allowLeaseJump)
-
-	// Inform the concurrency manager that this replica just applied a snapshot.
-	r.concMgr.OnReplicaSnapshotApplied()
-
 	// We set the persisted last index to the last applied index. This is
 	// not a correctness issue, but means that we may have just transferred
 	// some entries we're about to re-request from the leader and overwrite.
@@ -1005,6 +996,7 @@ func (r *Replica) applySnapshot(
 	// Update the store stats for the data in the snapshot.
 	r.store.metrics.subtractMVCCStats(ctx, r.mu.tenantID, *r.mu.state.Stats)
 	r.store.metrics.addMVCCStats(ctx, r.mu.tenantID, *s.Stats)
+	lastKnownLease := r.mu.state.Lease
 	// Update the rest of the Raft state. Changes to r.mu.state.Desc must be
 	// managed by r.setDescRaftMuLocked and changes to r.mu.state.Lease must be handled
 	// by r.leasePostApply, but we called those above, so now it's safe to
@@ -1013,6 +1005,15 @@ func (r *Replica) applySnapshot(
 	// Snapshots typically have fewer log entries than the leaseholder. The next
 	// time we hold the lease, recompute the log size before making decisions.
 	r.mu.raftLogSizeTrusted = false
+
+	// Invoke the leasePostApply method to ensure we properly initialize the
+	// replica according to whether it holds the lease. We allow jumps in the
+	// lease sequence because there may be multiple lease changes accounted for
+	// in the snapshot.
+	r.leasePostApplyLocked(ctx, lastKnownLease, s.Lease /* newLease */, allowLeaseJump)
+	// Inform the concurrency manager that this replica just applied a snapshot.
+	r.concMgr.OnReplicaSnapshotApplied()
+
 	r.mu.Unlock()
 
 	// Assert that the in-memory and on-disk states of the Replica are congruent
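
A condensed sketch of the new applySnapshot ordering, again with stand-in types: capture the last known lease, install the snapshot's state wholesale, then run the lease post-apply hook (tolerating sequence jumps) and notify the concurrency manager.

package main

import "fmt"

type lease struct{ Sequence int }
type state struct{ Lease *lease }

type concurrencyManager struct{}

func (concurrencyManager) OnReplicaSnapshotApplied() {
	fmt.Println("concurrency manager notified")
}

type replica struct {
	state   state
	concMgr concurrencyManager
}

func (r *replica) leasePostApplyLocked(prev, next *lease, allowJump bool) {
	if next.Sequence > prev.Sequence+1 && !allowJump {
		panic("unexpected lease sequence jump")
	}
	fmt.Printf("applied lease seq %d (was %d)\n", next.Sequence, prev.Sequence)
}

func (r *replica) applySnapshot(s state) {
	lastKnownLease := r.state.Lease // capture before the state is overwritten
	r.state = s                     // the snapshot's state wins wholesale
	// The hook now observes fully updated replica state; a snapshot may have
	// absorbed several lease changes, so sequence jumps are tolerated here.
	r.leasePostApplyLocked(lastKnownLease, s.Lease, true /* allowJump */)
	r.concMgr.OnReplicaSnapshotApplied()
}

func main() {
	r := &replica{state: state{Lease: &lease{Sequence: 2}}}
	r.applySnapshot(state{Lease: &lease{Sequence: 9}})
}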
pkg/kv/kvserver/store_split.go (5 changes: 4 additions & 1 deletion)
@@ -259,7 +259,10 @@ func prepareRightReplicaForSplit(
 	// Invoke the leasePostApplyLocked method to ensure we properly initialize
 	// the replica according to whether it holds the lease. This enables the
 	// txnWaitQueue.
-	rightRepl.leasePostApplyLocked(ctx, *rightRepl.mu.state.Lease, assertNoLeaseJump)
+	rightRepl.leasePostApplyLocked(ctx,
+		rightRepl.mu.state.Lease, /* prevLease */
+		rightRepl.mu.state.Lease, /* newLease - same as prevLease */
+		assertNoLeaseJump)
 
 	// We need to explicitly wake up the Raft group on the right-hand range or
 	// else the range could be underreplicated for an indefinite period of time.
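
Finally, a hedged sketch of the split path, where prevLease and newLease are deliberately the same lease: the call is made only for its side effects (the initialized flag below is a stand-in for enabling the txnWaitQueue), relying on leasePostApplyLocked being idempotent.

package main

import "fmt"

type lease struct{ Sequence int }

type replica struct {
	state       struct{ Lease *lease }
	initialized bool
}

// leasePostApplyLocked is a stand-in that is safe to call with prev == next:
// the lease state is unchanged, but lease-dependent side effects still run.
func (r *replica) leasePostApplyLocked(prev, next *lease) {
	if next.Sequence < prev.Sequence {
		panic("lease sequence regression")
	}
	r.state.Lease = next
	r.initialized = true // stand-in for enabling the txnWaitQueue
}

// prepareRightReplicaForSplit mirrors the split path: the same lease is
// passed as both prevLease and newLease, purely for the side effects.
func prepareRightReplicaForSplit(rightRepl *replica) {
	l := rightRepl.state.Lease
	rightRepl.leasePostApplyLocked(l /* prevLease */, l /* newLease */)
}

func main() {
	r := &replica{}
	r.state.Lease = &lease{Sequence: 5}
	prepareRightReplicaForSplit(r)
	fmt.Println("initialized:", r.initialized)
}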
