kvserver: don't allow raft forwarding of lease requests
This patch aims to improve the behavior in scenarios where a follower
replica is behind, unaware of the latest lease, and it tries to acquire
a lease in its ignorant state. That lease acquisition request is bound
to fail (because the lease that it's based on is stale), but while it
fails (or, rather, until the behind replica finds out that it failed)
local requests are blocked. This blocking can last for a very long time
in situations where a snapshot is needed to catch up the follower, and
the snapshot is queued up behind many other snapshots (e.g. after a node
has been down for a while and gets restarted).

This patch tries an opinionated solution: never allow followers to
acquire leases. If there is a leader, it's a better idea for the leader
to acquire the lease. The leader might have a lease anyway or, even if
it doesn't, having the leader acquire it saves a leadership transfer
(leadership follows the lease).
We change the proposal path to recognize lease requests and reject them
early if the current replica is a follower and the leader is known. The
rejection points to the leader, which causes the request that triggered
the lease acquisition to make its way to the leader and attempt to
acquire a lease over there.
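
The rejection logic described above can be sketched as follows. This is a simplified illustration, not CockroachDB's actual code: the types, field names, and the function `maybeRejectLeaseRequest` are all hypothetical stand-ins for the real proposal-path check.

```go
package main

import (
	"errors"
	"fmt"
)

// replica is a hypothetical, stripped-down stand-in for the state the
// proposal path consults; real CockroachDB types look very different.
type replica struct {
	replicaID uint64
	leaderID  uint64 // 0 when the Raft leader is unknown
}

// errRejected mimics the early rejection that redirects the request
// that triggered the lease acquisition toward the known leader.
var errRejected = errors.New("lease request rejected; retry at the leader")

// maybeRejectLeaseRequest sketches the check: a follower that knows the
// current leader refuses to propose a lease request and points the
// caller at the leader, instead of proposing a doomed acquisition.
func (r *replica) maybeRejectLeaseRequest(isLeaseRequest bool) (redirectTo uint64, err error) {
	iAmTheLeader := r.leaderID == r.replicaID
	leaderKnown := r.leaderID != 0
	if isLeaseRequest && !iAmTheLeader && leaderKnown {
		return r.leaderID, errRejected
	}
	// The leader itself, or a replica that doesn't know of any leader,
	// is allowed to go ahead and propose.
	return 0, nil
}

func main() {
	follower := &replica{replicaID: 2, leaderID: 1}
	if to, err := follower.maybeRejectLeaseRequest(true); err != nil {
		fmt.Printf("redirecting lease acquisition to replica %d\n", to)
	}
}
```

Note that when no leader is known, the sketch still lets the proposal through; rejecting in that case would leave the range with no way to acquire a lease at all.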

Fixes #37906

As described in #37906, the badness caused by requests blocking behind a
doomed lease acquisition request could be reproduced with a 100-warehouse tpcc
workload (--no-wait) on a 3 node cluster. Turning off a node for 10
minutes and then turning it back on would result in the cluster being
wedged for a few minutes until all the snapshots are transferred. I've
verified that this patch fixes it.

Release note (bug fix): A bug causing queries sent to a
freshly-restarted node to sometimes hang for a long time while the node
catches up with replication has been fixed.
andreimatei committed Nov 13, 2020
1 parent 5e1a8c3 commit 8aa1c14
Showing 11 changed files with 559 additions and 52 deletions.
13 changes: 11 additions & 2 deletions pkg/kv/kvserver/client_raft_helpers_test.go
@@ -12,6 +12,7 @@ package kvserver_test

 import (
 	"context"
+	"fmt"

 	"github.com/cockroachdb/cockroach/pkg/kv/kvserver"
 	"github.com/cockroachdb/cockroach/pkg/roachpb"
@@ -22,7 +23,9 @@ import (
 )

 type unreliableRaftHandlerFuncs struct {
-	// If non-nil, can return false to avoid dropping a msg to rangeID.
+	// If non-nil, can return false to avoid dropping the msg to
+	// unreliableRaftHandler.rangeID. If nil, all messages pertaining to the
+	// respective range are dropped.
 	dropReq  func(*kvserver.RaftMessageRequest) bool
 	dropHB   func(*kvserver.RaftHeartbeat) bool
 	dropResp func(*kvserver.RaftMessageResponse) bool
@@ -47,6 +50,7 @@ func noopRaftHandlerFuncs() unreliableRaftHandlerFuncs {
 // unreliableRaftHandler drops all Raft messages that are addressed to the
 // specified rangeID, but lets all other messages through.
 type unreliableRaftHandler struct {
+	name    string
 	rangeID roachpb.RangeID
 	kvserver.RaftMessageHandler
 	unreliableRaftHandlerFuncs
@@ -68,9 +72,14 @@ func (h *unreliableRaftHandler) HandleRaftRequest(
 		}
 	} else if req.RangeID == h.rangeID {
 		if h.dropReq == nil || h.dropReq(req) {
+			var prefix string
+			if h.name != "" {
+				prefix = fmt.Sprintf("[%s] ", h.name)
+			}
 			log.Infof(
 				ctx,
-				"dropping r%d Raft message %s",
+				"%sdropping r%d Raft message %s",
+				prefix,
 				req.RangeID,
 				raft.DescribeMessage(req.Message, func([]byte) string {
 					return "<omitted>"
