Skip to content

Commit

Permalink
Merge #55148 #56392
Browse files Browse the repository at this point in the history
55148: kvserver: don't allow raft forwarding of lease requests r=andreimatei a=andreimatei

This patch aims to improve the behavior in scenarios where a follower
replica is behind, unaware of the latest lease, and it tries to acquire
a lease in its ignorant state. That lease acquisition request is bound
to fail (because the lease that it's based on is stale), but while it
fails (or, rather, until the behind replica finds out that it failed)
local requests are blocked. This blocking can last for a very long time
in situations where a snapshot is needed to catch up the follower, and
the snapshot is queued up behind many other snapshots (e.g. after a node
has been down for a while and gets restarted).

This patch tries an opinionated solution: never allow followers to
acquire leases. If there is a leader, it's a better idea for the leader
to acquire the lease. The leader might have a lease anyway or, even if
it doesn't, having the leader acquire it saves a leadership transfer
(leadership follows the lease).
We change the proposal path to recognize lease requests and reject them
early if the current replica is a follower and the leader is known. The
rejection points to the leader, which causes the request that triggered
the lease acquisition to make its way to the leader and attempt to
acquire a lease over there.

The next commit takes this further by short-circuiting the lease
proposal even sooner - but that patch is more best-effort.

Fixes #37906

Release note (bug fix): A bug causing queries sent to a
freshly-restarted node to sometimes hang for a long time while the node
catches up with replication has been fixed.

56392: kvserver/observedts: extract package to house logic and docs for observed timestamps r=nvanbenschoten a=nvanbenschoten

This commit introduces a new package that contains logic and documentation related to the observed timestamp system, which allows transactions to track causality between themselves and other, possibly-concurrent, transactions in order to avoid uncertainty related restarts.

This ties into our efforts to better document the kv layer.

It also ties into synthetic timestamps (#56373), which are important for non-blocking transactions (#52745) and for fixing the interaction between observed timestamps and transaction refreshing (#36431). The proposed fix is to split out a new EffectiveMaxTimestamp field from the MaxTimestamp field on a transaction and set the EffectiveMaxTimestamp in LimitTxnMaxTimestamp. The EffectiveMaxTimestamp field will dictate the uncertainty interval for values with normal timestamps while the original MaxTimestamp will dictate the uncertainty interval for values with synthetic timestamps. In practice, this will mean that observed timestamps do not apply to values with synthetic timestamps. This commit gives us a package to house this new complexity.

Co-authored-by: Andrei Matei <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
  • Loading branch information
3 people committed Nov 13, 2020
3 parents 707c608 + 8aa1c14 + 58e4c7b commit 0a958fd
Show file tree
Hide file tree
Showing 20 changed files with 884 additions and 360 deletions.
1 change: 1 addition & 0 deletions pkg/kv/kvserver/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -122,6 +122,7 @@ go_library(
"//pkg/kv/kvserver/kvserverpb",
"//pkg/kv/kvserver/liveness",
"//pkg/kv/kvserver/liveness/livenesspb",
"//pkg/kv/kvserver/observedts",
"//pkg/kv/kvserver/protectedts",
"//pkg/kv/kvserver/protectedts/ptpb",
"//pkg/kv/kvserver/raftentry",
Expand Down
13 changes: 11 additions & 2 deletions pkg/kv/kvserver/client_raft_helpers_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ package kvserver_test

import (
"context"
"fmt"

"github.com/cockroachdb/cockroach/pkg/kv/kvserver"
"github.com/cockroachdb/cockroach/pkg/roachpb"
Expand All @@ -22,7 +23,9 @@ import (
)

type unreliableRaftHandlerFuncs struct {
// If non-nil, can return false to avoid dropping a msg to rangeID.
// If non-nil, can return false to avoid dropping the msg to
// unreliableRaftHandler.rangeID. If nil, all messages pertaining to the
// respective range are dropped.
dropReq func(*kvserver.RaftMessageRequest) bool
dropHB func(*kvserver.RaftHeartbeat) bool
dropResp func(*kvserver.RaftMessageResponse) bool
Expand All @@ -47,6 +50,7 @@ func noopRaftHandlerFuncs() unreliableRaftHandlerFuncs {
// unreliableRaftHandler drops all Raft messages that are addressed to the
// specified rangeID, but lets all other messages through.
type unreliableRaftHandler struct {
name string
rangeID roachpb.RangeID
kvserver.RaftMessageHandler
unreliableRaftHandlerFuncs
Expand All @@ -68,9 +72,14 @@ func (h *unreliableRaftHandler) HandleRaftRequest(
}
} else if req.RangeID == h.rangeID {
if h.dropReq == nil || h.dropReq(req) {
var prefix string
if h.name != "" {
prefix = fmt.Sprintf("[%s] ", h.name)
}
log.Infof(
ctx,
"dropping r%d Raft message %s",
"%sdropping r%d Raft message %s",
prefix,
req.RangeID,
raft.DescribeMessage(req.Message, func([]byte) string {
return "<omitted>"
Expand Down
Loading

0 comments on commit 0a958fd

Please sign in to comment.