Skip to content

Commit

Permalink
kvserver: reintroduce RangeDesc.GenerationComparable
Browse files Browse the repository at this point in the history
We dropped this field recently, but unfortunately that wasn't safe for
mixed-version clusters. The rub is that 20.1 nodes need to roundtrip the
proto through 20.2 nodes in a fairly subtle way. When it comes back to
the 20.1 node, the descriptor needs to compare Equal() to the original.
We configure our protos to not preserve unrecognized fields, so removing
the field breaks this round-tripping.

Specifically, the scenario which broke is the following:
1. A 20.1 node processes an AdminSplit, and performs the tranction
   writing the new descriptors. The descriptors have the
   GenerationCompable field set.
2. The lease changes while the respective txn is running. The lease
   moves to a 20.2 node.
3. The 20.2 node evaluates the EndTxn, and computes the split trigger
   info that's going to be replicated. The EndTxn has split info in it
   containing the field set, but the field is dropped when converting
   that into the proposed SplitTrigger (since the 20.2 unmarshalls and
   re-marshalls the descriptors).
4. All the 20.1 replicas of the ranges involved now apply the respective
   trigger via Raft, and their in-memory state doesn't have the field
   set. This doesn't match the bytes written in the database, which have
   the field.
5. The discrepancy between the in-memory state and the db state is a
   problem, as it causes the 20.1 node to spin if it tries to perform
   subsequent merge/split operations. The reason is that the code
   performing these operations short-circuits itself if it detects that
   the descriptor has changed while the operation was running. This
   detection is performed via the generated Equals() method, and it
   mis-fires because of the phantom field. That detection happens here:
   https://github.com/cockroachdb/cockroach/blob/79c01d28da9c379f67bb41beef3d85ad3bee1da1/pkg/kv/kvserver/replica_command.go#L1957

This patch takes precautions so that we can remove the field again in
21.1 - I'm merging this in 21.1, I'll backport it to 20.2, and then I'll
come back to 20.2 and remove the field.

Fixes cockroachdb#53535

Release note: None
  • Loading branch information
andreimatei committed Sep 10, 2020
1 parent 09fe112 commit 2240e9f
Show file tree
Hide file tree
Showing 5 changed files with 279 additions and 128 deletions.
47 changes: 38 additions & 9 deletions c-deps/libroach/protos/roachpb/metadata.pb.cc

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

34 changes: 34 additions & 0 deletions c-deps/libroach/protos/roachpb/metadata.pb.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

16 changes: 16 additions & 0 deletions pkg/kv/kvserver/replica_command.go
Original file line number Diff line number Diff line change
Expand Up @@ -1976,6 +1976,16 @@ func replicaSetsEqual(a, b []roachpb.ReplicaDescriptor) bool {
}

func checkDescsEqual(desc *roachpb.RangeDescriptor) func(*roachpb.RangeDescriptor) bool {
// Make sure the DeprecatedGenerationComparable field is ignored. 21.1 nodes
// will not recognize this field any more (and so they'll drop it), and we
// don't want 20.2 nodes to be tripped by that - we want to be able to
// round-trip a desc from 20.1 through a 20.2 node and still have it compare
// equal back on 20.1.
if desc.DeprecatedGenerationComparable != nil {
desc = protoutil.Clone(desc).(*roachpb.RangeDescriptor)
desc.DeprecatedGenerationComparable = nil
}

// TODO(jeffreyxiao): This hacky fix ensures that we don't fail the
// conditional get because of the ordering of InternalReplicas. Calling
// Replicas() will sort the list of InternalReplicas as a side-effect. The
Expand All @@ -1994,6 +2004,12 @@ func checkDescsEqual(desc *roachpb.RangeDescriptor) func(*roachpb.RangeDescripto
desc2.Replicas() // for sorting side-effect
}

// Ignore a field; see above.
if desc2.DeprecatedGenerationComparable != nil {
desc2 = protoutil.Clone(desc2).(*roachpb.RangeDescriptor)
desc2.DeprecatedGenerationComparable = nil
}

return desc.Equal(desc2)
}
}
Expand Down
Loading

0 comments on commit 2240e9f

Please sign in to comment.