release-20.2: kvserver: reintroduce RangeDesc.GenerationComparable #54469
Backport 1/1 commits from #54199.
/cc @cockroachdb/release
We dropped this field recently, but unfortunately that wasn't safe for
mixed-version clusters. The rub is that 20.1 nodes need to roundtrip the
proto through 20.2 nodes in a fairly subtle way. When it comes back to
the 20.1 node, the descriptor needs to compare Equal() to the original.
We configure our protos to not preserve unrecognized fields, so removing
the field breaks this round-tripping.
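The round-trip problem can be sketched in a few lines of Go. This is only an analogy under stated assumptions: it uses `encoding/json` (which, like our proto configuration, silently discards fields the target type does not know) instead of protobuf, and the types `descV201` and `descV202` and the helper `roundTrip` are hypothetical stand-ins, not the real `RangeDescriptor`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// descV201 is a hypothetical stand-in for the descriptor as a 20.1 node sees it.
type descV201 struct {
	RangeID              int64 `json:"range_id"`
	Generation           int64 `json:"generation"`
	GenerationComparable bool  `json:"generation_comparable,omitempty"`
}

// descV202 is a hypothetical stand-in for the descriptor after the field was dropped.
type descV202 struct {
	RangeID    int64 `json:"range_id"`
	Generation int64 `json:"generation"`
}

// roundTrip encodes a 20.1 descriptor, decodes and re-encodes it with the
// 20.2 type, and decodes the result with the 20.1 type again.
func roundTrip(orig descV201) (descV201, error) {
	wire, err := json.Marshal(orig)
	if err != nil {
		return descV201{}, err
	}
	// The 20.2 type has no GenerationComparable field, so it is dropped here.
	var mid descV202
	if err := json.Unmarshal(wire, &mid); err != nil {
		return descV201{}, err
	}
	wire, err = json.Marshal(mid)
	if err != nil {
		return descV201{}, err
	}
	var back descV201
	if err := json.Unmarshal(wire, &back); err != nil {
		return descV201{}, err
	}
	return back, nil
}

func main() {
	orig := descV201{RangeID: 1, Generation: 5, GenerationComparable: true}
	back, err := roundTrip(orig)
	if err != nil {
		panic(err)
	}
	fmt.Println(orig == back) // prints "false": the field was lost in transit
}
```

Keeping the field in the newer type is what makes the comparison succeed again, which is the approach this patch takes.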
Specifically, the scenario which broke is the following:
writing the new descriptors. The descriptors have the
GenerationComparable field set.
moves to a 20.2 node.
info that's going to be replicated. The EndTxn has split info in it
containing the field set, but the field is dropped when converting
that into the proposed SplitTrigger (since the 20.2 node unmarshals and
re-marshals the descriptors).
trigger via Raft, and their in-memory state doesn't have the field
set. This doesn't match the bytes written in the database, which have
the field.
problem, as it causes the 20.1 node to spin if it tries to perform
subsequent merge/split operations. The reason is that the code
performing these operations short-circuits itself if it detects that
the descriptor has changed while the operation was running. This
detection is performed via the generated Equal() method, and it
mis-fires because of the phantom field. That detection happens here:
cockroach/pkg/kv/kvserver/replica_command.go
Line 1957 in 79c01d2
This patch takes precautions so that we can remove the field again in
21.1 - I'm merging this in 21.1, I'll backport it to 20.2, and then I'll
come back to 21.1 and remove the field.
Fixes #53535
Release note: None