release-21.1: kvcoord: prevent concurrent EndTxn requests #65863
Backport 1/1 commits from #65592.
This should bake on master for a while, in case there are unexpected async requests being sent.
/cc @cockroachdb/release @cockroachdb/kv
`TxnCoordSender` generally operates synchronously (i.e. the client waits for the previous response before sending the next request). However, the `txnHeartbeater` sends asynchronous `EndTxn(commit=false)` rollbacks when it discovers an aborted transaction record. Unfortunately, some code assumes synchrony, which caused race conditions with txn rollbacks.
In particular, the `txnPipeliner` attaches lock spans and in-flight writes to the `EndTxn` request for e.g. intent cleanup, but it only records this information when it receives responses. Thus, if an `EndTxn(commit=false)` is sent concurrently with a write request, the lock spans and in-flight writes of that write request will not get attached to the `EndTxn` request and the intents will not get cleaned up.
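To illustrate the race, here is a toy sketch rather than the actual kvcoord code. The `pipeliner` type and its methods are hypothetical stand-ins; the only assumption carried over from the description is that lock spans are recorded on response, not on send.

```go
package main

import "fmt"

// pipeliner is a hypothetical, heavily simplified stand-in for the
// txnPipeliner: it learns about a write's lock span only when the
// write's response arrives.
type pipeliner struct {
	lockSpans []string
}

// onWriteResponse records the lock span of a completed write.
func (p *pipeliner) onWriteResponse(span string) {
	p.lockSpans = append(p.lockSpans, span)
}

// buildEndTxn returns a copy of the lock spans that would be attached
// to an EndTxn request built at this moment.
func (p *pipeliner) buildEndTxn() []string {
	return append([]string(nil), p.lockSpans...)
}

func main() {
	p := &pipeliner{}
	p.onWriteResponse("a") // the write to "a" has completed

	// A write to "b" has been sent, but its response is still pending
	// when the asynchronous rollback is built and sent.
	rollback := p.buildEndTxn()

	p.onWriteResponse("b") // too late: the rollback is already on its way

	fmt.Println(rollback) // prints [a]
}
```

The rollback carries only the span for "a", so under this model any intent laid down at "b" would be left behind.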
This patch makes the `txnHeartbeater` wait for any in-flight requests to complete before sending asynchronous rollbacks, and collapses incoming client rollbacks with in-flight async rollbacks.
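The two behaviours described above could be sketched roughly as follows. This is a simplified illustration under assumed names, not the code in this patch: the `heartbeater` type, its fields, and all of its methods are hypothetical, and the actual sending of `EndTxn(commit=false)` is replaced by a counter.

```go
package main

import (
	"fmt"
	"sync"
)

// heartbeater is a hypothetical simplification, not the real txnHeartbeater.
type heartbeater struct {
	mu            sync.Mutex
	inFlight      int           // client requests currently in flight
	wantRollback  bool          // abort seen while requests were in flight
	rollbackDone  chan struct{} // non-nil once a rollback has been started
	rollbacksSent int           // stands in for EndTxn(commit=false) sends
}

// startRollbackLocked starts the async rollback. h.mu must be held.
func (h *heartbeater) startRollbackLocked() {
	done := make(chan struct{})
	h.rollbackDone = done
	h.rollbacksSent++
	go func() {
		// A real implementation would send EndTxn(commit=false) here.
		close(done)
	}()
}

func (h *heartbeater) requestStart() {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.inFlight++
}

// requestDone fires a deferred rollback once the last request completes.
func (h *heartbeater) requestDone() {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.inFlight--
	if h.inFlight == 0 && h.wantRollback && h.rollbackDone == nil {
		h.startRollbackLocked()
	}
}

// onAbortDetected defers the rollback while any request is in flight.
func (h *heartbeater) onAbortDetected() {
	h.mu.Lock()
	defer h.mu.Unlock()
	if h.rollbackDone != nil {
		return
	}
	if h.inFlight > 0 {
		h.wantRollback = true
		return
	}
	h.startRollbackLocked()
}

// clientRollback collapses into an in-flight async rollback if one exists.
func (h *heartbeater) clientRollback() {
	h.mu.Lock()
	if h.rollbackDone == nil {
		h.startRollbackLocked()
	}
	done := h.rollbackDone
	h.mu.Unlock()
	<-done // wait for the single shared rollback to finish
}

func (h *heartbeater) sent() int {
	h.mu.Lock()
	defer h.mu.Unlock()
	return h.rollbacksSent
}

func main() {
	h := &heartbeater{}
	h.requestStart()      // a write is in flight
	h.onAbortDetected()   // abort seen: rollback is deferred, not sent
	fmt.Println(h.sent()) // prints 0
	h.requestDone()       // write response arrives: rollback starts now
	h.clientRollback()    // client rollback waits on the async one
	fmt.Println(h.sent()) // prints 1
}
```

In this sketch the rollback is only issued once no request is in flight, and a client rollback that arrives afterwards waits on the same rollback instead of issuing a second one.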
Resolves #65458.
Resolves #65587.
Resolves #65447.
Release note (bug fix): Fixed a race condition where transaction cleanup
would fail to take into account ongoing writes and clean up their
intents.