rangefeed: Pushtxn in rangefeed returned abort, but txn may have been committed #104309

pand5a · 2023-06-05T01:53:01Z

I don't know if it's appropriate to send it here, but I'm really troubled.

The RangeFeed relies on the return result of PushTxns (task. go: pushOldTxns) ABORTED to remove the tracked txn in UnresolvedIntentQueue, which depends on the correctness of PushTxn.

cockroach/pkg/kv/kvserver/rangefeed/task.go

Lines 324 to 336 in 9c2f650

    
           case roachpb.ABORTED: 
        
           	// The transaction is aborted, so it doesn't need to be tracked 
        
           	// anymore nor does it need to prevent the resolved timestamp from 
        
           	// advancing. Inform the Processor that it can remove the txn from 
        
           	// its unresolvedIntentQueue. 
        
           	// 
        
           	// NOTE: the unresolvedIntentQueue will ignore MVCCAbortTxn operations 
        
           	// before it has been initialized. This is not a concern here though 
        
           	// because we never launch txnPushAttempt tasks before the queue has 
        
           	// been initialized. 
        
           	ops[i].SetValue(&enginepb.MVCCAbortTxnOp{ 
        
           		TxnID: txn.ID, 
        
           	})

But in cmd_push_txn.go The “case txnID” case of [PushTxn ->SynthesizeTxnFromMeta ->CanCreateTxnRecord(replica_tscache.go)] may return a transaction that has already been COMMITTED but has status=Abort, which may cause rts in the RangeFeed to advance incorrectly. Do I understand correctly?

cockroach/pkg/kv/kvserver/replica_tscache.go

Lines 529 to 543 in 9c2f650

    
           if txnMinTS.LessEq(tombstoneTimestamp) { 
        
           	switch tombstoneTxnID { 
        
           	case txnID: 
        
           		// If we find our own transaction ID then a transaction record has already 
        
           		// been written. We might be a replay (e.g. a DistSender retry), or we 
        
           		// raced with an asynchronous abort. Either way, return an error. 
        
           		// 
        
           		// TODO(andrei): We could keep a bit more info in the tscache to return a 
        
           		// different error for COMMITTED transactions. If the EndTxn(commit) was 
        
           		// the only request in the batch, this this would be sufficient for the 
        
           		// client to swallow the error and declare the transaction as committed. 
        
           		// If there were other requests in the EndTxn batch, then the client would 
        
           		// still have trouble reconstructing the result, but at least it could 
        
           		// provide a non-ambiguous error to the application. 
        
           		return false, kvpb.ABORT_REASON_RECORD_ALREADY_WRITTEN_POSSIBLE_REPLAY

Jira issue: CRDB-28452

Epic CRDB-27235

blathers-crl · 2023-06-05T01:53:05Z

Hello, I am Blathers. I am here to help you get the issue triaged.

It looks like you have not filled out the issue in the format of any of our templates. To best assist you, we advise you to use one of these templates.

I was unable to automatically find someone to ping.

If we have not gotten back to your issue within a few business days, you can try the following:

Join our community slack channel and ask on #cockroachdb.
Try find someone from here if you know they worked closely on the area and CC them.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.}

blathers-crl · 2023-06-05T14:41:38Z

cc @cockroachdb/cdc

blathers-crl · 2023-06-05T22:20:17Z

cc @cockroachdb/replication

aliher1911 · 2023-06-09T11:56:07Z

I'm looking into this. Seems like the question would be can we have pending intent in the past when we see a transaction tombstone for committed transaction that we interpreted as abort.

pand5a · 2023-06-12T01:46:22Z

Is it possible that this is the case:

It is possible that the rangefeed is built on a non majority replica, while the pushtxn in the rangefeed is sent at a fixed frequency, that is:

Leasholder and Majority commited it at ts1.
The rangefeed is built on a minority replica, and a pushtxn is issued on ts2 and a txnabort is returned. However, as the replica is asynchronous, so the commit is received on ts3 on this replica

aliher1911 · 2023-06-12T09:03:36Z

It looks like we just need a range feed running on a follower and that follower should be lagging behind on raft application. In that case it could try to push already committed transactions because it tries to compare txn timestamps from the intents it observes with wall clock (now() - threshold). Then it will drop the transaction from the list and advance its resolved timestamp.

We have a protection in rangefeed

cockroach/pkg/kv/kvserver/rangefeed/resolved_timestamp.go

Lines 265 to 273 in 9c2f650

    
           // assertOpAboveTimestamp asserts that this operation is at a larger timestamp 
        
           // than the current resolved timestamp. A violation of this assertion would 
        
           // indicate a failure of the closed timestamp mechanism. 
        
           func (rts *resolvedTimestamp) assertOpAboveRTS(op enginepb.MVCCLogicalOp, opTS hlc.Timestamp) { 
        
           	if opTS.LessEq(rts.resolvedTS) { 
        
           		panic(fmt.Sprintf("resolved timestamp %s equal to or above timestamp of operation %v", 
        
           			rts.resolvedTS, op)) 
        
           	} 
        
           }

which would cause node to panic if timestamp regression is detected to prevent any potential data inconsistency/loss on the client side.
And I can't find any evidence of panics generated by regressions like that which means it happens sufficiently infrequently.

To address it we may need to distinguish between abort types when resolving intents and in this particular case keep transaction and wait for intents to get resolved.

pand5a · 2023-06-13T02:30:39Z

"We have a protection in rangefeed"
-->

Yes, but due to MVCCAbortTxnOp, txn was removed from the list, causing rts to advance, and due to 'enginepb'. MVCCCommitteIntentOp "is not protected by assertOpAboveRTS, so this data has been discarded in the following code.

cockroach/pkg/ccl/changefeedccl/event_processing.go

Lines 401 to 411 in 9c2f650

    
           // Ensure that r updates are strictly newer than the least resolved timestamp 
        
           // being tracked by the local span frontier. The poller should not be forwarding 
        
           // r updates that have timestamps less than or equal to any resolved timestamp 
        
           // it's forwarded before. 
        
           // TODO(dan): This should be an assertion once we're confident this can never 
        
           // happen under any circumstance. 
        
           if schemaTS.LessEq(c.frontier.Frontier()) && !schemaTS.Equal(c.cursor) { 
        
           	log.Errorf(ctx, "cdc ux violation: detected timestamp %s that is less than "+ 
        
           		"or equal to the local frontier %s.", schemaTS, c.frontier.Frontier()) 
        
           	return nil 
        
           }

"To address it we may need to distinguish between abort types when resolving intents and in this particular case keep transaction and wait for intents to get resolved."
-->
Yes. But can it be done in ts_cache adding the state of txn？Is this feasible?

aliher1911 · 2023-06-15T20:13:12Z

I see what you mean. Didn't realize we skip intents when checking timestamp assertion. Even if we did, intents could arrive after checkpoint is out and even if we panic, client will not reread that data because of published checkpoint timestamp.

As for pts_cache, I don't think we need any extra info there. The problem seem to be that we know this special cause where transaction tombstone is present which should be a enough rare case, but we don't have a mechanism to surface this error to range feed processor easily without changing protocol. pts_cache can reside on any other node where transaction record was created.

Did you see this error actually happening?

pand5a · 2023-07-19T07:01:45Z

by code review, I want to use one slow node and two fast nodes, which may trigger this issue. Because from a code perspective, this issue should exist. I will try it out some time.

blathers-crl · 2023-07-19T09:25:51Z

cc @cockroachdb/replication

117968: kvserver: add `Replica.WaitForLeaseAppliedIndex()` r=erikgrinaker a=erikgrinaker Extracted from #117612. --- This allows a caller to wait for a replica to reach a certain lease applied index. Similar functionality elsewhere is not migrated yet, out of caution. Touches #104309. Epic: none Release note: None Co-authored-by: Erik Grinaker <[email protected]>

117969: batcheval: add `PushTxnResponse.AmbiguousAbort` r=erikgrinaker a=erikgrinaker Extracted from #117612. --- This indicates to the caller that the `ABORTED` status of the pushed transaction is ambiguous, and the transaction may in fact have been committed and GCed already. This information is also plumbed through the `IntentResolver` txn push APIs. Touches #104309. Epic: none Release note: None 117992: roachpb: address review comments from #117840 r=nvanbenschoten a=nvanbenschoten I had missed a `git push` immediately before merging #117840. This updates two comments. Epic: None Release note: None Co-authored-by: Erik Grinaker <[email protected]> Co-authored-by: Nathan VanBenschoten <[email protected]>

117967: batcheval: add `BarrierRequest.WithLeaseAppliedIndex` r=erikgrinaker a=erikgrinaker Extracted from #117612. --- **batcheval: add `BarrierRequest.WithLeaseAppliedIndex`** This can be used to detect whether a replica has applied the barrier command yet. Touches #104309. **kvnemsis: add support for `Barrier` operations** This only executes random `Barrier` requests, but does not verify that the barrier guarantees are actually satisfied (i.e. that all past and concurrent writes are applied before it returns). At least we get some execution coverage, and verify that it does not have negative interactions with other operations. Epic: none Release note: None Co-authored-by: Erik Grinaker <[email protected]>

pand5a added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jun 5, 2023

blathers-crl bot added O-community Originated from the community X-blathers-untriaged blathers was unable to find an owner labels Jun 5, 2023

pand5a changed the title ~~Pushtxn in rangefeed returned abort, but txn may have been committed~~ changefeed: Pushtxn in rangefeed returned abort, but txn may have been committed Jun 5, 2023

yuzefovich added T-cdc and removed X-blathers-untriaged blathers was unable to find an owner labels Jun 5, 2023

blathers-crl bot added the A-cdc Change Data Capture label Jun 5, 2023

miretskiy added A-kv-replication Relating to Raft, consensus, and coordination. T-kv-replication labels Jun 5, 2023

erikgrinaker removed the T-cdc label Jun 6, 2023

erikgrinaker assigned aliher1911 Jun 6, 2023

aliher1911 removed the A-cdc Change Data Capture label Jun 6, 2023

blathers-crl bot added the T-kv KV Team label Jun 7, 2023

aliher1911 changed the title ~~changefeed: Pushtxn in rangefeed returned abort, but txn may have been committed~~ rangefeed: Pushtxn in rangefeed returned abort, but txn may have been committed Jun 12, 2023

exalate-issue-sync bot removed the T-kv-replication label Jun 13, 2023

erikgrinaker added A-kv-rangefeed Rangefeed infrastructure, server+client T-kv-replication and removed A-kv-replication Relating to Raft, consensus, and coordination. T-kv KV Team labels Jul 19, 2023

nicktrav removed the GA-blocker label Oct 23, 2023

erikgrinaker mentioned this issue Nov 15, 2023

rangefeed: fix premature checkpoint due to intent resolution race #114487

Closed

This was referenced Jan 19, 2024

batcheval: add BarrierRequest.WithLeaseAppliedIndex #117967

Merged

kvserver: add Replica.WaitForLeaseAppliedIndex() #117968

Merged

batcheval: add PushTxnResponse.AmbiguousAbort #117969

Merged

This was referenced Jan 20, 2024

rangefeed: fix premature checkpoint due to intent resolution race #117612

Merged

rangefeed: add non-fatal assertion for intent commit resolved ts #118196

Closed

craig bot closed this as completed in b6623c5 Jan 23, 2024

This was referenced Feb 2, 2024

staging-v22.2.18: release-22.2: batcheval: add PushTxnResponse.AmbiguousAbort #118631

Merged

staging-v22.2.18: release-22.2: rangefeed: fix premature checkpoint due to intent resolution race #118633

Merged

erikgrinaker mentioned this issue Feb 8, 2024

release-23.2.1-rc: rangefeed: fix premature checkpoint due to intent resolution race #118981

Merged

erikgrinaker added the C-technical-advisory Caused a technical advisory label Feb 15, 2024

erikgrinaker mentioned this issue Feb 15, 2024

release-23.1.15-rc: rangefeed: fix premature checkpoint due to intent resolution race #119270

Merged

erikgrinaker mentioned this issue Feb 22, 2024

kvserver: rangefeed txn pusher barrier may cause resolved timestamp stalls #119536

Closed

github-project-automation bot added this to KV Aug 28, 2024

github-project-automation bot moved this to Closed in KV Aug 28, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

rangefeed: Pushtxn in rangefeed returned abort, but txn may have been committed #104309

rangefeed: Pushtxn in rangefeed returned abort, but txn may have been committed #104309

pand5a commented Jun 5, 2023 •

edited by aliher1911

Loading

blathers-crl bot commented Jun 5, 2023

blathers-crl bot commented Jun 5, 2023

blathers-crl bot commented Jun 5, 2023

aliher1911 commented Jun 9, 2023

pand5a commented Jun 12, 2023 •

edited

Loading

aliher1911 commented Jun 12, 2023

pand5a commented Jun 13, 2023 •

edited

Loading

aliher1911 commented Jun 15, 2023

pand5a commented Jul 19, 2023 •

edited

Loading

blathers-crl bot commented Jul 19, 2023

rangefeed: Pushtxn in rangefeed returned abort, but txn may have been committed #104309

rangefeed: Pushtxn in rangefeed returned abort, but txn may have been committed #104309

Comments

pand5a commented Jun 5, 2023 • edited by aliher1911 Loading

blathers-crl bot commented Jun 5, 2023

blathers-crl bot commented Jun 5, 2023

blathers-crl bot commented Jun 5, 2023

aliher1911 commented Jun 9, 2023

pand5a commented Jun 12, 2023 • edited Loading

aliher1911 commented Jun 12, 2023

pand5a commented Jun 13, 2023 • edited Loading

aliher1911 commented Jun 15, 2023

pand5a commented Jul 19, 2023 • edited Loading

blathers-crl bot commented Jul 19, 2023

pand5a commented Jun 5, 2023 •

edited by aliher1911

Loading

pand5a commented Jun 12, 2023 •

edited

Loading

pand5a commented Jun 13, 2023 •

edited

Loading

pand5a commented Jul 19, 2023 •

edited

Loading