Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rangefeed: Pushtxn in rangefeed returned abort, but txn may have been committed #104309

Closed
pand5a opened this issue Jun 5, 2023 · 12 comments · Fixed by #117612
Closed

rangefeed: Pushtxn in rangefeed returned abort, but txn may have been committed #104309

pand5a opened this issue Jun 5, 2023 · 12 comments · Fixed by #117612
Assignees
Labels
A-kv-rangefeed Rangefeed infrastructure, server+client branch-master Failures and bugs on the master branch. branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-technical-advisory Caused a technical advisory O-community Originated from the community

Comments

@pand5a
Copy link

pand5a commented Jun 5, 2023

I don't know if it's appropriate to send it here, but I'm really troubled.

The RangeFeed relies on the return result of PushTxns (task. go: pushOldTxns) ABORTED to remove the tracked txn in UnresolvedIntentQueue, which depends on the correctness of PushTxn.

case roachpb.ABORTED:
// The transaction is aborted, so it doesn't need to be tracked
// anymore nor does it need to prevent the resolved timestamp from
// advancing. Inform the Processor that it can remove the txn from
// its unresolvedIntentQueue.
//
// NOTE: the unresolvedIntentQueue will ignore MVCCAbortTxn operations
// before it has been initialized. This is not a concern here though
// because we never launch txnPushAttempt tasks before the queue has
// been initialized.
ops[i].SetValue(&enginepb.MVCCAbortTxnOp{
TxnID: txn.ID,
})

But in cmd_push_txn.go The “case txnID” case of [PushTxn ->SynthesizeTxnFromMeta ->CanCreateTxnRecord(replica_tscache.go)] may return a transaction that has already been COMMITTED but has status=Abort, which may cause rts in the RangeFeed to advance incorrectly. Do I understand correctly?

if txnMinTS.LessEq(tombstoneTimestamp) {
switch tombstoneTxnID {
case txnID:
// If we find our own transaction ID then a transaction record has already
// been written. We might be a replay (e.g. a DistSender retry), or we
// raced with an asynchronous abort. Either way, return an error.
//
// TODO(andrei): We could keep a bit more info in the tscache to return a
// different error for COMMITTED transactions. If the EndTxn(commit) was
// the only request in the batch, this this would be sufficient for the
// client to swallow the error and declare the transaction as committed.
// If there were other requests in the EndTxn batch, then the client would
// still have trouble reconstructing the result, but at least it could
// provide a non-ambiguous error to the application.
return false, kvpb.ABORT_REASON_RECORD_ALREADY_WRITTEN_POSSIBLE_REPLAY

Jira issue: CRDB-28452

Epic CRDB-27235

@pand5a pand5a added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jun 5, 2023
@blathers-crl
Copy link

blathers-crl bot commented Jun 5, 2023

Hello, I am Blathers. I am here to help you get the issue triaged.

It looks like you have not filled out the issue in the format of any of our templates. To best assist you, we advise you to use one of these templates.

I was unable to automatically find someone to ping.

If we have not gotten back to your issue within a few business days, you can try the following:

  • Join our community slack channel and ask on #cockroachdb.
  • Try find someone from here if you know they worked closely on the area and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-untriaged blathers was unable to find an owner labels Jun 5, 2023
@pand5a pand5a changed the title Pushtxn in rangefeed returned abort, but txn may have been committed changefeed: Pushtxn in rangefeed returned abort, but txn may have been committed Jun 5, 2023
@yuzefovich yuzefovich added T-cdc and removed X-blathers-untriaged blathers was unable to find an owner labels Jun 5, 2023
@blathers-crl
Copy link

blathers-crl bot commented Jun 5, 2023

cc @cockroachdb/cdc

@blathers-crl blathers-crl bot added the A-cdc Change Data Capture label Jun 5, 2023
@miretskiy miretskiy added A-kv-replication Relating to Raft, consensus, and coordination. T-kv-replication labels Jun 5, 2023
@blathers-crl
Copy link

blathers-crl bot commented Jun 5, 2023

cc @cockroachdb/replication

@erikgrinaker erikgrinaker removed the T-cdc label Jun 6, 2023
@aliher1911 aliher1911 removed the A-cdc Change Data Capture label Jun 6, 2023
@blathers-crl blathers-crl bot added the T-kv KV Team label Jun 7, 2023
@aliher1911
Copy link
Contributor

I'm looking into this. Seems like the question would be can we have pending intent in the past when we see a transaction tombstone for committed transaction that we interpreted as abort.

@pand5a
Copy link
Author

pand5a commented Jun 12, 2023

Is it possible that this is the case:

It is possible that the rangefeed is built on a non majority replica, while the pushtxn in the rangefeed is sent at a fixed frequency, that is:

  1. Leasholder and Majority commited it at ts1.

  2. The rangefeed is built on a minority replica, and a pushtxn is issued on ts2 and a txnabort is returned. However, as the replica is asynchronous, so the commit is received on ts3 on this replica

@aliher1911
Copy link
Contributor

It looks like we just need a range feed running on a follower and that follower should be lagging behind on raft application. In that case it could try to push already committed transactions because it tries to compare txn timestamps from the intents it observes with wall clock (now() - threshold). Then it will drop the transaction from the list and advance its resolved timestamp.

We have a protection in rangefeed

// assertOpAboveTimestamp asserts that this operation is at a larger timestamp
// than the current resolved timestamp. A violation of this assertion would
// indicate a failure of the closed timestamp mechanism.
func (rts *resolvedTimestamp) assertOpAboveRTS(op enginepb.MVCCLogicalOp, opTS hlc.Timestamp) {
if opTS.LessEq(rts.resolvedTS) {
panic(fmt.Sprintf("resolved timestamp %s equal to or above timestamp of operation %v",
rts.resolvedTS, op))
}
}

which would cause node to panic if timestamp regression is detected to prevent any potential data inconsistency/loss on the client side.
And I can't find any evidence of panics generated by regressions like that which means it happens sufficiently infrequently.

To address it we may need to distinguish between abort types when resolving intents and in this particular case keep transaction and wait for intents to get resolved.

@aliher1911 aliher1911 changed the title changefeed: Pushtxn in rangefeed returned abort, but txn may have been committed rangefeed: Pushtxn in rangefeed returned abort, but txn may have been committed Jun 12, 2023
@pand5a
Copy link
Author

pand5a commented Jun 13, 2023

"We have a protection in rangefeed"
-->

Yes, but due to MVCCAbortTxnOp, txn was removed from the list, causing rts to advance, and due to 'enginepb'. MVCCCommitteIntentOp "is not protected by assertOpAboveRTS, so this data has been discarded in the following code.

// Ensure that r updates are strictly newer than the least resolved timestamp
// being tracked by the local span frontier. The poller should not be forwarding
// r updates that have timestamps less than or equal to any resolved timestamp
// it's forwarded before.
// TODO(dan): This should be an assertion once we're confident this can never
// happen under any circumstance.
if schemaTS.LessEq(c.frontier.Frontier()) && !schemaTS.Equal(c.cursor) {
log.Errorf(ctx, "cdc ux violation: detected timestamp %s that is less than "+
"or equal to the local frontier %s.", schemaTS, c.frontier.Frontier())
return nil
}

"To address it we may need to distinguish between abort types when resolving intents and in this particular case keep transaction and wait for intents to get resolved."
-->
Yes. But can it be done in ts_cache adding the state of txn?Is this feasible?

@aliher1911
Copy link
Contributor

I see what you mean. Didn't realize we skip intents when checking timestamp assertion. Even if we did, intents could arrive after checkpoint is out and even if we panic, client will not reread that data because of published checkpoint timestamp.

As for pts_cache, I don't think we need any extra info there. The problem seem to be that we know this special cause where transaction tombstone is present which should be a enough rare case, but we don't have a mechanism to surface this error to range feed processor easily without changing protocol. pts_cache can reside on any other node where transaction record was created.

Did you see this error actually happening?

@pand5a
Copy link
Author

pand5a commented Jul 19, 2023

by code review, I want to use one slow node and two fast nodes, which may trigger this issue. Because from a code perspective, this issue should exist. I will try it out some time.

@erikgrinaker erikgrinaker added A-kv-rangefeed Rangefeed infrastructure, server+client T-kv-replication and removed A-kv-replication Relating to Raft, consensus, and coordination. T-kv KV Team labels Jul 19, 2023
@blathers-crl
Copy link

blathers-crl bot commented Jul 19, 2023

cc @cockroachdb/replication

craig bot pushed a commit that referenced this issue Jan 20, 2024
117968: kvserver: add `Replica.WaitForLeaseAppliedIndex()` r=erikgrinaker a=erikgrinaker

Extracted from #117612.

---

This allows a caller to wait for a replica to reach a certain lease applied index. Similar functionality elsewhere is not migrated yet, out of caution.

Touches #104309.
Epic: none
Release note: None

Co-authored-by: Erik Grinaker <[email protected]>
craig bot pushed a commit that referenced this issue Jan 20, 2024
117969: batcheval: add `PushTxnResponse.AmbiguousAbort` r=erikgrinaker a=erikgrinaker

Extracted from #117612.

---

This indicates to the caller that the `ABORTED` status of the pushed transaction is ambiguous, and the transaction may in fact have been committed and GCed already. This information is also plumbed through the `IntentResolver` txn push APIs.

Touches #104309.
Epic: none
Release note: None

117992: roachpb: address review comments from #117840 r=nvanbenschoten a=nvanbenschoten

I had missed a `git push` immediately before merging #117840. This updates two comments.

Epic: None
Release note: None

Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
craig bot pushed a commit that referenced this issue Jan 20, 2024
117967: batcheval: add `BarrierRequest.WithLeaseAppliedIndex` r=erikgrinaker a=erikgrinaker

Extracted from #117612.

---

**batcheval: add `BarrierRequest.WithLeaseAppliedIndex`**

This can be used to detect whether a replica has applied the barrier command yet.

Touches #104309.

**kvnemsis: add support for `Barrier` operations**

This only executes random `Barrier` requests, but does not verify that the barrier guarantees are actually satisfied (i.e. that all past and concurrent writes are applied before it returns). At least we get some execution coverage, and verify that it does not have negative interactions with other operations.

Epic: none
Release note: None

Co-authored-by: Erik Grinaker <[email protected]>
@craig craig bot closed this as completed in b6623c5 Jan 23, 2024
@erikgrinaker erikgrinaker added the C-technical-advisory Caused a technical advisory label Feb 15, 2024
@shralex shralex added branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 labels Feb 21, 2024
@github-project-automation github-project-automation bot moved this to Closed in KV Aug 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-rangefeed Rangefeed infrastructure, server+client branch-master Failures and bugs on the master branch. branch-release-22.2 Used to mark GA and release blockers, technical advisories, and bugs for 22.2 branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. C-technical-advisory Caused a technical advisory O-community Originated from the community
Projects
No open projects
Status: Closed
7 participants