Fix NPE caused by txn commit race condition #5940
Fixes NPEs caused by a race condition between multiple attempts to commit or abort a transaction.
changelog/@unreleased/pr-5940.v2.yml
type: fix
fix:
  description: Fixed a race condition where if a transaction was just committed by
    another thread, we may return null values from TransactionService.get.
Let's probably say what this means to a user: users may observe that a transaction that had definitively committed or aborted was still running. Retrying the transaction should work.
generally looks good, 1 comment
@@ -100,10 +101,9 @@ public void putUnlessExistsMultiple(Map<Long, Long> keyValues) throws KeyAlready
+ "was found in the KVS",
SafeArg.of("kvsValue", kvsValue),
SafeArg.of("stagingValue", currentValue));
continue;
} finally {
resultBuilder.put(startTs, commitTs);
This seems a bit odd. If an exception is thrown, won't this appear to a caller that a value was committed, but in reality it has not?
If a CheckAndSetException is thrown here, it means the value was committed by another thread, and hence we must return it.
If other exceptions are thrown, then we will propagate the exception out of this method, and thus would not return the result anyway.
This is quite subtle, though, so I'll add a clarifying comment.
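For anyone following along, here is a minimal standalone sketch of the control flow being discussed. It is not the code in this PR: `CommitRaceSketch`, `Store`, `commitAll`, and the local `CheckAndSetException` are hypothetical stand-ins, and the only assumption is that a CAS failure means another thread has already committed the same value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CommitRaceSketch {
    /** Hypothetical stand-in for the exception thrown when a check-and-set loses a race. */
    static final class CheckAndSetException extends RuntimeException {
        CheckAndSetException(String message) {
            super(message);
        }
    }

    /** Hypothetical store interface; the real KVS API is not reproduced here. */
    interface Store {
        void checkAndSet(long startTs, long commitTs);
    }

    static Map<Long, Long> commitAll(Map<Long, Long> keyValues, Store store) {
        Map<Long, Long> result = new LinkedHashMap<>();
        for (Map.Entry<Long, Long> entry : keyValues.entrySet()) {
            long startTs = entry.getKey();
            long commitTs = entry.getValue();
            try {
                store.checkAndSet(startTs, commitTs);
            } catch (CheckAndSetException e) {
                // Assumption: another thread committed this value first,
                // so the value is committed and must still be reported.
                continue;
            } finally {
                // Runs on success and on the `continue` above. It also runs when
                // any other exception propagates, but then `result` is never
                // returned, so the caller cannot observe the entry.
                result.put(startTs, commitTs);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Long, Long> input = new LinkedHashMap<>();
        input.put(1L, 10L);
        input.put(2L, 20L);
        // Simulate losing the race on startTs 1; both entries are still returned.
        Map<Long, Long> committed = commitAll(input, (startTs, commitTs) -> {
            if (startTs == 1L) {
                throw new CheckAndSetException("value already committed by another thread");
            }
        });
        System.out.println(committed);
    }
}
```

The point of the sketch is that the `finally` block only changes what the caller sees on the two paths where the value really is committed; on every other failure the partially built map is discarded along with the stack frame.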
yep! because of the way the protocol works (ADR if you're interested), once you're able to perform a quorum read of some value V, the only permitted write operations are CAS((V, Staging), (V, Staging)) and PUT((V, Committed)).
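To make that invariant concrete, here is a rough single-cell model. It is only an illustration under stated assumptions: `StagingCellSketch`, `touch`, and `putCommitted` are hypothetical names, the quorum aspect is not modeled at all, and the explicit check in `putCommitted` exists only to make the rule visible (in the real protocol the guarantee comes from participants issuing only these two operations).

```java
final class StagingCellSketch {
    enum State { STAGING, COMMITTED }

    private final long value;   // the value V that was read; it never changes
    private State state;

    StagingCellSketch(long stagedValue) {
        this.value = stagedValue;
        this.state = State.STAGING;
    }

    synchronized long value() {
        return value;
    }

    synchronized State state() {
        return state;
    }

    /**
     * Models CAS((V, Staging), (V, Staging)): succeeds only if the cell still
     * holds (V, Staging), and leaves the contents unchanged.
     */
    synchronized boolean touch(long v) {
        return state == State.STAGING && value == v;
    }

    /**
     * Models PUT((V, Committed)). Rejecting any other value is an assumption
     * of this sketch, used to show that V itself can no longer change.
     */
    synchronized void putCommitted(long v) {
        if (v != value) {
            throw new IllegalStateException("only the value that was read may be committed");
        }
        state = State.COMMITTED;
    }
}
```

In this model a failed `touch` after a successful read can only mean the cell moved to (V, Committed), which is why reporting V to the caller is safe in that case.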
👍
Released 0.567.0
Goals (and why):
==COMMIT_MSG==
Fixed a race condition where if a transaction was just committed by another thread, we may return null values from TransactionService.get.
==COMMIT_MSG==
Implementation Description (bullets):
Testing (What was existing testing like? What have you done to improve it?):
Concerns (what feedback would you like?):
Where should we start reviewing?:
Priority (whenever / two weeks / yesterday): today