-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvcoord: keep refresh spans after savepoint rollback #111424
Conversation
The two failing tests in CI also fail reliably locally. Interestingly, they pass when I set the new cluster setting to false. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @miraradeva)
-- commits
line 22 at r1:
I feel like we should soften this language a little bit. The previous behavior was a deliberate design choice, and as we found in discussion with the team, there's not one right answer to how this should work (e.g. do these rolled-back reads need to remain consistent with the rest of the transaction?). So classifying the change here as a "bug fix" that avoids "serializability violations" feels unnecessarily alarmist. How about classifying it as a "sql change" instead, which is the closest release note category to this kind of desired behavior change?
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go
line 50 at r1 (raw file):
// ensures that all refresh spans accumulated since a savepoint was created are // kept even after the savepoint is rolled back. This ensures that the reads // corresponding to the refresh spans are serialized correctly. See #111228 for
"serialized correctly, even though they were rolled back."
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher_test.go
line 1526 at r1 (raw file):
// refreshed, so the conflict goes unnoticed and both txns commit successfully. // See #111228 for more details. func TestRefreshWithSavepoint(t *testing.T) {
Nice!
a0e714c
to
46342d2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do not merge this until we figure out how to address the failing multi-region tests.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I feel like we should soften this language a little bit. The previous behavior was a deliberate design choice, and as we found in discussion with the team, there's not one right answer to how this should work (e.g. do these rolled-back reads need to remain consistent with the rest of the transaction?). So classifying the change here as a "bug fix" that avoids "serializability violations" feels unnecessarily alarmist. How about classifying it as a "sql change" instead, which is the closest release note category to this kind of desired behavior change?
Yes, makes sense. I wasn't sure when "sql change" is usually used. I also added some more to the main body of the commit message, but let me know if I should tweak the release note some more as well.
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher_test.go
line 1526 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Nice!
Let me know if you don't think this test belongs here. I have a hard time finding the right place for these integration tests.
46342d2
to
3504868
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @miraradeva)
pkg/kv/kvpb/errors.go
line 869 at r2 (raw file):
} if e.ConflictingTxn != nil { msg = redact.Sprintf(" %s - conflicting txn: meta={%s}", msg, e.ConflictingTxn.String())
Did you mean to change this line?
pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher_test.go
line 1526 at r1 (raw file):
Previously, miraradeva (Mira Radeva) wrote…
Let me know if you don't think this test belongs here. I have a hard time finding the right place for these integration tests.
I think it should probably go in txn_coord_sender_test.go
because it's an integration test and not a unit test. That will mean that you'll have to export keepRefreshSpansOnSavepointRollback
.
3504868
to
1e7bdc2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)
pkg/kv/kvpb/errors.go
line 869 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Did you mean to change this line?
Yes, I changed it because it added an extra space in the error message: (RETRY_SERIALIZABLE<space><space>- failed preemptive refresh
.
Previously, when a read occurred between SAVEPOINT and ROLLBACK TO SAVEPOINT, upon rollback the read was removed from the transaction's refresh spans. If the transaction's read and write timestamps diverged, this read would not be refreshed, which could lead to a serializability violation. Moreover, this behavior diverged from the Postgres behavior, which considers reads in a subtransaction to belong to the parent transaction (https://github.com/postgres/postgres/blob/master/src/backend/storage/lmgr/README-SSI#L461-L467). The previous behavior was an intentional design choice, which made sense at the time with the intention of using savepoints to recover from serialization errors; however, this was never implemented. After some discussions, we have decided to match the Postgres behavior instead. This patch adresses the issue by ensuring that all refresh spans accumulated since a savepoint was created are kept after the savepoint is rolled back. We don't expect this new behavior to impact customers because they should already be able to handle serialization errors; in case any unforeseen customer issues arise, this patch also includes a private cluster setting to revert to the old behavior. Fixes: cockroachdb#111228 Release note (sql change): Reads rolled back by savepoints are now refreshable, matching the Postgres behavior and avoiding potential serializability violations.
1e7bdc2
to
66c74aa
Compare
bors r=nvanbenschoten |
Build succeeded: |
Previously, when a read occurred between SAVEPOINT and ROLLBACK TO
SAVEPOINT, upon rollback the read was removed from the transaction's
refresh spans. If the transaction's read and write timestamps diverged,
this read would not be refreshed, which could lead to a serializability
violation. Moreover, this behavior diverged from the Postgres behavior,
which considers reads in a subtransaction to belong to the parent
transaction
(https://github.com/postgres/postgres/blob/master/src/backend/storage/lmgr/README-SSI#L461-L467).
The previous behavior was an intentional design choice, which made sense
at the time with the intention of using savepoints to recover from
serialization errors; however, this was never implemented. After some
discussions, we have decided to match the Postgres behavior instead.
This patch adresses the issue by ensuring that all refresh spans
accumulated since a savepoint was created are kept after the savepoint
is rolled back. We don't expect this new behavior to impact customers
because they should already be able to handle serialization errors; in
case any unforeseen customer issues arise, this patch also includes a
private cluster setting to revert to the old behavior.
Fixes: #111228
Release note (sql change): Reads rolled back by savepoints are now
refreshable, matching the Postgres behavior and avoiding potential
serializability violations.