Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvcoord: keep refresh spans after savepoint rollback #111424

Merged
merged 1 commit into from
Nov 28, 2023

Conversation

miraradeva
Copy link
Contributor

@miraradeva miraradeva commented Sep 28, 2023

Previously, when a read occurred between SAVEPOINT and ROLLBACK TO
SAVEPOINT, upon rollback the read was removed from the transaction's
refresh spans. If the transaction's read and write timestamps diverged,
this read would not be refreshed, which could lead to a serializability
violation. Moreover, this behavior diverged from the Postgres behavior,
which considers reads in a subtransaction to belong to the parent
transaction
(https://github.com/postgres/postgres/blob/master/src/backend/storage/lmgr/README-SSI#L461-L467).

The previous behavior was an intentional design choice, which made sense
at the time with the intention of using savepoints to recover from
serialization errors; however, this was never implemented. After some
discussions, we have decided to match the Postgres behavior instead.

This patch adresses the issue by ensuring that all refresh spans
accumulated since a savepoint was created are kept after the savepoint
is rolled back. We don't expect this new behavior to impact customers
because they should already be able to handle serialization errors; in
case any unforeseen customer issues arise, this patch also includes a
private cluster setting to revert to the old behavior.

Fixes: #111228

Release note (sql change): Reads rolled back by savepoints are now
refreshable, matching the Postgres behavior and avoiding potential
serializability violations.

@miraradeva miraradeva requested a review from a team as a code owner September 28, 2023 15:58
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@miraradeva
Copy link
Contributor Author

The two failing tests in CI also fail reliably locally. Interestingly, they pass when I set the new cluster setting to false.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @miraradeva)


-- commits line 22 at r1:
I feel like we should soften this language a little bit. The previous behavior was a deliberate design choice, and as we found in discussion with the team, there's not one right answer to how this should work (e.g. do these rolled-back reads need to remain consistent with the rest of the transaction?). So classifying the change here as a "bug fix" that avoids "serializability violations" feels unnecessarily alarmist. How about classifying it as a "sql change" instead, which is the closest release note category to this kind of desired behavior change?


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go line 50 at r1 (raw file):

// ensures that all refresh spans accumulated since a savepoint was created are
// kept even after the savepoint is rolled back. This ensures that the reads
// corresponding to the refresh spans are serialized correctly. See #111228 for

"serialized correctly, even though they were rolled back."


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher_test.go line 1526 at r1 (raw file):

// refreshed, so the conflict goes unnoticed and both txns commit successfully.
// See #111228 for more details.
func TestRefreshWithSavepoint(t *testing.T) {

Nice!

Copy link
Contributor Author

@miraradeva miraradeva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not merge this until we figure out how to address the failing multi-region tests.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)


-- commits line 22 at r1:

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I feel like we should soften this language a little bit. The previous behavior was a deliberate design choice, and as we found in discussion with the team, there's not one right answer to how this should work (e.g. do these rolled-back reads need to remain consistent with the rest of the transaction?). So classifying the change here as a "bug fix" that avoids "serializability violations" feels unnecessarily alarmist. How about classifying it as a "sql change" instead, which is the closest release note category to this kind of desired behavior change?

Yes, makes sense. I wasn't sure when "sql change" is usually used. I also added some more to the main body of the commit message, but let me know if I should tweak the release note some more as well.


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher_test.go line 1526 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Nice!

Let me know if you don't think this test belongs here. I have a hard time finding the right place for these integration tests.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @miraradeva)


pkg/kv/kvpb/errors.go line 869 at r2 (raw file):

	}
	if e.ConflictingTxn != nil {
		msg = redact.Sprintf(" %s - conflicting txn: meta={%s}", msg, e.ConflictingTxn.String())

Did you mean to change this line?


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher_test.go line 1526 at r1 (raw file):

Previously, miraradeva (Mira Radeva) wrote…

Let me know if you don't think this test belongs here. I have a hard time finding the right place for these integration tests.

I think it should probably go in txn_coord_sender_test.go because it's an integration test and not a unit test. That will mean that you'll have to export keepRefreshSpansOnSavepointRollback.

Copy link
Contributor Author

@miraradeva miraradeva left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)


pkg/kv/kvpb/errors.go line 869 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Did you mean to change this line?

Yes, I changed it because it added an extra space in the error message: (RETRY_SERIALIZABLE<space><space>- failed preemptive refresh.

Previously, when a read occurred between SAVEPOINT and ROLLBACK TO
SAVEPOINT, upon rollback the read was removed from the transaction's
refresh spans. If the transaction's read and write timestamps diverged,
this read would not be refreshed, which could lead to a serializability
violation. Moreover, this behavior diverged from the Postgres behavior,
which considers reads in a subtransaction to belong to the parent
transaction
(https://github.com/postgres/postgres/blob/master/src/backend/storage/lmgr/README-SSI#L461-L467).

The previous behavior was an intentional design choice, which made sense
at the time with the intention of using savepoints to recover from
serialization errors; however, this was never implemented. After some
discussions, we have decided to match the Postgres behavior instead.

This patch adresses the issue by ensuring that all refresh spans
accumulated since a savepoint was created are kept after the savepoint
is rolled back. We don't expect this new behavior to impact customers
because they should already be able to handle serialization errors; in
case any unforeseen customer issues arise, this patch also includes a
private cluster setting to revert to the old behavior.

Fixes: cockroachdb#111228

Release note (sql change): Reads rolled back by savepoints are now
refreshable, matching the Postgres behavior and avoiding potential
serializability violations.
@miraradeva
Copy link
Contributor Author

bors r=nvanbenschoten

@craig
Copy link
Contributor

craig bot commented Nov 28, 2023

Build succeeded:

@craig craig bot merged commit e52d74c into cockroachdb:master Nov 28, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

kv: reads rolled back by savepoints are not refreshed
3 participants