Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: fix auto-retries for upgraded transactions #84389

Merged
merged 1 commit into from
Jul 18, 2022

Conversation

rafiss
Copy link
Collaborator

@rafiss rafiss commented Jul 13, 2022

This is implemented by adding more state to the conn executor state
machine. Now, it tracks if the open transaction was upgraded from an
implicit to an explicit txn. If so, when doing an auto retry, the
transaction is marked as an implicit again, so that the statements in
the transaction can upgrade it to explicit.

No release note is needed for v22.2, but we should use the following
note when backporting to v22.1.

Placeholder note (bug fix): Fixed a bug where some statements in a batch
would not get executed if the following conditions were met:

  • A batch of statements is sent in a single string.
  • A BEGIN statement appears in the middle of the batch.
  • The enable_implicit_transaction_for_batch_statements session variable
    is set to true. (This defaults to false in v22.1.x)

This bug was introduced in v22.1.2. (#82681)

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rafiss rafiss requested review from ZhouXing19 and a team July 13, 2022 20:56
@rafiss rafiss force-pushed the fix-auto-retry-upgraded-txn branch 3 times, most recently from faf9751 to e95ed1a Compare July 14, 2022 20:44
@rafiss
Copy link
Collaborator Author

rafiss commented Jul 15, 2022

@ZhouXing19 this is ready for another look

Copy link
Collaborator

@ZhouXing19 ZhouXing19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Not blockers but just wanted to confirm my understanding.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @ZhouXing19)


pkg/sql/conn_fsm.go line 322 at r3 (raw file):

	},
	// Handle the errors in explicit txns.
	stateOpen{ImplicitTxn: fsm.False, WasUpgraded: fsm.Var("wasUpgraded")}: {

A tiny nit: any reason why we're binding with wasUpgraded rather than WasUpgraded? I saw other usages of fsm.Var() with the var always consistent with the string, such as TxnOpen: fsm.Var("TxnOpen"), so I'm not sure if making the first letter capital matters or not.


pkg/sql/conn_fsm.go line 356 at r3 (raw file):

			// will be executed again and upgrade the transaction back to explicit.
			Description: "Retriable err; will auto-retry",
			Next:        stateOpen{ImplicitTxn: fsm.Var("wasUpgraded"), WasUpgraded: fsm.False},

Not a blocker but just to confirm my understanding: if the current state is upgraded (i.e. after the occurrence of BEGIN), we want to "roll back" to the txn state before the occurrence of BEGIN, which is implicit. But if the current state is explicit and was not upgraded, we just keep this state. Is this correct?


pkg/sql/conn_fsm.go line 419 at r3 (raw file):

		eventSavepointRollback{}: {
			Description: "ROLLBACK TO SAVEPOINT (not cockroach_restart) success",
			Next:        stateOpen{ImplicitTxn: fsm.False, WasUpgraded: fsm.Var("wasUpgraded")},

I'm not sure here: if the save point is before the occurrence of BEGIN (hence WasUpgrade == false), but the current state is after the BEGIN (hence WasUpgrade == true), shouldn't the next state of rollback has WasUpgrade == false?
(Or am I mistaking the meaning of WasUpgraded: fsm.Var("wasUpgraded")? My interpretation is that it looks at a "global" var wasUpgraded that corresponds to the current state, and then passes its value to WasUpgraded. )

Copy link
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ZhouXing19)


pkg/sql/conn_fsm.go line 322 at r3 (raw file):

Previously, ZhouXing19 (Jane Xing) wrote…

A tiny nit: any reason why we're binding with wasUpgraded rather than WasUpgraded? I saw other usages of fsm.Var() with the var always consistent with the string, such as TxnOpen: fsm.Var("TxnOpen"), so I'm not sure if making the first letter capital matters or not.

no, there's no specific reason. i noticed that this was already using "implicitTxn" in lower case for other rules, but i'd be fine to make this caps


pkg/sql/conn_fsm.go line 356 at r3 (raw file):

Previously, ZhouXing19 (Jane Xing) wrote…

Not a blocker but just to confirm my understanding: if the current state is upgraded (i.e. after the occurrence of BEGIN), we want to "roll back" to the txn state before the occurrence of BEGIN, which is implicit. But if the current state is explicit and was not upgraded, we just keep this state. Is this correct?

that's exactly right!


pkg/sql/conn_fsm.go line 419 at r3 (raw file):

Previously, ZhouXing19 (Jane Xing) wrote…

I'm not sure here: if the save point is before the occurrence of BEGIN (hence WasUpgrade == false), but the current state is after the BEGIN (hence WasUpgrade == true), shouldn't the next state of rollback has WasUpgrade == false?
(Or am I mistaking the meaning of WasUpgraded: fsm.Var("wasUpgraded")? My interpretation is that it looks at a "global" var wasUpgraded that corresponds to the current state, and then passes its value to WasUpgraded. )

You do have the correct understanding of WasUpgraded: fsm.Var("wasUpgraded"). however, the reason i wrote it this way is because SAVEPOINT statements are only allowed inside of an explicit transaction. i can add a comment saying this so this is easier to understand

Copy link
Collaborator

@ZhouXing19 ZhouXing19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation! LGTM.

This is implemented by adding more state to the conn executor state
machine. Now, it tracks if the open transaction was upgraded from an
implicit to an explicit txn. If so, when doing an auto retry, the
transaction is marked as an implicit again, so that the statements in
the transaction can upgrade it to explicit.

No release note is needed for v22.2, but we should use the following
 note when backporting to v22.1.

Placeholder note (bug fix): Fixed a bug where some statements in a batch
would not get executed if the following conditions were met:
- A batch of statements is sent in a single string.
- A BEGIN statement appears in the middle of the batch.
- The enable_implicit_transaction_for_batch_statements session variable
  is set to true. (This defaults to false in v22.1.x)

This bug was introduced in v22.1.2.

Release note: None
@rafiss rafiss force-pushed the fix-auto-retry-upgraded-txn branch from e95ed1a to 2520565 Compare July 18, 2022 15:18
@rafiss
Copy link
Collaborator Author

rafiss commented Jul 18, 2022

tftr!

bors r=ZhouXing19

@craig
Copy link
Contributor

craig bot commented Jul 18, 2022

Build succeeded:

@craig craig bot merged commit 116c5aa into cockroachdb:master Jul 18, 2022
@rafiss rafiss deleted the fix-auto-retry-upgraded-txn branch July 19, 2022 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants