Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: don't rewind to BEGIN when retrying txn #82622

Merged
merged 1 commit into from
Jun 9, 2022

Conversation

rafiss
Copy link
Collaborator

@rafiss rafiss commented Jun 8, 2022

fixes #82392

Release note (bug fix): Fixed a bug where CockroachDB would
sometimes automatically retry the BEGIN statement of an explicit
transaction.

@rafiss rafiss requested review from ZhouXing19 and a team June 8, 2022 18:27
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@ZhouXing19 ZhouXing19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, thanks!

@rafiss rafiss force-pushed the fix-implicit-txn-rewind-pos branch 2 times, most recently from d825aa2 to 2ab7292 Compare June 8, 2022 22:29
@rafiss
Copy link
Collaborator Author

rafiss commented Jun 8, 2022

cc @ZhouXing19 i added the test and changed the implementation a little. please take a look whenever you have time

Copy link
Collaborator

@ZhouXing19 ZhouXing19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with TestingKnobs (and they look so cool!) so I'm a bit confused: did we simulate retrying a txn? It seems to me that there's only one conn in the test, and SELECT 1; BEGIN; INSERT INTO foo VALUES(1); COMMIT; only ran once, and it will always pass without retrying.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @ZhouXing19)


pkg/sql/conn_executor_test.go line 767 at r1 (raw file):

			_, tableID, err := keys.SystemSQLCodec.DecodeTablePrefix(put.Key)
			if err != nil || tableID != fooTableId {
				err = nil

nit: we can just return nil here, I think


pkg/sql/conn_executor_test.go line 772 at r1 (raw file):

			if atomic.AddInt64(&failed, 1) <= numToFail {
				return roachpb.NewErrorWithTxn(
					roachpb.NewTransactionRetryError(roachpb.RETRY_REASON_UNKNOWN, "boom"), ba.Txn,

Maybe a more informative error message?

@ZhouXing19
Copy link
Collaborator

pkg/sql/conn_executor_test.go line 758 at r1 (raw file):

	testDB.QueryRow(t, "SELECT 'foo'::regclass::oid").Scan(&fooTableId)

	// Inject an error that will happen during execution.

I'm not sure if I understand the logic here: is it counting queries that are doing "conditional put", i.e. set the same value to a key in fooTable?. And when the count > 2, it raise an error?

Copy link
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with TestingKnobs (and they look so cool!) so I'm a bit confused: did we simulate retrying a txn? It seems to me that there's only one conn in the test, and SELECT 1; BEGIN; INSERT INTO foo VALUES(1); COMMIT; only ran once, and it will always pass without retrying.

good questions!

the way testing knobs work is that it lets you "inject" custom functionality into different parts of the system. you can view the usages of the knob to see where the behavior gets injected.

in this case, the behavior i'm injecting is to add a retryable error, which will cause the conn executor to retry the transaction. look for the canAutoRetry and eventRetriableErr usages in the conn executor.

but to make sure it retried, i should add one more assertion that the retry count was actually incremented.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ZhouXing19)


pkg/sql/conn_executor_test.go line 758 at r1 (raw file):

Previously, ZhouXing19 (Jane Xing) wrote…

I'm not sure if I understand the logic here: is it counting queries that are doing "conditional put", i.e. set the same value to a key in fooTable?. And when the count > 2, it raise an error?

"conditional put" is the name of the KV request used to implement INSERT statements.

this actually will raise the error when the count is less than two


pkg/sql/conn_executor_test.go line 767 at r1 (raw file):

Previously, ZhouXing19 (Jane Xing) wrote…

nit: we can just return nil here, I think

fixed


pkg/sql/conn_executor_test.go line 772 at r1 (raw file):

Previously, ZhouXing19 (Jane Xing) wrote…

Maybe a more informative error message?

nobody actually will see this error message, but i will make it more readable for people looking at the code

Release note (bug fix): Fixed a bug where CockroachDB would
sometimes automatically retry the BEGIN statement of an explicit
transaction.
@rafiss rafiss force-pushed the fix-implicit-txn-rewind-pos branch from 2ab7292 to 32c4985 Compare June 9, 2022 00:59
Copy link
Collaborator

@ZhouXing19 ZhouXing19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for the explanation!
:lgtm_strong:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ZhouXing19)

@rafiss
Copy link
Collaborator Author

rafiss commented Jun 9, 2022

tftr!

bors r=ZhouXing19

@craig
Copy link
Contributor

craig bot commented Jun 9, 2022

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: queries with ON CONFLICT DO UPDATE SET seem to interfere each other when run in parallel
3 participants