Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: make sure pgwire bind always happens in a transaction #71632

Merged
merged 1 commit into from
Oct 22, 2021

Conversation

rafiss
Copy link
Collaborator

@rafiss rafiss commented Oct 15, 2021

fixes #70378
and maybe #64140

The approach of using a transaction matches what we do for pgwire Parse
messages already. This is important to make sure that user-defined types
are leased correctly.

This also updated the SQL EXECUTE command to resolve user-defined types
so that it gets the latest changes.

Release note (bug fix): Adding new values to a user-defined enum type
will previously would cause a prepared statement using that type to not
work. This now works as expected.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rafiss rafiss force-pushed the fix-bind-enum-types branch from 393eb61 to 7d77e1e Compare October 16, 2021 23:16
@rafiss rafiss requested review from ajwerner and a team October 16, 2021 23:16
@rafiss rafiss marked this pull request as ready for review October 17, 2021 00:28
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 6 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @rafiss)


pkg/sql/execute.go, line 48 at r1 (raw file):

		// For user-defined types, we need to resolve the type to make sure we get
		// the latest changes to the type.

How does this help us in the case where the type has changed between bind and now?

Copy link
Collaborator Author

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/sql/execute.go, line 48 at r1 (raw file):

Previously, ajwerner wrote…
		// For user-defined types, we need to resolve the type to make sure we get
		// the latest changes to the type.

How does this help us in the case where the type has changed between bind and now?

the code here is for the EXECUTE sql command. binding the placeholders and executing the query is always going to be in the same implicit transaction. in other words:

  • PREPARE: creates the prepared statement
  • EXECUTE: binds parameters and runs the query in the same implicit txn

but if you're asking about the type changing between the pgwire Bind and the pgwire Exec (which is the code changed in conn_executor_prepare.go), this commit still helps. previously, the Bind would use whichever previous txn happened to be on the conn_executor, and now it gets its own txn. i don't see why Bind and Exec should share the same transaction. first of all, if anyone does change the type in between Bind and Exec, that seems like more of an edge case than what's being fixed here. but even so, it sounds correct to me that the type resolution during Bind should use the version of the type as it was at the time of the Bind command

but maybe i misunderstand you. if so, could you type out the sequence of pgwire commands and type changes that you're asking about?

@ajwerner
Copy link
Contributor

i don't see why Bind and Exec should share the same transaction. first of all, if anyone does change the type in between Bind and Exec, that seems like more of an edge case than what's being fixed here.

I agree fully that they should not be using the same transaction. This change is good. I think the hazard is that if there's a change in the type (or, more likely, in the regclass) between the Bind and Exec then we'll use the old value.

Consider the below contrived example:

SELECT * FROM system.namespace WHERE id = $1::regclass;

Now, consider the argument is a table name and that between bind and exec we rename the table. Nothing will tell us we got it wrong. I think we can paper over that, but it's interesting. For enum values, we're a bit better off because it's highly likely that the execution plan tracked the version of the enum it used and when we check memo.IsStale we'll notice it is and re-plan. Anyway, I think that's all #64140 and isn't intended to be fixed by this.

My question was to check my understanding. Thanks

@rafiss
Copy link
Collaborator Author

rafiss commented Oct 18, 2021

Ah thanks for explaining.

I think the hazard is that if there's a change in the type (or, more likely, in the regclass) between the Bind and Exec then we'll use the old value.

I guess I'm not really convinced that it's incorrect to use the old value.

But also, turns out the PostgreSQL spec doesn't allow anything to happen in between Bind and Exec.

From https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-FLOW-EXT-QUERY

If successfully created, a named portal object lasts till the end of the current transaction

In other words, after Sync, the portal that was bound is destroyed. Indeed, if you try this test against Postgres, it results in portal "p8" does not exist

send
Parse {"Name": "s8", "Query": "SELECT oid, relname FROM pg_class WHERE oid = $1::regclass;"}
Bind {"DestinationPortal": "p8", "PreparedStatement": "s8", "ResultFormatCodes": [0], "Parameters": [{"text":"t"}]}
Sync
----

until
ReadyForQuery
----
{"Type":"ParseComplete"}
{"Type":"BindComplete"}
{"Type":"ReadyForQuery","TxStatus":"I"}

send
Execute {"Portal": "p8"}
Sync
----

until
ReadyForQuery
----
{"Type":"DataRow","Values":[{"text":"52"},{"text":"t"}]}
{"Type":"CommandComplete","CommandTag":"SELECT 1"}
{"Type":"ReadyForQuery","TxStatus":"I"}

@rafiss
Copy link
Collaborator Author

rafiss commented Oct 18, 2021

Filed #71665

@rafiss rafiss force-pushed the fix-bind-enum-types branch from 7d77e1e to 9e590ec Compare October 18, 2021 17:58
@rafiss
Copy link
Collaborator Author

rafiss commented Oct 18, 2021

Ah, but we actually can get a test like the one that you're asking for by using an explicit transaction. Added one to bind_and_resolve and confirmed that we are matching the PostgreSQL behavior.

hm i guess it's actually still different from #64140 since that issue is only talking about non-explicit txns. well, still a good test to have i think.

@ajwerner
Copy link
Contributor

Ah, but we actually can get a test like the one that you're asking for by using an explicit transaction. Added one to bind_and_resolve and confirmed that we are matching the PostgreSQL behavior.

In an explicit transaction, won't all three steps will be using the same transaction. We'll bypass

// This is a huge kludge to deal with the fact that we're resolving types
// using a planner with a committed transaction. This ends up being almost
// okay because the execution is going to re-acquire leases on these types.
// Regardless, holding this lease is worse than not holding it. Users might
// expect to get type mismatch errors if a rename of the type occurred.
if ex.getTransactionState() == NoTxnStateStr {
ex.planner.Descriptors().ReleaseAll(ctx)
}
and the case I think we're currently talking about.

I guess I'm not really convinced that it's incorrect to use the old value.

I sort of hear that. The reason I'd say it is is that we say that once return from a rename, we claim that no transaction which happens after (has a higher commit timestamp) will observe the old name. It's possible in this edge case. I think it'd be a problem in postgres if you ran under serializable.

But also, turns out the PostgreSQL spec doesn't allow anything to happen in between Bind and Exec.

My claim is regarding something happening of a different connection between bind and exec. The rename would occur on a separate connection, concurrently, between the bind and exec.

@rafiss rafiss force-pushed the fix-bind-enum-types branch 2 times, most recently from 9db9e35 to f9becb3 Compare October 21, 2021 16:58
@rafiss rafiss requested a review from otan October 21, 2021 16:58

// For user-defined types, we need to resolve the type to make sure we get
// the latest changes to the type.
if _, ok := types.OidToType[typ.Oid()]; !ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe if typ.UserDefined() {?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

# TODO(ajwerner): Why are there two ReadyForQuery?

until
# There are two ReadyForQuerys because a simply query was followed by Sync
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: full stop at end of comment

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -38,15 +38,13 @@ ReadyForQuery
# planner transaction but never set it. This was pretty much the only
# way you could do such a thing.

send
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you comment on why this is crdb_only?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

The approach of using a transaction matches what we do for pgwire Parse
messages already. This is important to make sure that user-defined types
are leased correctly.

This also updated the SQL EXECUTE command to resolve user-defined types
so that it gets the latest changes.

Release note (bug fix): Adding new values to a user-defined enum type
will previously would cause a prepared statement using that type to not
work. This now works as expected.
@rafiss rafiss force-pushed the fix-bind-enum-types branch from f9becb3 to 01ca1e5 Compare October 22, 2021 16:01
@rafiss
Copy link
Collaborator Author

rafiss commented Oct 22, 2021

flaky CI

tftr!

bors r=otan

@craig
Copy link
Contributor

craig bot commented Oct 22, 2021

Build succeeded:

@craig craig bot merged commit df06437 into cockroachdb:master Oct 22, 2021
@rafiss rafiss deleted the fix-bind-enum-types branch November 3, 2021 21:51
@nvanbenschoten
Copy link
Member

@rafiss I'm not familiar with this change, but I noticed in passing while looking at a few benchmarks that it made connExecutor.execBind more expensive. In a read-only, CPU-bound workload, execBind is now responsible for 4.2% of CPU utilization. Most of that can be attributed to the cost of creating a new short-lived transaction, which is a relatively heavy-weight operation compared to everything else in execBind.

Screen Shot 2021-12-30 at 6 21 17 PM

I know this was a bug fix and I unfortunately don't have any good suggestions (beyond #74350) to make this cheaper, but I figured I'd mention it here in case there were alternative fixes that didn't require a separate txn.

@rafiss
Copy link
Collaborator Author

rafiss commented Jan 4, 2022

One optimization potential here is that we only need to create/use a transaction if we end up calling ResolveTypeByOID -- that is, if bindCmd.Args is non-empty and contains user-defined types.

@rafiss
Copy link
Collaborator Author

rafiss commented Jan 6, 2022

@nvanbenschoten I made that optimization here: #74423

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: prepared statements against tables with enum types not invalidated when enum changes
5 participants