Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: add refcursor type #111392

Merged
merged 3 commits into from
Oct 5, 2023
Merged

sql: add refcursor type #111392

merged 3 commits into from
Oct 5, 2023

Conversation

DrewKimball
Copy link
Collaborator

@DrewKimball DrewKimball commented Sep 27, 2023

types: remove pg_lsn from postgresPredefinedTypeIssues

This patch removes pg_lsn from the list of predefined types in
postgres that aren't yet supported, since it can now be parsed.

Informs #105130

Release note: None

sql: check unsupported types during schema changes and type/function creation

It is necessary to check the cluster version when building an expression
that references a newly-added type, in order to ensure that all nodes
support that type. Previously, these checks were omitted for casts in
the vectorized engine, expressions for partial index predicates and
check constraints, function parameters and return types, and user-defined
composite types.

This patch adds version-checking for function/procedure parameters and
return types, as well as for user-defined types. It also augments the
type-checking logic for casts and type annotations with a version
check; this handles the remaining cases. The execution-time checks
for cast expressions are left untouched, just in case the new
type-checking logic misses important cases.

For now, these changes only apply to the PG_LSN type, which will
be new in 23.2. A future commit will add support for REFCURSOR, and
will need to use the same checks.

Informs #111560

Release note: None

sql: add support for REFCURSOR data type

This patch adds support for the REFCURSOR type. It is a special type
used to hold the name of a cursor for PLpgSQL, although it can be used
in other contexts (like function return types or column types). REFCURSOR
is a string under the hood, and behaves like the TEXT data type in most
cases.

REFCURSOR has no casts in the pg_cast table; instead, all of its
casts are the "IO" string casts. That means that there is an assignment
cast from every type to REFCURSOR, and an explicit cast from REFCURSOR
to every other type (except other string types, which are again assignment).
See ContextOriginAutomaticIOConversion in cast.go for more info.

This patch also adds REFCURSOR to the checks from the previous commit,
so that statements with REFCURSOR are disallowed unless all nodes
are running v23.2.

Fixes #111560

Release note (sql change): Added support for the REFCURSOR data type.
REFCURSOR is a special string type that is used to handle cursors.
PLpgSQL cursor declarations are required to use a variable of type
REFCURSOR, and the name of a cursor can be passed to and from a
PLpgSQL function or procedure.

@blathers-crl
Copy link

blathers-crl bot commented Sep 27, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@DrewKimball DrewKimball changed the title temp sql: add refcursor type Sep 27, 2023
@DrewKimball DrewKimball marked this pull request as ready for review October 2, 2023 10:12
@DrewKimball DrewKimball requested review from a team as code owners October 2, 2023 10:12
@DrewKimball DrewKimball requested review from herkolategan, smg260, cucaroach, a team, yuzefovich and michae2 and removed request for a team, herkolategan and smg260 October 2, 2023 10:12
Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice work on this. Just some questions and suggestions.

It doesn't have to be part of this PR, but during stability period let's try to add REFCURSOR to pkg/sql/sem/eval/cast_test/testdata/types.txt so that we can compare our casts to Postgres' casts of the type.

Reviewed 13 of 13 files at r1, 24 of 24 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach, @DrewKimball, and @yuzefovich)


pkg/sql/sem/tree/type_check.go line 3431 at r1 (raw file):

func CheckUnsupportedType(ctx context.Context, version clusterversion.Handle, typ *types.T) error {
	if version == nil {
		return nil

When will version be nil?


pkg/sql/logictest/testdata/logic_test/pg_lsn_mixed line 65 at r1 (raw file):


# ----------------------------------------------------------------------
# Verify that PG_LSN is not allowed after upgrading the gateway.

Nice upgrade testcases! Do you think it would be worth adding a testcase that successfully uses PG_LSN after upgrading all nodes and finalizing? (And same comment about the REFCURSOR upgrade testcases?) (I'm willing to accept the argument that this is already handled by the non-upgrade testcases.)


pkg/sql/logictest/testdata/logic_test/refcursor line 87 at r2 (raw file):

(100,bar)

# Function that returns REFCURSOR.

request: Could you also add a testcase with a function that returns a heterogeneous tuple of REFCURSOR and some other types? (Or maybe a composite type that contains REFCURSOR.)

Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't have to be part of this PR, but during stability period let's try to add REFCURSOR to pkg/sql/sem/eval/cast_test/testdata/types.txt so that we can compare our casts to Postgres' casts of the type.

Opened an issue: #111618

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cucaroach, @michae2, and @yuzefovich)


pkg/sql/sem/tree/type_check.go line 3431 at r1 (raw file):

Previously, michae2 (Michael Erickson) wrote…

When will version be nil?

Sometimes SemaContext is unset (especially for tests), and sometimes not all the fields are initialized. I added a comment.


pkg/sql/logictest/testdata/logic_test/pg_lsn_mixed line 65 at r1 (raw file):

Previously, michae2 (Michael Erickson) wrote…

Nice upgrade testcases! Do you think it would be worth adding a testcase that successfully uses PG_LSN after upgrading all nodes and finalizing? (And same comment about the REFCURSOR upgrade testcases?) (I'm willing to accept the argument that this is already handled by the non-upgrade testcases.)

I found that even with a retry, that test still occasionally fails with the "not supported" error. There's probably a way we could make it work, but it just doesn't seem worth it when we're already testing elsewhere.


pkg/sql/logictest/testdata/logic_test/refcursor line 87 at r2 (raw file):

Previously, michae2 (Michael Erickson) wrote…

request: Could you also add a testcase with a function that returns a heterogeneous tuple of REFCURSOR and some other types? (Or maybe a composite type that contains REFCURSOR.)

Good idea, Done.

@DrewKimball DrewKimball force-pushed the refcursor branch 3 times, most recently from 7ce3605 to 7f5bd00 Compare October 3, 2023 08:37
@DrewKimball
Copy link
Collaborator Author

@michae2 I had to refactor the SemaContext changes a bit to make bazel happy, could you take another look at that?

Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 34 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach, @DrewKimball, and @yuzefovich)


pkg/sql/sem/tree/type_check.go line 3438 at r3 (raw file):

// the current cluster version. If the given version handle is nil,
// CheckUnsupportedType returns nil.
func CheckUnsupportedType(ctx context.Context, semaCtx *SemaContext, typ *types.T) error {

I could be wrong, but it looks like we're always calling this sometime after a call to tree.ResolveType. Would it make sense for this check to be the responsibility of tree.ResolveType, instead of creating this new function? Perhaps by injecting the version into the TypeReferenceResolver (which semaCtx also has)?

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work figuring it all out! I have some nits and questions for your consideration. Perhaps it might be good to get a sign off from our Foundations friends too.

Reviewed 34 of 34 files at r3, 27 of 27 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach and @DrewKimball)


pkg/sql/catalog/colinfo/col_type_info.go line 81 at r4 (raw file):

			}
		}
		if t.Oid() == oid.T_refcursor {

Curious, do we not need the same check for pg_lsn type because it cannot be used in the column definition but refcursor can?


pkg/sql/opt/optbuilder/create_function.go line 310 at r3 (raw file):

	ctx context.Context, semaCtx *tree.SemaContext, expected *types.T, cols []scopeColumn,
) error {
	// The return type must be supported by the current cluster version.

nit: perhaps adjust the comment to also mention that all argument types must be supported too? (Is "cols" arguments to the function?)


pkg/sql/sem/tree/type_check.go line 3430 at r3 (raw file):

type UnsupportedTypeChecker interface {
	// CheckType returns an error if the given type is not supported by the
	// current cluster version. If the given version handle is nil, CheckType

nit: where is the "version handle" coming from? Perhaps mention explicitly that it's an argument to eval.NewUnsupportedTypeChecker.


pkg/sql/types/types.go line 2944 at r3 (raw file):

	"money":         41578,
	"path":          21286,
	"pg_lsn":        -1,

nit: this change seems out of place. IIUC we should've removed it earlier when we introduced support for pg_lsn type, so perhaps extract it into a separate commit?


pkg/sql/types/types.go line 739 at r4 (raw file):

	visibleQCHAR = 9

	visibleREFCURSOR = 202

Why do we need this? Did we allow refcursor type pre-23.2 and it had a different type representation that we need to upgrade now?

(Side note: it seems that we could easily remove all other visible* constants because the oldest version that might be using is 19.1, so the upgrades must have been performed, right?)


pkg/sql/logictest/testdata/logic_test/pg_lsn_mixed line 71 at r3 (raw file):


# ----------------------------------------------------------------------
# Verify that PG_LSN is not allowed after upgrading the gateway.

nit: shouldn't this file also check that once the cluster is upgraded, PG_LSN is allowed? Is that beyond the scope of mixed version tests?

@DrewKimball DrewKimball force-pushed the refcursor branch 4 times, most recently from 577da1e to 9868ce8 Compare October 4, 2023 09:01
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 34 of 34 files at r5, 12 of 12 files at r6, 25 of 25 files at r7, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach and @michae2)


pkg/sql/catalog/colinfo/col_type_info.go line 81 at r4 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

PG_LSN is checked below - it has its own family.

Indeed, nvm.

Copy link
Contributor

@ecwall ecwall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach, @DrewKimball, and @michae2)


pkg/sql/logictest/testdata/logic_test/mixed_version_refcursor line 1 at r7 (raw file):

# LogicTest: cockroach-go-testserver-upgrade-to-master

Can you split this and the other logic tests into subtests?

Also, it looks like some of the test cases are copied - can the logic test configs be merged together into 1 file so that it is harder for the test cases to drift apart in the future? I think we probably need some way of asserting different outcomes for the same statement based on which config ran.

Nice to see cross version tests BTW.

Copy link
Collaborator Author

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @cucaroach, @ecwall, @michae2, and @yuzefovich)


pkg/sql/logictest/testdata/logic_test/mixed_version_refcursor line 1 at r7 (raw file):
Thanks for taking a look.

Can you split this and the other logic tests into subtests?

Done.

Also, it looks like some of the test cases are copied - can the logic test configs be merged together into 1 file so that it is harder for the test cases to drift apart in the future? I think we probably need some way of asserting different outcomes for the same statement based on which config ran.

I tried moving the refcursor tests to the mixed_version_refcursor file after upgrading all nodes, but it caused the test to become flaky since the upgrade doesn't always propagate in time even with the retry directive.

Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice work!

Reviewed 2 of 27 files at r4, 34 of 34 files at r5, 8 of 12 files at r6, 26 of 26 files at r8, 25 of 25 files at r9, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @cucaroach and @ecwall)


pkg/sql/sem/tree/type_check.go line 3438 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

tree.ResolveType confusingly doesn't always go through the type resolver, and whether it does for the test cases seems to depend on the test configuration. I think it might be because we parse directly into *types.T. I could spend more time figuring it out, but I don't think it's worth it unless this method is really distressing.

Thanks for looking into it. I am fine with your current approach.

This patch removes pg_lsn from the list of predefined types in
postgres that aren't yet supported, since it can now be parsed.

Informs cockroachdb#105130

Release note: None
…creation

It is necessary to check the cluster version when building an expression
that references a newly-added type, in order to ensure that all nodes
support that type. Previously, these checks were omitted for casts in
the vectorized engine, expressions for partial index predicates and
check constraints, function parameters and return types, and user-defined
composite types.

This patch adds version-checking for function/procedure parameters and
return types, as well as for user-defined types. It also augments the
type-checking logic for casts and type annotations with a version
check; this handles the remaining cases. The execution-time checks
for cast expressions are left untouched, just in case the new
type-checking logic misses important cases.

For now, these changes only apply to the `PG_LSN` type, which will
be new in 23.2. A future commit will add support for `REFCURSOR`, and
will need to use the same checks.

Informs cockroachdb#111560

Release note: None
This patch adds support for the `REFCURSOR` type. It is a special type
used to hold the name of a cursor for PLpgSQL, although it can be used
in other contexts (like function return types or column types). `REFCURSOR`
is a string under the hood, and behaves like the `TEXT` data type in most
cases.

`REFCURSOR` has no casts in the `pg_cast` table; instead, all of its
casts are the "IO" string casts. That means that there is an assignment
cast from every type to `REFCURSOR`, and an explicit cast from `REFCURSOR`
to every other type (except other string types, which are again assignment).
See `ContextOriginAutomaticIOConversion` in `cast.go` for more info.

This patch also adds `REFCURSOR` to the checks from the previous commit,
so that statements with `REFCURSOR` are disallowed unless all nodes
are running v23.2.

Fixes cockroachdb#111560

Release note (sql change): Added support for the `REFCURSOR` data type.
`REFCURSOR` is a special string type that is used to handle cursors.
PLpgSQL cursor declarations are required to use a variable of type
`REFCURSOR`, and the name of a cursor can be passed to and from a
PLpgSQL function or procedure.
@DrewKimball
Copy link
Collaborator Author

TFTRs!

bors r+

DrewKimball added a commit to DrewKimball/cockroach that referenced this pull request Oct 5, 2023
This patch integrates the `REFCURSOR` data type (added in cockroachdb#111392)
with the PLpgSQL OPEN, CLOSE, FETCH, and MOVE statements, which now
require the cursor variable to have type `REFCURSOR` instead of `STRING`.
Attempts to use a variable of another type will result in an error
like the following:
```
variable "curs" must be of type cursor or refcursor
```

Informs cockroachdb#109709

Release note: None
@craig
Copy link
Contributor

craig bot commented Oct 5, 2023

Build succeeded:

@craig craig bot merged commit 9b72cc5 into cockroachdb:master Oct 5, 2023
3 checks passed
@DrewKimball DrewKimball deleted the refcursor branch October 5, 2023 06:46
@mgartner
Copy link
Collaborator

mgartner commented Oct 5, 2023

pkg/sql/sem/cast/cast_map.go line 643 at r12 (raw file):

		oidext.T_box2d: {MaxContext: ContextExplicit, origin: ContextOriginAutomaticIOConversion, Volatility: volatility.Immutable},
		oid.T_pg_lsn:   {MaxContext: ContextExplicit, origin: ContextOriginAutomaticIOConversion, Volatility: volatility.Immutable},
		oid.T_bytea:    {MaxContext: ContextExplicit, origin: ContextOriginAutomaticIOConversion, Volatility: volatility.Immutable},

None of these explicit casts are allowed by Postgres, AFAICT. We should remove them.

@mgartner
Copy link
Collaborator

mgartner commented Oct 5, 2023

pkg/sql/sem/cast/cast_map.go line 489 at r12 (raw file):

			origin:         ContextOriginAutomaticIOConversion,
			Volatility:     volatility.Stable,
			VolatilityHint: "INTERVAL to REFCURSOR casts depend on IntervalStyle; consider using to_char(interval)",

I'm not convinced there should be assignment casts from every type to the REFCURSOR type. For example, assigment casts from INTERVAL to REFCUROSR are not allowed in Postgres:

marcus=# CREATE TABLE t (r REFCURSOR);
CREATE TABLE

marcus=# INSERT INTO t VALUES ('1'::INTERVAL);
ERROR:  42804: column "r" is of type refcursor but expression is of type interval
LINE 1: insert into t values ('1'::INTERVAL);
                              ^
HINT:  You will need to rewrite or cast the expression.
LOCATION:  transformAssignedExpr, parse_target.c:594

In fact, assignment casts from any of the string-like types to REFCURSOR are also not supported by Postgres:

marcus=# INSERT INTO t VALUES ('1'::VARCHAR);
ERROR:  42804: column "r" is of type refcursor but expression is of type character varying

marcus=# INSERT INTO t VALUES ('1'::TEXT);
ERROR:  42804: column "r" is of type refcursor but expression is of type text

marcus=# INSERT INTO t VALUES ('1'::BPCHAR);
ERROR:  42804: column "r" is of type refcursor but expression is of type character

marcus=# INSERT INTO t VALUES ('a'::CHAR);
ERROR:  42804: column "r" is of type refcursor but expression is of type character

marcus=# INSERT INTO t VALUES ('a'::NAME);
ERROR:  42804: column "r" is of type refcursor but expression is of type name

The IO casts you're describing allow any type to be implicitly cast to the 5 string-like types:

marcus=# SELECT pg_typeof('a'::TEXT || '1'::REFCURSOR);
 pg_typeof
-----------
 text

So I think we should keep the REFCURSOR -> TEXT/VARCHAR/NAME/BPCHAR/CHAR casts and remove all the rest that were added in this commit.

@mgartner
Copy link
Collaborator

mgartner commented Oct 5, 2023

pkg/sql/types/types_test.go line 782 at r12 (raw file):

		{QChar, InternalType{Family: StringFamily, Oid: oid.T_char, Width: 1, VisibleType: visibleQCHAR}},
		{Name, InternalType{Family: name, Oid: oid.T_name}},
		{RefCursor, InternalType{Family: StringFamily, Oid: oid.T_refcursor}},

I'd argue that this should be a different type family. What's the motivation for re-using the StringFamily?

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Apologies for the late drive-by. I left some comments about the casts and type family.

Reviewed 34 of 34 files at r10, 12 of 12 files at r11, 25 of 25 files at r12, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale)

@DrewKimball
Copy link
Collaborator Author

In fact, assignment casts from any of the string-like types to REFCURSOR are also not supported by Postgres:

Even explicit casts aren't supported... I though I checked this, but clearly not.

I'd argue that this should be a different type family. What's the motivation for re-using the StringFamily?

I originally thought that REFCURSOR basically behaves like text, but it seems it's significantly more restrictive. I'll work on a fix.

DrewKimball added a commit to DrewKimball/cockroach that referenced this pull request Oct 5, 2023
This patch integrates the `REFCURSOR` data type (added in cockroachdb#111392)
with the PLpgSQL OPEN, CLOSE, FETCH, and MOVE statements, which now
require the cursor variable to have type `REFCURSOR` instead of `STRING`.
Attempts to use a variable of another type will result in an error
like the following:
```
variable "curs" must be of type cursor or refcursor
```

Informs cockroachdb#109709

Release note: None
craig bot pushed a commit that referenced this pull request Oct 6, 2023
111815: plpgsql: integrate PLpgSQL cursors with the REFCURSOR data type r=DrewKimball a=DrewKimball

#### plpgsql: integrate PLpgSQL cursors with the REFCURSOR data type

This patch integrates the `REFCURSOR` data type (added in #111392)
with the PLpgSQL OPEN, CLOSE, FETCH, and MOVE statements, which now
require the cursor variable to have type `REFCURSOR` instead of `STRING`.
Attempts to use a variable of another type will result in an error
like the following:
```
variable "curs" must be of type cursor or refcursor
```

Informs #109709

Release note: None

Co-authored-by: Drew Kimball <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: add support for the refcursor data type
6 participants