-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: computed column expressions incorrectly evaluated when dependent on "char" type #69327
Comments
If we look at the KV operations, we can see the problem:
|
Here's a reproduction that is further simplified:
|
Another incredible find by TLP. |
When we insert a STRING it is correctly sliced down to a single character during or before the optbuild step. But when we insert a NAME it is not:
|
It's possible to reproduce with types other than
There's some post-optimization step that truncates the insert value This explains why computed columns are required to reveal the problem. When the type of the computed column is not a type that requires truncation, the wrong value is inserted into a secondary index on the computed column. In v21.1.7, there seems to be another related bug with the
|
It looks like the "related bug" I mentioned above was fixed in #66422. |
I think the TLP error reported here is a duplicate of this issue. Here's a reduced reproduction to test once we have a fix:
|
Casts in Postgres are performed in one of three contexts [1]: 1. An explicit context with `CAST(x AS T)` or `x::T`. 2. An assignment context performed implicitly during an INSERT, UPSERT, or UPDATE. 3. An implicit context during the evaluation of an expression. For example the DATE in `'2021-01-02'::DATE < now()` will be implicitly cast to a TIMESTAMPTZ so that the values can be compared. Not all casts can be performed in all contexts. Postgres's pg_cast table lists valid casts and specifies the maximum cast context in which each can be performed. A cast with a max context of explicit can only be performed in an explicit context. A cast with an assignment max context can be performed in an explicit context or an assignment context. A cast with an implicit max context can be performed in all contexts. Much to my personal disappointment and frustration, there are valid casts that are not listed in Postgres's pg_cast table. These casts are called "automatic I/O conversions" and they allow casting most types to and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2]. We cannot determine these casts' maximum cast context from the pg_cast table, so we rely on the documentation which states that conversions to string types are assignment casts and conversions from string types are explicit casts [3]. -- This commit implements assignment casts for INSERTs. Follow up work will implement assignment casts for UPSERTs and UPDATEs. A cast performed in an assignment context has slightly different behavior than the same cast performed in an explicit context. In an assignment context, the cast will error if the width of the value is too large for the given type. In an explicit context, the value will be truncated to match the width. The one exception is assignment casts to the special "char" type which do truncate values. To support different cast behaviors for different contexts, a new built-in, `crdb_internal.assignment_cast` has been introduced. This function takes two arguments: a value and a type. Because SQL does not have first-class types, a type cannot be passed directly to the built-in. Instead, a `NULL` cast to a type is used as a workaround, similar to the `json_populate_record` built-in. For example, an integer can be assignment-cast to a string with: crdb_internal.assignment_cast(1::INT, NULL::STRING) Optbuilder is responsible for wrapping INSERT columns with the assignment cast built-in function. If an insert column type `T1` is not identical to the table's corresponding target column type `T2`, optbuilder will check if there is a valid cast from `T1` to `T2` with a maximum context that allows an assignment cast. If there is a such a cast, a projection will wrap the column in the assignment cast built-in function. If there is no such cast, a user error will be produced. The introduction of assignment casts fixes minor bugs and addresses some inconsistencies with Postgres's behavior. In general, INSERTS now successfully cast values to target table column types in more cases. As one example, inserting a string into an integer column now succeeds: CREATE TABLE t (i INT) INSERT INTO t VALUES ('1'::STRING) Prior to this commit there was logic that mimicked assignment casts, but it was not correct. Bugs in the implementation caused incorrect behavior when inserting into tables with computed columns. Most notably, a computed column expression that referenced another column `c` was evaluated with the value of `c` before the assignment cast was performed. This resulted in incorrect values for computed columns in some cases. In addition, assignment casts make the special logic for rounding decimal values in optbuilder obsolete. The builtin function `crdb_internal.round_decimal_values` and related logic in optbuilder will be removed once assignment casts are implemented for UPSERTs and UPDATEs. Fixes cockroachdb#69327 Fixes cockroachdb#69665 [1] https://www.postgresql.org/docs/current/typeconv.html [2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST [3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES Release note (sql change): Implicit casts performed during INSERT statements now more closely follow Postgres's behavior. Several minor bugs related to these types of casts have been fixed.
Casts in Postgres are performed in one of three contexts [1]: 1. An explicit context with `CAST(x AS T)` or `x::T`. 2. An assignment context performed implicitly during an INSERT, UPSERT, or UPDATE. 3. An implicit context during the evaluation of an expression. For example the DATE in `'2021-01-02'::DATE < now()` will be implicitly cast to a TIMESTAMPTZ so that the values can be compared. Not all casts can be performed in all contexts. Postgres's pg_cast table lists valid casts and specifies the maximum cast context in which each can be performed. A cast with a max context of explicit can only be performed in an explicit context. A cast with an assignment max context can be performed in an explicit context or an assignment context. A cast with an implicit max context can be performed in all contexts. Much to my personal disappointment and frustration, there are valid casts that are not listed in Postgres's pg_cast table. These casts are called "automatic I/O conversions" and they allow casting most types to and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2]. We cannot determine these casts' maximum cast context from the pg_cast table, so we rely on the documentation which states that conversions to string types are assignment casts and conversions from string types are explicit casts [3]. -- This commit implements assignment casts for INSERTs. Follow up work will implement assignment casts for UPSERTs and UPDATEs. A cast performed in an assignment context has slightly different behavior than the same cast performed in an explicit context. In an assignment context, the cast will error if the width of the value is too large for the given type. In an explicit context, the value will be truncated to match the width. The one exception is assignment casts to the special "char" type which do truncate values. To support different cast behaviors for different contexts, a new built-in, `crdb_internal.assignment_cast` has been introduced. This function takes two arguments: a value and a type. Because SQL does not have first-class types, a type cannot be passed directly to the built-in. Instead, a `NULL` cast to a type is used as a workaround, similar to the `json_populate_record` built-in. For example, an integer can be assignment-cast to a string with: crdb_internal.assignment_cast(1::INT, NULL::STRING) Optbuilder is responsible for wrapping INSERT columns with the assignment cast built-in function. If an insert column type `T1` is not identical to the table's corresponding target column type `T2`, optbuilder will check if there is a valid cast from `T1` to `T2` with a maximum context that allows an assignment cast. If there is a such a cast, a projection will wrap the column in the assignment cast built-in function. If there is no such cast, a user error will be produced. The introduction of assignment casts fixes minor bugs and addresses some inconsistencies with Postgres's behavior. In general, INSERTS now successfully cast values to target table column types in more cases. As one example, inserting a string into an integer column now succeeds: CREATE TABLE t (i INT) INSERT INTO t VALUES ('1'::STRING) Prior to this commit there was logic that mimicked assignment casts, but it was not correct. Bugs in the implementation caused incorrect behavior when inserting into tables with computed columns. Most notably, a computed column expression that referenced another column `c` was evaluated with the value of `c` before the assignment cast was performed. This resulted in incorrect values for computed columns in some cases. In addition, assignment casts make the special logic for rounding decimal values in optbuilder obsolete. The builtin function `crdb_internal.round_decimal_values` and related logic in optbuilder will be removed once assignment casts are implemented for UPSERTs and UPDATEs. Fixes cockroachdb#69327 Fixes cockroachdb#69665 [1] https://www.postgresql.org/docs/current/typeconv.html [2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST [3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES Release note (sql change): Implicit casts performed during INSERT statements now more closely follow Postgres's behavior. Several minor bugs related to these types of casts have been fixed.
Casts in Postgres are performed in one of three contexts [1]: 1. An explicit context with `CAST(x AS T)` or `x::T`. 2. An assignment context performed implicitly during an INSERT, UPSERT, or UPDATE. 3. An implicit context during the evaluation of an expression. For example the DATE in `'2021-01-02'::DATE < now()` will be implicitly cast to a TIMESTAMPTZ so that the values can be compared. Not all casts can be performed in all contexts. Postgres's pg_cast table lists valid casts and specifies the maximum cast context in which each can be performed. A cast with a max context of explicit can only be performed in an explicit context. A cast with an assignment max context can be performed in an explicit context or an assignment context. A cast with an implicit max context can be performed in all contexts. Much to my personal disappointment and frustration, there are valid casts that are not listed in Postgres's pg_cast table. These casts are called "automatic I/O conversions" and they allow casting most types to and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2]. We cannot determine these casts' maximum cast context from the pg_cast table, so we rely on the documentation which states that conversions to string types are assignment casts and conversions from string types are explicit casts [3]. -- This commit implements assignment casts for INSERTs. Follow up work will implement assignment casts for UPSERTs and UPDATEs. A cast performed in an assignment context has slightly different behavior than the same cast performed in an explicit context. In an assignment context, the cast will error if the width of the value is too large for the given type. In an explicit context, the value will be truncated to match the width. The one exception is assignment casts to the special "char" type which do truncate values. To support different cast behaviors for different contexts, a new built-in, `crdb_internal.assignment_cast` has been introduced. This function takes two arguments: a value and a type. Because SQL does not have first-class types, a type cannot be passed directly to the built-in. Instead, a `NULL` cast to a type is used as a workaround, similar to the `json_populate_record` built-in. For example, an integer can be assignment-cast to a string with: crdb_internal.assignment_cast(1::INT, NULL::STRING) Optbuilder is responsible for wrapping INSERT columns with the assignment cast built-in function. If an insert column type `T1` is not identical to the table's corresponding target column type `T2`, optbuilder will check if there is a valid cast from `T1` to `T2` with a maximum context that allows an assignment cast. If there is a such a cast, a projection will wrap the column in the assignment cast built-in function. If there is no such cast, a user error will be produced. The introduction of assignment casts fixes minor bugs and addresses some inconsistencies with Postgres's behavior. In general, INSERTS now successfully cast values to target table column types in more cases. As one example, inserting a string into an integer column now succeeds: CREATE TABLE t (i INT) INSERT INTO t VALUES ('1'::STRING) Prior to this commit there was logic that mimicked assignment casts, but it was not correct. Bugs in the implementation caused incorrect behavior when inserting into tables with computed columns. Most notably, a computed column expression that referenced another column `c` was evaluated with the value of `c` before the assignment cast was performed. This resulted in incorrect values for computed columns in some cases. In addition, assignment casts make the special logic for rounding decimal values in optbuilder obsolete. The builtin function `crdb_internal.round_decimal_values` and related logic in optbuilder will be removed once assignment casts are implemented for UPSERTs and UPDATEs. Fixes cockroachdb#69327 Fixes cockroachdb#69665 [1] https://www.postgresql.org/docs/current/typeconv.html [2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST [3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES Release note (sql change): Implicit casts performed during INSERT statements now more closely follow Postgres's behavior. Several minor bugs related to these types of casts have been fixed.
Casts in Postgres are performed in one of three contexts [1]: 1. An explicit context with `CAST(x AS T)` or `x::T`. 2. An assignment context performed implicitly during an INSERT, UPSERT, or UPDATE. 3. An implicit context during the evaluation of an expression. For example the DATE in `'2021-01-02'::DATE < now()` will be implicitly cast to a TIMESTAMPTZ so that the values can be compared. Not all casts can be performed in all contexts. Postgres's pg_cast table lists valid casts and specifies the maximum cast context in which each can be performed. A cast with a max context of explicit can only be performed in an explicit context. A cast with an assignment max context can be performed in an explicit context or an assignment context. A cast with an implicit max context can be performed in all contexts. Much to my personal disappointment and frustration, there are valid casts that are not listed in Postgres's pg_cast table. These casts are called "automatic I/O conversions" and they allow casting most types to and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2]. We cannot determine these casts' maximum cast context from the pg_cast table, so we rely on the documentation which states that conversions to string types are assignment casts and conversions from string types are explicit casts [3]. -- This commit implements assignment casts for INSERTs. Follow up work will implement assignment casts for UPSERTs and UPDATEs. A cast performed in an assignment context has slightly different behavior than the same cast performed in an explicit context. In an assignment context, the cast will error if the width of the value is too large for the given type. In an explicit context, the value will be truncated to match the width. The one exception is assignment casts to the special "char" type which do truncate values. To support different cast behaviors for different contexts, a new built-in, `crdb_internal.assignment_cast` has been introduced. This function takes two arguments: a value and a type. Because SQL does not have first-class types, a type cannot be passed directly to the built-in. Instead, a `NULL` cast to a type is used as a workaround, similar to the `json_populate_record` built-in. For example, an integer can be assignment-cast to a string with: crdb_internal.assignment_cast(1::INT, NULL::STRING) Optbuilder is responsible for wrapping INSERT columns with the assignment cast built-in function. If an insert column type `T1` is not identical to the table's corresponding target column type `T2`, optbuilder will check if there is a valid cast from `T1` to `T2` with a maximum context that allows an assignment cast. If there is a such a cast, a projection will wrap the column in the assignment cast built-in function. If there is no such cast, a user error will be produced. The introduction of assignment casts fixes minor bugs and addresses some inconsistencies with Postgres's behavior. In general, INSERTS now successfully cast values to target table column types in more cases. As one example, inserting a string into an integer column now succeeds: CREATE TABLE t (i INT) INSERT INTO t VALUES ('1'::STRING) Prior to this commit there was logic that mimicked assignment casts, but it was not correct. Bugs in the implementation caused incorrect behavior when inserting into tables with computed columns. Most notably, a computed column expression that referenced another column `c` was evaluated with the value of `c` before the assignment cast was performed. This resulted in incorrect values for computed columns in some cases. In addition, assignment casts make the special logic for rounding decimal values in optbuilder obsolete. The builtin function `crdb_internal.round_decimal_values` and related logic in optbuilder will be removed once assignment casts are implemented for UPSERTs and UPDATEs. Fixes cockroachdb#69327 Fixes cockroachdb#69665 [1] https://www.postgresql.org/docs/current/typeconv.html [2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST [3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES Release note (sql change): Implicit casts performed during INSERT statements now more closely follow Postgres's behavior. Several minor bugs related to these types of casts have been fixed.
Casts in Postgres are performed in one of three contexts [1]: 1. An explicit context with `CAST(x AS T)` or `x::T`. 2. An assignment context performed implicitly during an INSERT, UPSERT, or UPDATE. 3. An implicit context during the evaluation of an expression. For example the DATE in `'2021-01-02'::DATE < now()` will be implicitly cast to a TIMESTAMPTZ so that the values can be compared. Not all casts can be performed in all contexts. Postgres's pg_cast table lists valid casts and specifies the maximum cast context in which each can be performed. A cast with a max context of explicit can only be performed in an explicit context. A cast with an assignment max context can be performed in an explicit context or an assignment context. A cast with an implicit max context can be performed in all contexts. Much to my personal disappointment and frustration, there are valid casts that are not listed in Postgres's pg_cast table. These casts are called "automatic I/O conversions" and they allow casting most types to and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2]. We cannot determine these casts' maximum cast context from the pg_cast table, so we rely on the documentation which states that conversions to string types are assignment casts and conversions from string types are explicit casts [3]. -- This commit implements assignment casts for INSERTs. Follow up work will implement assignment casts for UPSERTs and UPDATEs. A cast performed in an assignment context has slightly different behavior than the same cast performed in an explicit context. In an assignment context, the cast will error if the width of the value is too large for the given type. In an explicit context, the value will be truncated to match the width. The one exception is assignment casts to the special "char" type which do truncate values. To support different cast behaviors for different contexts, a new built-in, `crdb_internal.assignment_cast` has been introduced. This function takes two arguments: a value and a type. Because SQL does not have first-class types, a type cannot be passed directly to the built-in. Instead, a `NULL` cast to a type is used as a workaround, similar to the `json_populate_record` built-in. For example, an integer can be assignment-cast to a string with: crdb_internal.assignment_cast(1::INT, NULL::STRING) Optbuilder is responsible for wrapping INSERT columns with the assignment cast built-in function. If an insert column type `T1` is not identical to the table's corresponding target column type `T2`, optbuilder will check if there is a valid cast from `T1` to `T2` with a maximum context that allows an assignment cast. If there is a such a cast, a projection will wrap the column in the assignment cast built-in function. If there is no such cast, a user error will be produced. The introduction of assignment casts fixes minor bugs and addresses some inconsistencies with Postgres's behavior. In general, INSERTS now successfully cast values to target table column types in more cases. As one example, inserting a string into an integer column now succeeds: CREATE TABLE t (i INT) INSERT INTO t VALUES ('1'::STRING) Prior to this commit there was logic that mimicked assignment casts, but it was not correct. Bugs in the implementation caused incorrect behavior when inserting into tables with computed columns. Most notably, a computed column expression that referenced another column `c` was evaluated with the value of `c` before the assignment cast was performed. This resulted in incorrect values for computed columns in some cases. In addition, assignment casts make the special logic for rounding decimal values in optbuilder obsolete. The builtin function `crdb_internal.round_decimal_values` and related logic in optbuilder will be removed once assignment casts are implemented for UPSERTs and UPDATEs. Fixes cockroachdb#69327 Fixes cockroachdb#69665 [1] https://www.postgresql.org/docs/current/typeconv.html [2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST [3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES Release note (sql change): Implicit casts performed during INSERT statements now more closely follow Postgres's behavior. Several minor bugs related to these types of casts have been fixed.
Casts in Postgres are performed in one of three contexts [1]: 1. An explicit context with `CAST(x AS T)` or `x::T`. 2. An assignment context performed implicitly during an INSERT, UPSERT, or UPDATE. 3. An implicit context during the evaluation of an expression. For example the DATE in `'2021-01-02'::DATE < now()` will be implicitly cast to a TIMESTAMPTZ so that the values can be compared. Not all casts can be performed in all contexts. Postgres's pg_cast table lists valid casts and specifies the maximum cast context in which each can be performed. A cast with a max context of explicit can only be performed in an explicit context. A cast with an assignment max context can be performed in an explicit context or an assignment context. A cast with an implicit max context can be performed in all contexts. Much to my personal disappointment and frustration, there are valid casts that are not listed in Postgres's pg_cast table. These casts are called "automatic I/O conversions" and they allow casting most types to and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2]. We cannot determine these casts' maximum cast context from the pg_cast table, so we rely on the documentation which states that conversions to string types are assignment casts and conversions from string types are explicit casts [3]. -- This commit implements assignment casts for INSERTs. Follow up work will implement assignment casts for UPSERTs and UPDATEs. A cast performed in an assignment context has slightly different behavior than the same cast performed in an explicit context. In an assignment context, the cast will error if the width of the value is too large for the given type. In an explicit context, the value will be truncated to match the width. The one exception is assignment casts to the special "char" type which do truncate values. To support different cast behaviors for different contexts, a new built-in, `crdb_internal.assignment_cast` has been introduced. This function takes two arguments: a value and a type. Because SQL does not have first-class types, a type cannot be passed directly to the built-in. Instead, a `NULL` cast to a type is used as a workaround, similar to the `json_populate_record` built-in. For example, an integer can be assignment-cast to a string with: crdb_internal.assignment_cast(1::INT, NULL::STRING) The optimizer is responsible for wrapping INSERT columns with the assignment cast built-in function. If an insert column type `T1` is not identical to the table's corresponding target column type `T2`, the optimizer will check if there is a valid cast from `T1` to `T2` with a maximum context that allows an assignment cast. If there is a such a cast, a projection will wrap the column in the assignment cast built-in function. If there is no such cast, a user error will be produced. Some changes to prepared statement placeholder type inference were required in order to better match Postgres's behavior (this is a best-effort match thus far and there are still inconsistencies). Most notably, widths and precision are no longer inferred for the types of placeholders. The effect of this is that assignment casts will be correctly added by the optimizer in order to make sure that values for placeholders are correctly coerced to the target column type during execution of a prepared insert. The introduction of assignment casts fixes minor bugs and addresses some inconsistencies with Postgres's behavior. In general, INSERTS now successfully cast values to target table column types in more cases. As one example, inserting a string into an integer column now succeeds: CREATE TABLE t (i INT) INSERT INTO t VALUES ('1'::STRING) Prior to this commit there was logic that mimicked assignment casts, but it was not correct. Bugs in the implementation caused incorrect behavior when inserting into tables with computed columns. Most notably, a computed column expression that referenced another column `c` was evaluated with the value of `c` before the assignment cast was performed. This resulted in incorrect values for computed columns in some cases. In addition, assignment casts make the special logic for rounding decimal values in optbuilder obsolete. The builtin function `crdb_internal.round_decimal_values` and related logic in optbuilder will be removed once assignment casts are implemented for UPSERTs and UPDATEs. Fixes cockroachdb#69327 Fixes cockroachdb#69665 [1] https://www.postgresql.org/docs/current/typeconv.html [2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST [3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES Release note (sql change): Implicit casts performed during INSERT statements now more closely follow Postgres's behavior. Several minor bugs related to these types of casts have been fixed.
Casts in Postgres are performed in one of three contexts [1]: 1. An explicit context with `CAST(x AS T)` or `x::T`. 2. An assignment context performed implicitly during an INSERT, UPSERT, or UPDATE. 3. An implicit context during the evaluation of an expression. For example the DATE in `'2021-01-02'::DATE < now()` will be implicitly cast to a TIMESTAMPTZ so that the values can be compared. Not all casts can be performed in all contexts. Postgres's pg_cast table lists valid casts and specifies the maximum cast context in which each can be performed. A cast with a max context of explicit can only be performed in an explicit context. A cast with an assignment max context can be performed in an explicit context or an assignment context. A cast with an implicit max context can be performed in all contexts. Much to my personal disappointment and frustration, there are valid casts that are not listed in Postgres's pg_cast table. These casts are called "automatic I/O conversions" and they allow casting most types to and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2]. We cannot determine these casts' maximum cast context from the pg_cast table, so we rely on the documentation which states that conversions to string types are assignment casts and conversions from string types are explicit casts [3]. -- This commit implements assignment casts for INSERTs. Follow up work will implement assignment casts for UPSERTs and UPDATEs. A cast performed in an assignment context has slightly different behavior than the same cast performed in an explicit context. In an assignment context, the cast will error if the width of the value is too large for the given type. In an explicit context, the value will be truncated to match the width. The one exception is assignment casts to the special "char" type which do truncate values. To support different cast behaviors for different contexts, a new built-in, `crdb_internal.assignment_cast` has been introduced. This function takes two arguments: a value and a type. Because SQL does not have first-class types, a type cannot be passed directly to the built-in. Instead, a `NULL` cast to a type is used as a workaround, similar to the `json_populate_record` built-in. For example, an integer can be assignment-cast to a string with: crdb_internal.assignment_cast(1::INT, NULL::STRING) The optimizer is responsible for wrapping INSERT columns with the assignment cast built-in function. If an insert column type `T1` is not identical to the table's corresponding target column type `T2`, the optimizer will check if there is a valid cast from `T1` to `T2` with a maximum context that allows an assignment cast. If there is a such a cast, a projection will wrap the column in the assignment cast built-in function. If there is no such cast, a user error will be produced. Some changes to prepared statement placeholder type inference were required in order to better match Postgres's behavior (this is a best-effort match thus far and there are still inconsistencies). Most notably, widths and precision are no longer inferred for the types of placeholders. The effect of this is that assignment casts will be correctly added by the optimizer in order to make sure that values for placeholders are correctly coerced to the target column type during execution of a prepared insert. The introduction of assignment casts fixes minor bugs and addresses some inconsistencies with Postgres's behavior. In general, INSERTS now successfully cast values to target table column types in more cases. As one example, inserting a string into an integer column now succeeds: CREATE TABLE t (i INT) INSERT INTO t VALUES ('1'::STRING) Prior to this commit there was logic that mimicked assignment casts, but it was not correct. Bugs in the implementation caused incorrect behavior when inserting into tables with computed columns. Most notably, a computed column expression that referenced another column `c` was evaluated with the value of `c` before the assignment cast was performed. This resulted in incorrect values for computed columns in some cases. In addition, assignment casts make the special logic for rounding decimal values in optbuilder obsolete. The builtin function `crdb_internal.round_decimal_values` and related logic in optbuilder will be removed once assignment casts are implemented for UPSERTs and UPDATEs. Fixes cockroachdb#69327 Fixes cockroachdb#69665 [1] https://www.postgresql.org/docs/current/typeconv.html [2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST [3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES Release note (sql change): Implicit casts performed during INSERT statements now more closely follow Postgres's behavior. Several minor bugs related to these types of casts have been fixed.
Casts in Postgres are performed in one of three contexts [1]: 1. An explicit context with `CAST(x AS T)` or `x::T`. 2. An assignment context performed implicitly during an INSERT, UPSERT, or UPDATE. 3. An implicit context during the evaluation of an expression. For example the DATE in `'2021-01-02'::DATE < now()` will be implicitly cast to a TIMESTAMPTZ so that the values can be compared. Not all casts can be performed in all contexts. Postgres's pg_cast table lists valid casts and specifies the maximum cast context in which each can be performed. A cast with a max context of explicit can only be performed in an explicit context. A cast with an assignment max context can be performed in an explicit context or an assignment context. A cast with an implicit max context can be performed in all contexts. Much to my personal disappointment and frustration, there are valid casts that are not listed in Postgres's pg_cast table. These casts are called "automatic I/O conversions" and they allow casting most types to and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2]. We cannot determine these casts' maximum cast context from the pg_cast table, so we rely on the documentation which states that conversions to string types are assignment casts and conversions from string types are explicit casts [3]. -- This commit implements assignment casts for INSERTs. Follow up work will implement assignment casts for UPSERTs and UPDATEs. A cast performed in an assignment context has slightly different behavior than the same cast performed in an explicit context. In an assignment context, the cast will error if the width of the value is too large for the given type. In an explicit context, the value will be truncated to match the width. The one exception is assignment casts to the special "char" type which do truncate values. To support different cast behaviors for different contexts, a new built-in, `crdb_internal.assignment_cast` has been introduced. This function takes two arguments: a value and a type. Because SQL does not have first-class types, a type cannot be passed directly to the built-in. Instead, a `NULL` cast to a type is used as a workaround, similar to the `json_populate_record` built-in. For example, an integer can be assignment-cast to a string with: crdb_internal.assignment_cast(1::INT, NULL::STRING) The optimizer is responsible for wrapping INSERT columns with the assignment cast built-in function. If an insert column type `T1` is not identical to the table's corresponding target column type `T2`, the optimizer will check if there is a valid cast from `T1` to `T2` with a maximum context that allows an assignment cast. If there is a such a cast, a projection will wrap the column in the assignment cast built-in function. If there is no such cast, a user error will be produced. Some changes to prepared statement placeholder type inference were required in order to better match Postgres's behavior (this is a best-effort match thus far and there are still inconsistencies). Most notably, widths and precision are no longer inferred for the types of placeholders. The effect of this is that assignment casts will be correctly added by the optimizer in order to make sure that values for placeholders are correctly coerced to the target column type during execution of a prepared insert. The introduction of assignment casts fixes minor bugs and addresses some inconsistencies with Postgres's behavior. In general, INSERTS now successfully cast values to target table column types in more cases. As one example, inserting a string into an integer column now succeeds: CREATE TABLE t (i INT) INSERT INTO t VALUES ('1'::STRING) Prior to this commit there was logic that mimicked assignment casts, but it was not correct. Bugs in the implementation caused incorrect behavior when inserting into tables with computed columns. Most notably, a computed column expression that referenced another column `c` was evaluated with the value of `c` before the assignment cast was performed. This resulted in incorrect values for computed columns in some cases. In addition, assignment casts make the special logic for rounding decimal values in optbuilder obsolete. The builtin function `crdb_internal.round_decimal_values` and related logic in optbuilder will be removed once assignment casts are implemented for UPSERTs and UPDATEs. Fixes cockroachdb#69327 Fixes cockroachdb#69665 [1] https://www.postgresql.org/docs/current/typeconv.html [2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST [3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES Release note (sql change): Implicit casts performed during INSERT statements now more closely follow Postgres's behavior. Several minor bugs related to these types of casts have been fixed.
Casts in Postgres are performed in one of three contexts [1]: 1. An explicit context with `CAST(x AS T)` or `x::T`. 2. An assignment context performed implicitly during an INSERT, UPSERT, or UPDATE. 3. An implicit context during the evaluation of an expression. For example the DATE in `'2021-01-02'::DATE < now()` will be implicitly cast to a TIMESTAMPTZ so that the values can be compared. Not all casts can be performed in all contexts. Postgres's pg_cast table lists valid casts and specifies the maximum cast context in which each can be performed. A cast with a max context of explicit can only be performed in an explicit context. A cast with an assignment max context can be performed in an explicit context or an assignment context. A cast with an implicit max context can be performed in all contexts. Much to my personal disappointment and frustration, there are valid casts that are not listed in Postgres's pg_cast table. These casts are called "automatic I/O conversions" and they allow casting most types to and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2]. We cannot determine these casts' maximum cast context from the pg_cast table, so we rely on the documentation which states that conversions to string types are assignment casts and conversions from string types are explicit casts [3]. -- This commit implements assignment casts for INSERTs. Follow up work will implement assignment casts for UPSERTs and UPDATEs. A cast performed in an assignment context has slightly different behavior than the same cast performed in an explicit context. In an assignment context, the cast will error if the width of the value is too large for the given type. In an explicit context, the value will be truncated to match the width. The one exception is assignment casts to the special "char" type which do truncate values. To support different cast behaviors for different contexts, a new built-in, `crdb_internal.assignment_cast` has been introduced. This function takes two arguments: a value and a type. Because SQL does not have first-class types, a type cannot be passed directly to the built-in. Instead, a `NULL` cast to a type is used as a workaround, similar to the `json_populate_record` built-in. For example, an integer can be assignment-cast to a string with: crdb_internal.assignment_cast(1::INT, NULL::STRING) The optimizer is responsible for wrapping INSERT columns with the assignment cast built-in function. If an insert column type `T1` is not identical to the table's corresponding target column type `T2`, the optimizer will check if there is a valid cast from `T1` to `T2` with a maximum context that allows an assignment cast. If there is a such a cast, a projection will wrap the column in the assignment cast built-in function. If there is no such cast, a user error will be produced. Some changes to prepared statement placeholder type inference were required in order to better match Postgres's behavior (this is a best-effort match thus far and there are still inconsistencies). Most notably, widths and precision are no longer inferred for the types of placeholders. The effect of this is that assignment casts will be correctly added by the optimizer in order to make sure that values for placeholders are correctly coerced to the target column type during execution of a prepared insert. The introduction of assignment casts fixes minor bugs and addresses some inconsistencies with Postgres's behavior. In general, INSERTS now successfully cast values to target table column types in more cases. As one example, inserting a string into an integer column now succeeds: CREATE TABLE t (i INT) INSERT INTO t VALUES ('1'::STRING) Prior to this commit there was logic that mimicked assignment casts, but it was not correct. Bugs in the implementation caused incorrect behavior when inserting into tables with computed columns. Most notably, a computed column expression that referenced another column `c` was evaluated with the value of `c` before the assignment cast was performed. This resulted in incorrect values for computed columns in some cases. In addition, assignment casts make the special logic for rounding decimal values in optbuilder obsolete. The builtin function `crdb_internal.round_decimal_values` and related logic in optbuilder will be removed once assignment casts are implemented for UPSERTs and UPDATEs. Fixes cockroachdb#69327 Fixes cockroachdb#69665 [1] https://www.postgresql.org/docs/current/typeconv.html [2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST [3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES Release note (sql change): Implicit casts performed during INSERT statements now more closely follow Postgres's behavior. Several minor bugs related to these types of casts have been fixed.
Casts in Postgres are performed in one of three contexts [1]: 1. An explicit context with `CAST(x AS T)` or `x::T`. 2. An assignment context performed implicitly during an INSERT, UPSERT, or UPDATE. 3. An implicit context during the evaluation of an expression. For example the DATE in `'2021-01-02'::DATE < now()` will be implicitly cast to a TIMESTAMPTZ so that the values can be compared. Not all casts can be performed in all contexts. Postgres's pg_cast table lists valid casts and specifies the maximum cast context in which each can be performed. A cast with a max context of explicit can only be performed in an explicit context. A cast with an assignment max context can be performed in an explicit context or an assignment context. A cast with an implicit max context can be performed in all contexts. Much to my personal disappointment and frustration, there are valid casts that are not listed in Postgres's pg_cast table. These casts are called "automatic I/O conversions" and they allow casting most types to and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2]. We cannot determine these casts' maximum cast context from the pg_cast table, so we rely on the documentation which states that conversions to string types are assignment casts and conversions from string types are explicit casts [3]. -- This commit implements assignment casts for INSERTs. Follow up work will implement assignment casts for UPSERTs and UPDATEs. A cast performed in an assignment context has slightly different behavior than the same cast performed in an explicit context. In an assignment context, the cast will error if the width of the value is too large for the given type. In an explicit context, the value will be truncated to match the width. The one exception is assignment casts to the special "char" type which do truncate values. To support different cast behaviors for different contexts, a new built-in, `crdb_internal.assignment_cast` has been introduced. This function takes two arguments: a value and a type. Because SQL does not have first-class types, a type cannot be passed directly to the built-in. Instead, a `NULL` cast to a type is used as a workaround, similar to the `json_populate_record` built-in. For example, an integer can be assignment-cast to a string with: crdb_internal.assignment_cast(1::INT, NULL::STRING) The optimizer is responsible for wrapping INSERT columns with the assignment cast built-in function. If an insert column type `T1` is not identical to the table's corresponding target column type `T2`, the optimizer will check if there is a valid cast from `T1` to `T2` with a maximum context that allows an assignment cast. If there is a such a cast, a projection will wrap the column in the assignment cast built-in function. If there is no such cast, a user error will be produced. Some changes to prepared statement placeholder type inference were required in order to better match Postgres's behavior (this is a best-effort match thus far and there are still inconsistencies). Most notably, widths and precision are no longer inferred for the types of placeholders. The effect of this is that assignment casts will be correctly added by the optimizer in order to make sure that values for placeholders are correctly coerced to the target column type during execution of a prepared insert. The introduction of assignment casts fixes minor bugs and addresses some inconsistencies with Postgres's behavior. In general, INSERTS now successfully cast values to target table column types in more cases. As one example, inserting a string into an integer column now succeeds: CREATE TABLE t (i INT) INSERT INTO t VALUES ('1'::STRING) Prior to this commit there was logic that mimicked assignment casts, but it was not correct. Bugs in the implementation caused incorrect behavior when inserting into tables with computed columns. Most notably, a computed column expression that referenced another column `c` was evaluated with the value of `c` before the assignment cast was performed. This resulted in incorrect values for computed columns in some cases. In addition, assignment casts make the special logic for rounding decimal values in optbuilder obsolete. The builtin function `crdb_internal.round_decimal_values` and related logic in optbuilder will be removed once assignment casts are implemented for UPSERTs and UPDATEs. Fixes cockroachdb#69327 Fixes cockroachdb#69665 [1] https://www.postgresql.org/docs/current/typeconv.html [2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST [3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES Release note (sql change): Implicit casts performed during INSERT statements now more closely follow Postgres's behavior. Several minor bugs related to these types of casts have been fixed.
Casts in Postgres are performed in one of three contexts [1]: 1. An explicit context with `CAST(x AS T)` or `x::T`. 2. An assignment context performed implicitly during an INSERT, UPSERT, or UPDATE. 3. An implicit context during the evaluation of an expression. For example the DATE in `'2021-01-02'::DATE < now()` will be implicitly cast to a TIMESTAMPTZ so that the values can be compared. Not all casts can be performed in all contexts. Postgres's pg_cast table lists valid casts and specifies the maximum cast context in which each can be performed. A cast with a max context of explicit can only be performed in an explicit context. A cast with an assignment max context can be performed in an explicit context or an assignment context. A cast with an implicit max context can be performed in all contexts. Much to my personal disappointment and frustration, there are valid casts that are not listed in Postgres's pg_cast table. These casts are called "automatic I/O conversions" and they allow casting most types to and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2]. We cannot determine these casts' maximum cast context from the pg_cast table, so we rely on the documentation which states that conversions to string types are assignment casts and conversions from string types are explicit casts [3]. -- This commit implements assignment casts for INSERTs. Follow up work will implement assignment casts for UPSERTs and UPDATEs. A cast performed in an assignment context has slightly different behavior than the same cast performed in an explicit context. In an assignment context, the cast will error if the width of the value is too large for the given type. In an explicit context, the value will be truncated to match the width. The one exception is assignment casts to the special "char" type which do truncate values. To support different cast behaviors for different contexts, a new built-in, `crdb_internal.assignment_cast` has been introduced. This function takes two arguments: a value and a type. Because SQL does not have first-class types, a type cannot be passed directly to the built-in. Instead, a `NULL` cast to a type is used as a workaround, similar to the `json_populate_record` built-in. For example, an integer can be assignment-cast to a string with: crdb_internal.assignment_cast(1::INT, NULL::STRING) The optimizer is responsible for wrapping INSERT columns with the assignment cast built-in function. If an insert column type `T1` is not identical to the table's corresponding target column type `T2`, the optimizer will check if there is a valid cast from `T1` to `T2` with a maximum context that allows an assignment cast. If there is a such a cast, a projection will wrap the column in the assignment cast built-in function. If there is no such cast, a user error will be produced. Some changes to prepared statement placeholder type inference were required in order to better match Postgres's behavior (this is a best-effort match thus far and there are still inconsistencies). Most notably, widths and precision are no longer inferred for the types of placeholders. The effect of this is that assignment casts will be correctly added by the optimizer in order to make sure that values for placeholders are correctly coerced to the target column type during execution of a prepared insert. The introduction of assignment casts fixes minor bugs and addresses some inconsistencies with Postgres's behavior. In general, INSERTS now successfully cast values to target table column types in more cases. As one example, inserting a string into an integer column now succeeds: CREATE TABLE t (i INT) INSERT INTO t VALUES ('1'::STRING) Prior to this commit there was logic that mimicked assignment casts, but it was not correct. Bugs in the implementation caused incorrect behavior when inserting into tables with computed columns. Most notably, a computed column expression that referenced another column `c` was evaluated with the value of `c` before the assignment cast was performed. This resulted in incorrect values for computed columns in some cases. In addition, assignment casts make the special logic for rounding decimal values in optbuilder obsolete. The builtin function `crdb_internal.round_decimal_values` and related logic in optbuilder will be removed once assignment casts are implemented for UPSERTs and UPDATEs. Fixes cockroachdb#69327 Fixes cockroachdb#69665 [1] https://www.postgresql.org/docs/current/typeconv.html [2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST [3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES Release note (sql change): Implicit casts performed during INSERT statements now more closely follow Postgres's behavior. Several minor bugs related to these types of casts have been fixed.
70722: sql: add assignment casts for INSERTs r=mgartner a=mgartner #### sql: rename pkg/sql/sem/tree/casts.go to cast.go Release note: None #### tree: add OID->OID cast map This commit adds a new map that describes valid casts from OID to OID. It introduces three cast contexts: explicit casts, assignment casts, and implicit casts. See the comments for CastContext and castMap for details. This map will enable us to properly follow Postgres's casting behavior. Most immediately, it will allow us to support assignment casts. Future work includes moving volatility in castMap. In the longer term, cast functions can be moved into castMap as well. Release note: None #### sql: set "char" type width to 1 The `"char"` type is a special single-character type. This commit adds a `types.QChar` with a width one. It removes the `types.MakeQChar` function so that it is impossible to create `"char"` types with any other width. Release note: None #### sql: add assignment casts for INSERTs Casts in Postgres are performed in one of three contexts [1]: 1. An explicit context with `CAST(x AS T)` or `x::T`. 2. An assignment context performed implicitly during an INSERT, UPSERT, or UPDATE. 3. An implicit context during the evaluation of an expression. For example the DATE in `'2021-01-02'::DATE < now()` will be implicitly cast to a TIMESTAMPTZ so that the values can be compared. Not all casts can be performed in all contexts. Postgres's pg_cast table lists valid casts and specifies the maximum cast context in which each can be performed. A cast with a max context of explicit can only be performed in an explicit context. A cast with an assignment max context can be performed in an explicit context or an assignment context. A cast with an implicit max context can be performed in all contexts. Much to my personal disappointment and frustration, there are valid casts that are not listed in Postgres's pg_cast table. These casts are called "automatic I/O conversions" and they allow casting most types to and from the string types: TEXT, VARCHAR, CHAR, NAME, and "char" [2]. We cannot determine these casts' maximum cast context from the pg_cast table, so we rely on the documentation which states that conversions to string types are assignment casts and conversions from string types are explicit casts [3]. -- This commit implements assignment casts for INSERTs. Follow up work will implement assignment casts for UPSERTs and UPDATEs. A cast performed in an assignment context has slightly different behavior than the same cast performed in an explicit context. In an assignment context, the cast will error if the width of the value is too large for the given type. In an explicit context, the value will be truncated to match the width. The one exception is assignment casts to the special "char" type which do truncate values. To support different cast behaviors for different contexts, a new built-in, `crdb_internal.assignment_cast` has been introduced. This function takes two arguments: a value and a type. Because SQL does not have first-class types, a type cannot be passed directly to the built-in. Instead, a `NULL` cast to a type is used as a workaround, similar to the `json_populate_record` built-in. For example, an integer can be assignment-cast to a string with: crdb_internal.assignment_cast(1::INT, NULL::STRING) The optimizer is responsible for wrapping INSERT columns with the assignment cast built-in function. If an insert column type `T1` is not identical to the table's corresponding target column type `T2`, the optimizer will check if there is a valid cast from `T1` to `T2` with a maximum context that allows an assignment cast. If there is a such a cast, a projection will wrap the column in the assignment cast built-in function. If there is no such cast, a user error will be produced. Some changes to prepared statement placeholder type inference were required in order to better match Postgres's behavior (this is a best-effort match thus far and there are still inconsistencies). Most notably, widths and precision are no longer inferred for the types of placeholders. The effect of this is that assignment casts will be correctly added by the optimizer in order to make sure that values for placeholders are correctly coerced to the target column type during execution of a prepared insert. The introduction of assignment casts fixes minor bugs and addresses some inconsistencies with Postgres's behavior. In general, INSERTS now successfully cast values to target table column types in more cases. As one example, inserting a string into an integer column now succeeds: CREATE TABLE t (i INT) INSERT INTO t VALUES ('1'::STRING) Prior to this commit there was logic that mimicked assignment casts, but it was not correct. Bugs in the implementation caused incorrect behavior when inserting into tables with computed columns. Most notably, a computed column expression that referenced another column `c` was evaluated with the value of `c` before the assignment cast was performed. This resulted in incorrect values for computed columns in some cases. In addition, assignment casts make the special logic for rounding decimal values in optbuilder obsolete. The builtin function `crdb_internal.round_decimal_values` and related logic in optbuilder will be removed once assignment casts are implemented for UPSERTs and UPDATEs. Fixes #69327 Fixes #69665 [1] https://www.postgresql.org/docs/current/typeconv.html [2] https://www.postgresql.org/docs/13/catalog-pg-cast.html#CATALOG-PG-CAST [3] https://www.postgresql.org/docs/13/sql-createcast.html#SQL-CREATECAST-NOTES Release note (sql change): Implicit casts performed during INSERT statements now more closely follow Postgres's behavior. Several minor bugs related to these types of casts have been fixed. 70950: instancestorage: Add SQL test for sql_instances. r=knz a=rimadeodhar This PR adds a unit test to verify that the sql_instances table can be accessed through SQL API. Release note: None 71412: sqlproxyccl: rework sqlproxy connection throttler r=JeffSwenson a=JeffSwenson This change switches the sqlproxy connection throttling logic back to exponential backoff. The tokenbucket approach was introduced by PR #69041. There are a few behavior differences between this and the original exponential backoff implementation. 1. The throttling logic is maintained per (ip, tenant) instead of per (ip). Some platform as a service provides share a single outbound ip address between multiple clients. These users would occasionaly see throttling caused by a second user sharing their IP. 2. The throttling logic was triggered before there was an authentication failure. It takes ~100ms-1000ms to authenticate with the tenant process. Any requests that arrived after the first request, but before it was processed, would trigger the throttle. Now, we only trigger the throttle in response to an explict authorization error. Release note: None Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: rimadeodhar <[email protected]> Co-authored-by: Jeff <[email protected]>
TLP found a correctness bug (originally reported here) that I've reduced below. I think the bug is related to the internal types
"char"
andNAME
and virtual computed columns and indices. I wasn't able to reproduce with other string-like types or a stored computed column.Postgres docs for
"char"
andNAME
datatypes: https://www.postgresql.org/docs/9.5/datatype-character.html#DATATYPE-CHARACTER-SPECIAL-TABLEThe text was updated successfully, but these errors were encountered: