Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: mark timezone(string, string) as VolatilityStable #48756

Merged

Conversation

mgartner
Copy link
Collaborator

The timezone(string, string) function overload was incorrectly marked
as VolatilityImmutable. When the second argument a special string,
like 'now', 'today', 'tomorrow', or 'yesterday', the function is
non-immutable because those strings are cast to timestamps which are
dependent on the current clock time.

This commit correctly marks it as VolatilityStable.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@maddyblue maddyblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should some of the overloads marked volatile be bumped up to stable?

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it should be immutable. the timezone function should not be affected by any external factors.

also subverts postgres expectations:


otan=# select proname, proargtypes, provolatile from pg_proc where proname = 'timezone';
 proname  | proargtypes | provolatile
----------+-------------+-------------
 timezone | 1186 1184   | i
 timezone | 25 1184     | i
 timezone | 25 1266     | v -- this seems incorrect.
 timezone | 1186 1266   | i
 timezone | 25 1114     | i
 timezone | 1186 1114   | i
(6 rows)

otan=# select * from typname where oid=1266;
ERROR:  relation "typname" does not exist
LINE 1: select * from typname where oid=1266;
                      ^
otan=# select * from pg_type where oid=1266;
otan=# select typname from pg_type where oid=1266;
 typname
---------
 timetz
(1 row)

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

huh, my mistake - looks like we accept arguments like 'now', which i missed in the PR description. maybe postgres has this wrong. we should probably bump everything up and add a whitelist for allowed postgres differences in the builtins_test which ensure volatility matches postgres..

@RaduBerinde
Copy link
Member

@otan the overload we are talking about takes two string arguments. Based on your query above, Postgres doesn't have that.

I am not 100% sure why we have it but I am guessing that we need this overload because we don't do the implicit casts that postgres does. It's the string->timestamp cast itself that is non-immutable.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @RaduBerinde)


pkg/sql/sem/builtins/builtins.go, line 2419 at r1 (raw file):

			},
			Info:       "Convert given time stamp with time zone to the new time zone, with no time zone designation.",
			Volatility: tree.VolatilityStable,

It's worth adding a comment - mention that parsing a string timestamp is not immutable in general because of things like 'now'.

@mgartner
Copy link
Collaborator Author

mgartner commented May 13, 2020

Should some of the overloads marked volatile be bumped up to stable?

That's a great question. I'm not sure, so I went down a rabbit hole...

Interestingly, we define both a TIME and TIMETZ overload, while Postgres only defines one for TIMETZ that is marked as volatile.

timezone=# \df+ timezone
                                                                                                             List of functions
   Schema   |   Name   |      Result data type       |          Argument data types          | Type | Volatility | Parallel | Owner  | Security | Access privileges | Language |    Source code    |              Description
------------+----------+-----------------------------+---------------------------------------+------+------------+----------+--------+----------+-------------------+----------+-------------------+----------------------------------------
 pg_catalog | timezone | time with time zone         | interval, time with time zone         | func | immutable  | safe     | marcus | invoker  |                   | internal | timetz_izone      | adjust time with time zone to new zone
 pg_catalog | timezone | timestamp without time zone | interval, timestamp with time zone    | func | immutable  | safe     | marcus | invoker  |                   | internal | timestamptz_izone | adjust timestamp to new time zone
 pg_catalog | timezone | timestamp with time zone    | interval, timestamp without time zone | func | immutable  | safe     | marcus | invoker  |                   | internal | timestamp_izone   | adjust timestamp to new time zone
 pg_catalog | timezone | time with time zone         | text, time with time zone             | func | volatile   | safe     | marcus | invoker  |                   | internal | timetz_zone       | adjust time with time zone to new zone
 pg_catalog | timezone | timestamp without time zone | text, timestamp with time zone        | func | immutable  | safe     | marcus | invoker  |                   | internal | timestamptz_zone  | adjust timestamp to new time zone
 pg_catalog | timezone | timestamp with time zone    | text, timestamp without time zone     | func | immutable  | safe     | marcus | invoker  |                   | internal | timestamp_zone    | adjust timestamp to new time zone
(6 rows)

I searched the PG source code to determine why timetz_zone is marked as volatile and not stable. The only thing I found was this commit.

Given its current definition that depends on time(NULL), timetz_zone is certainly no longer immutable, but must indeed be marked volatile. I wonder if it should use the value of now() (that is, transaction start time) so that it could be marked stable. But it's probably not important enough to be worth changing the code for ... indeed, I'm not even going to force an initdb for this catalog change, seeing that we just did one a few hours ago.

It looks like it's marked as volatile because it's dependent on time(NULL), rather than the transaction start time, and therefore (I'm assuming) the return value can change during query execution.

I'm not sure if this distinction applies in the context of CockroachDB. In our case, I don't think now() is even equal to the timestamp at the start of the transaction. So I'm thinking that it's fine to mark these TIME and TIMETZ overloads as stable.

But I'd like input from someone who knows more. @knz I've seen your name come up quite a bit with regard to some of the time-related functions. Do you have any input here?

@mgartner
Copy link
Collaborator Author

@otan I thought it was also incorrect that it's marked volatile in Postgres (and was excited to have found a bug), but it seems corrected, based on the commit and code I found above.

@otan
Copy link
Contributor

otan commented May 13, 2020

@otan the overload we are talking about takes two string arguments. Based on your query above, Postgres doesn't have that.

postgres doesn't have a (string, string) version because of its type resolution system. as the preferred type for the time type category is TIMESTAMPTZ, it resolves the overload (unknown, unknown) as (timestamptz, string). See https://docs.google.com/document/d/1DUUjKtFhnXDbHTmetl5qD2t9AbIi0gOu3fM_E9LwcDo/edit for the type resolution algorithm in more detail.

the reason we have (string, string) is because we don't have preferred types / type categories, and i had to insert that in (it would be ambiguous, and that breaks some usages of AT TIME ZONE). (string, string) should technically be deleted from CRDB when we have the same resolution system.

if we change (string, string) to stable / volatile, we should do the same thing for timestamptz, string and the others.

tl;dr in CRDB (string, string) masquerades as (timestamptz, string) and is there for compatibility with the psql type system. psql seems to define this as immutable - i'm not sure that's right. if we're going down this path, we should change all of them and fix the test to whitelist these deviations.

@RaduBerinde
Copy link
Member

tl;dr in CRDB (string, string) masquerades as (timestamptz, string) and is there for compatibility with the psql type system. psql seems to define this as immutable

This is consistent with what I've been saying. The overloads that operate on timestamp inputs are immutable. They always give the same result for the same inputs. What is not immutable is the interpretation of a string as a timestamp. In postgres that has nothing to do with evaluating the timestamp function - by the time this function is evaluated, we already have a timestamp. That is why IMO our timestamp(string, string) variant (and only that) should be marked as "stable" - it is special, because it incorporates the parsing of the string.

@otan
Copy link
Contributor

otan commented May 13, 2020

edit: nvm i'm a bad reader :D carry on

Copy link
Contributor

@otan otan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for the back and forth, missed a few layers of confusingness in between (missed that the cast happens before the function is evaluated). LGTM

@maddyblue
Copy link
Contributor

It looks like it's marked as volatile because it's dependent on time(NULL), rather than the transaction start time, and therefore (I'm assuming) the return value can change during query execution.

I'm not sure if this distinction applies in the context of CockroachDB. In our case, I don't think now() is even equal to the timestamp at the start of the transaction. So I'm thinking that it's fine to mark these TIME and TIMETZ overloads as stable.

Interesting! Thanks for doing that research. I'm ok with whatever decision we make, but at least now it seems that we could safely mark them as stable, even if we choose not to for compat or other reasons. (And definitely not something that needs to be done in this PR.)

@mgartner mgartner force-pushed the fix-timezone-string-string-volatility branch from 5266f74 to 7ddfb99 Compare May 13, 2020 16:14
@mgartner
Copy link
Collaborator Author

Interesting! Thanks for doing that research. I'm ok with whatever decision we make, but at least now it seems that we could safely mark them as stable, even if we choose not to for compat or other reasons. (And definitely not something that needs to be done in this PR.)

For now I've left TODOs above each overload so we don't lose this train-of-thought. I'll try to confirm that it's safe to mark them as stable and follow up in a future PR.

Info: "Treat given time without time zone as located in the specified time zone.",
Info: "Treat given time without time zone as located in the specified time zone.",
// TODO(mgartner): This overload might stable, not volatile.
// See: https://github.com/cockroachdb/cockroach/pull/48756#issuecomment-627672686
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should fail

postgresVolatility, found := findOverloadVolatility(name, overload)
and you'll need to whitelist this difference from postgres.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(setting IgnoreVolatilityCheck: true should do it)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know. Thanks for the heads-up!

@mgartner mgartner force-pushed the fix-timezone-string-string-volatility branch from 7ddfb99 to a173b8a Compare May 13, 2020 16:42
The `timezone(string, string)` function overload was incorrectly marked
as `VolatilityImmutable`. When the second argument a special string,
like 'now', 'today', 'tomorrow', or 'yesterday', the function is
non-immutable because those strings are cast to timestamps which are
dependent on the current clock time.

This commit correctly marks it as `VolatilityStable`.

Release note: None
@mgartner mgartner force-pushed the fix-timezone-string-string-volatility branch from a173b8a to 77a5fca Compare May 13, 2020 17:41
@mgartner
Copy link
Collaborator Author

bors r+

@craig
Copy link
Contributor

craig bot commented May 13, 2020

Build failed (retrying...)

@craig
Copy link
Contributor

craig bot commented May 13, 2020

Build failed

@mgartner
Copy link
Collaborator Author

bors r+

@craig
Copy link
Contributor

craig bot commented May 14, 2020

Build succeeded

@craig craig bot merged commit 274c789 into cockroachdb:master May 14, 2020
@mgartner mgartner deleted the fix-timezone-string-string-volatility branch May 14, 2020 00:52
mgartner added a commit to mgartner/cockroach that referenced this pull request Jul 7, 2020
The `timezone(STRING, TIME)` and `timezone(STRING, TIMETZ)` function
overloads were originally marked as volatile to copy Postgres's
volatility for the `timezone(STRING, TIMETZ)` overload (Postgres does
not implement `timezone (STRING, TIME)`.

Postgres's volatility settings for all overloads of `timezone` are
below.

       Name   |          Argument data types          | Volatility
    ----------+---------------------------------------+------------------
     timezone | interval, time with time zone         | immutable   zone
     timezone | interval, timestamp with time zone    | immutable
     timezone | interval, timestamp without time zone | immutable
     timezone | text, time with time zone             | volatile    zone
     timezone | text, timestamp with time zone        | immutable
     timezone | text, timestamp without time zone     | immutable

This single overload is singled out as volatile by Postgres because it
is dependent on C's `time(NULL)`. See the Postgres commit originally
marking this overload volatility, and the lines of code showing the
dependence:

postgres/postgres@35979e6
https://github.com/postgres/postgres/blob/ac3a4866c0bf1d7a14009f18d3b42ffcb063a7e9/src/backend/utils/adt/date.c#L2816

CRDB does not have the same issue. All `timezone` overloads have the
same volatility, stable. This commit marks them as such.

Additional context: cockroachdb#48756 (comment)

Release note (<category, see below>): <what> <show> <why>
mgartner added a commit to mgartner/cockroach that referenced this pull request Jul 7, 2020
The `timezone(STRING, TIME)` and `timezone(STRING, TIMETZ)` function
overloads were originally marked as volatile to copy Postgres's
volatility for the `timezone(STRING, TIMETZ)` overload (Postgres does
not implement `timezone (STRING, TIME)`.

Postgres's volatility settings for all overloads of `timezone` are
below.

       Name   |          Argument data types          | Volatility
    ----------+---------------------------------------+------------------
     timezone | interval, time with time zone         | immutable   zone
     timezone | interval, timestamp with time zone    | immutable
     timezone | interval, timestamp without time zone | immutable
     timezone | text, time with time zone             | volatile    zone
     timezone | text, timestamp with time zone        | immutable
     timezone | text, timestamp without time zone     | immutable

This single overload is singled out as volatile by Postgres because it
is dependent on C's `time(NULL)`. See the Postgres commit originally
marking this overload volatility, and the lines of code showing the
dependence:

postgres/postgres@35979e6
https://github.com/postgres/postgres/blob/ac3a4866c0bf1d7a14009f18d3b42ffcb063a7e9/src/backend/utils/adt/date.c#L2816

CRDB does not have the same issue. All `timezone` overloads have the
same volatility, stable. This commit marks them as such.

Additional context: cockroachdb#48756 (comment)

This commit also moves the `IgnoreVolatilityCheck` field from function
definitions to overloads. This allows overloads of the same function to
be ignored when asserting that the volalitity of an overload matches
Postgres's assigned volatility.

Release note: None
craig bot pushed a commit that referenced this pull request Jul 7, 2020
50949: backupccl: handle sequence ownership remapping logic for restore r=pbardea a=arulajmani

Previously, we would restore sequence descriptors/table descriptors as
is, without remapping the ownership dependency to the new IDs. This
meant sequence ownership was broken post restore. This PR fixes that.

When sequences with owners or tables that have columns that own
sequences are restored:
- If both the owned sequence and the owning table are being restored,
the ownership is remapped.
- If just the sequence is restored, the user receives an error to use
the `skip_missing_sequence_owners` flag. If the flag is provided, the
sequence is restored without any owner.
- If just the table is restored, the user receives an error to use the
`skip_missing_sequence_owners` flag. If the flag is provided, the table
is restored with the table column that previously owned the sequence no
longer owning that sequence.

Fixes #50781

Release note (enterprise change): Restore now has a new option
`skip_missing_sequence_owners` that must be supplied when restoring
only the table/sequence that was previously
a sequence owner/owned by a table.
Additionally, this fixes a bug where ownership relationships would not
be remapped after a restore.

51005: build: fix package reported by acceptance and compose r=otan a=tbg

The acceptance and compose tests were previously reporting `emptypkg` as
their package, which would be used to create issues.

We need to pass the "real" package to go so that it shows up in the
JSON output.

Closes #51003.

Release note: None

51037: sql: change volatility of timezone with TIME and TIMETZ to stable r=mgartner a=mgartner

sql: change volatility of timezone with TIME and TIMETZ to stable

The `timezone(STRING, TIME)` and `timezone(STRING, TIMETZ)` function
overloads were originally marked as volatile to copy Postgres's
volatility for the `timezone(STRING, TIMETZ)` overload (Postgres does
not implement `timezone (STRING, TIME)`.

Postgres's volatility settings for all overloads of `timezone` are
below.

       Name   |          Argument data types          | Volatility
    ----------+---------------------------------------+------------------
     timezone | interval, time with time zone         | immutable   zone
     timezone | interval, timestamp with time zone    | immutable
     timezone | interval, timestamp without time zone | immutable
     timezone | text, time with time zone             | volatile    zone
     timezone | text, timestamp with time zone        | immutable
     timezone | text, timestamp without time zone     | immutable

This single overload is singled out as volatile by Postgres because it
is dependent on C's `time(NULL)`. See the Postgres commit originally
marking this overload volatility, and the lines of code showing the
dependence:

postgres/postgres@35979e6
https://github.com/postgres/postgres/blob/ac3a4866c0bf1d7a14009f18d3b42ffcb063a7e9/src/backend/utils/adt/date.c#L2816

CRDB does not have the same issue. All `timezone` overloads have the
same volatility, stable. This commit marks them as such.

Additional context: #48756 (comment)

This commit also moves the `IgnoreVolatilityCheck` field from function
definitions to overloads. This allows overloads of the same function to
be ignored when asserting that the volalitity of an overload matches
Postgres's assigned volatility.

Release note: None


51071: roachtest: fix acceptance/decommission r=darinpp a=darinpp

The added --self flag makes the test fail with 20.1 and 19.2

With 20.1 the flag is not supported and so the expected
result should reflect that the step where a node decommissions self
will not happen.
With 19.2 the match that --self is not supported doesn't work
as the error message capitalization differs.

The fix changes the expected result and corrects the match of --self not
supported to work with 19.2 as well

Release note: None

Co-authored-by: arulajmani <[email protected]>
Co-authored-by: Tobias Schottdorf <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
Co-authored-by: Darin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants