-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: selecting from TimeTZ index may miss values in other time zones #74912
Labels
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
S-3-erroneous-edge-case
Database produces or stores erroneous data without visible error/warning, in rare edge cases.
T-sql-queries
SQL Queries Team
Comments
rytaft
added
the
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
label
Jan 17, 2022
rytaft
added
S-0-visible-logical-error
Database stores inconsistent data in some cases, or queries return invalid results silently.
S-3-erroneous-edge-case
Database produces or stores erroneous data without visible error/warning, in rare edge cases.
and removed
S-0-visible-logical-error
Database stores inconsistent data in some cases, or queries return invalid results silently.
labels
Jan 17, 2022
rytaft
added a commit
to rytaft/cockroach
that referenced
this issue
Jan 17, 2022
Prior to this commit, the DTimeTZ functions Next() and Prev() could skip over valid values according to the ordering of DTimeTZ values in an index (which matches the ordering defined by the TimeTZ functions After() and Before()). This commit fixes these functions so that Next() now returns smallest valid DTimeTZ that is greater than the receiver, and Prev() returns the largest valid DTimeTZ that is less than the receiver. This is an important invariant that the optimizer relies on when building index constraints. Fixes cockroachdb#74912 Release note (bug fix): Fixed a bug that could occur when a TIMETZ column was indexed, and a query predicate constrained that column using a < or > operator with a timetz constant. If the column contained values with time zones that did not match the time zone of the timetz constant, it was possible that not all matching values could be returned by the query. Specifically, the results may not have included values within one microsecond of the predicate's absolute time. This bug was introduced when the timetz datatype was first added in 20.1. It exists on all versions of 20.1, 20.2, 21.1, and 21.2 prior to this patch.
craig bot
pushed a commit
that referenced
this issue
Jan 19, 2022
74863: import: check readability earlier r=benbardin a=benbardin Release note (sql change): Import now checks readability earlier for multiple files, to fail sooner if e.g. permissions are invalid. 74914: opt,tree: fix bugs with Next(), Prev(), and histogram calculation for DTimeTZ r=rytaft a=rytaft **sql/sem/tree: fix Next() and Prev() for DTimeTZ** Prior to this commit, the `DTimeTZ` functions `Next()` and `Prev()` could skip over valid values according to the ordering of `DTimeTZ` values in an index (which matches the ordering defined by the `TimeTZ` functions `After()` and `Before()`). This commit fixes these functions so that `Next()` now returns the smallest valid `DTimeTZ` that is greater than the receiver, and `Prev()` returns the largest valid `DTimeTZ` that is less than the receiver. This is an important invariant that the optimizer relies on when building index constraints. Fixes #74912 Release note (bug fix): Fixed a bug that could occur when a `TIMETZ` column was indexed, and a query predicate constrained that column using a `<` or `>` operator with a `timetz` constant. If the column contained values with time zones that did not match the time zone of the `timetz` constant, it was possible that not all matching values could be returned by the query. Specifically, the results may not have included values within one microsecond of the predicate's absolute time. This bug was introduced when the timetz datatype was first added in 20.1. It exists on all versions of 20.1, 20.2, 21.1, and 21.2 prior to this patch. **opt: fix bug in histogram calculation for TimeTZ** This commit fixes a bug in the histogram estimation code for `TimeTZ` that made the faulty assumption that `TimeTZ` values are ordered by `TimeOfDay`. This is incorrect since it does not take the `OffsetSecs` into account. As a result, it was possible to estimate that the size of a histogram bucket was negative, which caused problems in the statistics estimation code. This commit fixes the problem by taking into account both `TimeOfDay` and `OffsetSecs` when estimating the size of a bucket in a `TimeTZ` histogram. Fixes #74667 Release note (bug fix): Fixed an internal error, "estimated row count must be non-zero", that could occur during planning for queries over a table with a `TimeTZ` column. This error was due to a faulty assumption in the statistics estimation code about ordering of `TimeTZ` values, which has now been fixed. The error could occur when `TimeTZ` values used in the query had a different time zone offset than the `TimeTZ` values stored in the table. 75112: sql: fix casts between REG* types r=mgartner a=mgartner The newly introduced `castMap` does not contain entries for casts between all combinations of REG* types, which is consistent with Postgres, but inconsistent with behavior in versions up to 21.2 where these casts are allowed. The `castMap` changes result in more than just backward incompatibility. We allow branches of CASE statements to be equivalent types (i.e., types in the same family), like `REGCLASS` and `REGTYPE`, and we automatically add casts to a query plan to support this. However, because these casts don't exist in the `castMap`, internal errors are raised when we try to fetch the volatility of the cast while building logical properties. According to Postgres's type conversion rules for CASE, we should only allow branches to be different types if they can be implicitly cast to the first non-NULL branch. Implicit casts between REG* types are not allowed, so CASE expressions with branches of different REG* types should result in a user error like `CASE/WHEN could not convert type regclass to regtype`. However, this is a much larger project and the change will not be fully backward compatible. This work is tracked by issue #75103. For now, this commit adds casts between REG* types to the `castMap` to maintain backward compatibility and prevent an internal error. There is no release note because this bug does not exist in any releases. Fixes #74784 Release note: None 75119: sql: deflake TestPerfLogging r=rytaft a=rytaft This commit deflakes `TestPerfLogging` by ensuring that test cases that should not produce log entries do not match with unrelated log entries and thus cause the test to fail. This is ensured by making the regex more precise for the specific test case. Fixes #74811 Release note: None 75146: backupccl: "skip" TestChangefeedRestartDuringBackfill.. r=irfansharif a=irfansharif under span configs. This test flakes pretty reliably after span configs were enabled (#73876). Investigating this further is being tracked in \#75080; lets have this test use the old subsystem for now (only down in KV; we've narrowed down the failure to having something to do with concurrent range splits, within the tenant keyspace, while a changefeed is declared). Release note: None Co-authored-by: Ben Bardin <[email protected]> Co-authored-by: Rebecca Taft <[email protected]> Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: irfan sharif <[email protected]>
rytaft
added a commit
to rytaft/cockroach
that referenced
this issue
Jan 19, 2022
Prior to this commit, the DTimeTZ functions Next() and Prev() could skip over valid values according to the ordering of DTimeTZ values in an index (which matches the ordering defined by the TimeTZ functions After() and Before()). This commit fixes these functions so that Next() now returns smallest valid DTimeTZ that is greater than the receiver, and Prev() returns the largest valid DTimeTZ that is less than the receiver. This is an important invariant that the optimizer relies on when building index constraints. Fixes cockroachdb#74912 Release note (bug fix): Fixed a bug that could occur when a TIMETZ column was indexed, and a query predicate constrained that column using a < or > operator with a timetz constant. If the column contained values with time zones that did not match the time zone of the timetz constant, it was possible that not all matching values could be returned by the query. Specifically, the results may not have included values within one microsecond of the predicate's absolute time. This bug was introduced when the timetz datatype was first added in 20.1. It exists on all versions of 20.1, 20.2, 21.1, and 21.2 prior to this patch.
rytaft
added a commit
to rytaft/cockroach
that referenced
this issue
Jan 19, 2022
Prior to this commit, the DTimeTZ functions Next() and Prev() could skip over valid values according to the ordering of DTimeTZ values in an index (which matches the ordering defined by the TimeTZ functions After() and Before()). This commit fixes these functions so that Next() now returns smallest valid DTimeTZ that is greater than the receiver, and Prev() returns the largest valid DTimeTZ that is less than the receiver. This is an important invariant that the optimizer relies on when building index constraints. Fixes cockroachdb#74912 Release note (bug fix): Fixed a bug that could occur when a TIMETZ column was indexed, and a query predicate constrained that column using a < or > operator with a timetz constant. If the column contained values with time zones that did not match the time zone of the timetz constant, it was possible that not all matching values could be returned by the query. Specifically, the results may not have included values within one microsecond of the predicate's absolute time. This bug was introduced when the timetz datatype was first added in 20.1. It exists on all versions of 20.1, 20.2, 21.1, and 21.2 prior to this patch.
rytaft
added a commit
to rytaft/cockroach
that referenced
this issue
Jan 20, 2022
Prior to this commit, the DTimeTZ functions Next() and Prev() could skip over valid values according to the ordering of DTimeTZ values in an index (which matches the ordering defined by the TimeTZ functions After() and Before()). This commit fixes these functions so that Next() now returns smallest valid DTimeTZ that is greater than the receiver, and Prev() returns the largest valid DTimeTZ that is less than the receiver. This is an important invariant that the optimizer relies on when building index constraints. Fixes cockroachdb#74912 Release note (bug fix): Fixed a bug that could occur when a TIMETZ column was indexed, and a query predicate constrained that column using a < or > operator with a timetz constant. If the column contained values with time zones that did not match the time zone of the timetz constant, it was possible that not all matching values could be returned by the query. Specifically, the results may not have included values within one microsecond of the predicate's absolute time. This bug was introduced when the timetz datatype was first added in 20.1. It exists on all versions of 20.1, 20.2, 21.1, and 21.2 prior to this patch.
gtr
pushed a commit
to gtr/cockroach
that referenced
this issue
Jan 24, 2022
Prior to this commit, the DTimeTZ functions Next() and Prev() could skip over valid values according to the ordering of DTimeTZ values in an index (which matches the ordering defined by the TimeTZ functions After() and Before()). This commit fixes these functions so that Next() now returns smallest valid DTimeTZ that is greater than the receiver, and Prev() returns the largest valid DTimeTZ that is less than the receiver. This is an important invariant that the optimizer relies on when building index constraints. Fixes cockroachdb#74912 Release note (bug fix): Fixed a bug that could occur when a TIMETZ column was indexed, and a query predicate constrained that column using a < or > operator with a timetz constant. If the column contained values with time zones that did not match the time zone of the timetz constant, it was possible that not all matching values could be returned by the query. Specifically, the results may not have included values within one microsecond of the predicate's absolute time. This bug was introduced when the timetz datatype was first added in 20.1. It exists on all versions of 20.1, 20.2, 21.1, and 21.2 prior to this patch.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
S-3-erroneous-edge-case
Database produces or stores erroneous data without visible error/warning, in rare edge cases.
T-sql-queries
SQL Queries Team
Consider the following example. We have two tables with identical data in a timetz column. The only difference is that one table has the timetz column as the primary key (and therefore the column is indexed).
The queries should return the same results on the two different tables, but they are different. This is because the queries over
tab2
are performing a constrained scan of the primary index using an incorrect constraint that misses some values. The reason the constraint is incorrect is because the index constraint code in the optimizer may calldatum.Next()
ordatum.Prev()
to create an inclusive constraint, and theDTimeTZ
implementations ofNext()
andPrev()
are incorrect. These functions return a value that is not the next (or previous) possible value according to the ordering of the key encoded values of TimeTZ.This bug has existed since the timetz datatype was first added in 20.1.
The text was updated successfully, but these errors were encountered: