-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: top-level relational expression cannot have outer columns #58438
Comments
This was me! It's easy to reproduce:
|
I originally thought that this was a bug in ConvertZipArraysToValues but I think optbuilder is incorrectly building the project-set expression:
The project-set shouldn't have |
Using generate_subscripts as a foil (ie a srf where this syntax seems to work) I tracked this down to project.constructProject, scalar is non-nil so we call b.factory.ConstructProjectionsItem(scalar, id) and that's where things go sideways. If I force it down the passthrough path it seems to work. Working backwards from there I tracked things to Postgresql does this for this query:
So the unnest and array cancel each other out it seems. So I think the project-set columns are correct I just think we're lacking some machinery to link them up to what comes out of the unnest. I'm not familiar enough with the code to know where we are going astray but definitely seems like srf.finishBuildGeneratorFunction isn't tying things together appropriately for non-single column inputs to unnest. |
The following diff fixes the problem and makes our behavior match postgres but I'm not sure about it:
Actually I'm sure its wrong because it breaks a ton of srfs logictests. |
I'm coming back to the original idea that the problem is with
If I comment out the panic that causes the error and run
Notice that |
I'm the one who wrote that rule, so I feel a bit of responsibility for helping out. Removing the rule entirely and then executing this query:
Gives this error: |
Hm, it seems like the optbuilder code thinks the array elements are integers, when they are really tuples. |
Good find, @DrewKimball -- so maybe @mgartner's original thought that the |
I think this goes all the way down to the parser, fun. |
Or maybe not |
Just adding what @rytaft and I talked about on slack, ConstructValuesFromZip is hard coded to assume 1 col in the zip So one fix is to disable the rule like so:
I still need to digest Drew's comments... |
I think the unnest and json functions always return one row, but checking that might be a good idea anyway |
*Column, not row |
Actually, scratch what I said earlier. The optbuilder is getting the correct element type, I forgot to remove an old query from my test :/ |
@cucaroach you picked a pretty thorny 'quick win'... I'm not really sure how you'd go about fixing this. |
No kidding! I think the confusion may be that unnest has different semantics when it appears in the select clause vs the from clause. These examples from the postgres docs are helpful:
Vs
I'm thinking we need to treat these two cases differently? Actually I take that back ... I think unnest always returns a scalar but when in the from clause, the args to unnest materialize as columns, more examples:
|
I think the difference there is because there are two arrays in the unnest vs an array with arrays inside it.
I wonder if this is intentional or not? |
Here's an example with the nested arrays in the from clause:
|
Apparently postgres will fully unpack nested arrays but not tuples.
|
I have to end this lovely diversion for now but I pushed what I think are good but incomplete fixes to unnest-panic-bug branch. I think a good next step would be to audit the relevant tests in srfs to see which do and don't agree with postgres. Feel free to take this bug and run with it @DrewKimball! Otherwise I'll return to it on Friday. |
I think the bug that lies at the root of all this is that we represent a function that returns multiple columns the same as a function that returns a single tuple column with the same internal types; e.g. the return type for a multiple-column function is a tuple with labels. This causes the cockroach/pkg/sql/opt/optbuilder/srfs.go Lines 152 to 157 in 9e44f56
I don't think the current way we handle types is powerful enough to describe multiple columns (hence why we use tuples - we really want composite types instead). We might have to solve that properly for UDFs anyway, but in the meantime I think the best way to solve it might be to change the signature of
|
…85329 84975: storage: add `MVCCRangeKeyStack` for range keys r=nicktrav,jbowens a=erikgrinaker **storage: add `MVCCRangeKeyStack` for range keys** This patch adds `MVCCRangeKeyStack` and `MVCCRangeKeyVersion`, a new range key representation that will be returned by `SimpleMVCCIterator`. It is more compact, for efficiency, and comes with a set of convenience methods to simplify common range key processing. Resolves #83895. Release note: None **storage: return `MVCCRangeKeyStack` from `SimpleMVCCIterator`** This patch changes `SimpleMVCCIterator.RangeKeys()` to return `MVCCRangeKeyStack` instead of `[]MVCCRangeKeyValue`. Callers have not been migrated to properly make use of this -- instead, they call `AsRangeKeyValues()` and construct and use the old data structure. The MVCC range tombstones tech note is also updated to reflect this. Release note: None **storage: migrate MVCC code to `MVCCRangeKeyStack`** Release note: None ***: migrate higher-level code to `MVCCRangeKeyStack`** Release note: None **kvserver/gc: partially migrate to `MVCCRangeKeyStack`** Some parts require invasive changes to MVCC stats helpers. These will shortly be consolidated with other MVCC stats logic elsewhere, so the existing logic is retained for now by using `AsRangeKeyValues()`. Release note: None **storage: remove `FirstRangeKeyAbove()` and `HasRangeKeyBetween()`** Release note: None 85017: Revert "sql: Add database ID to sampled query log" r=THardy98 a=THardy98 Reverts: #84195 This reverts commit 307817e. Removes the DatabaseID field from the `SampledQuery` telemetry log due to the potential of indefinite blocking in the case of a lease acquisition failure. Protobuf field not reserved as no official build was released with these changes yet. Release note (sql change): Removes the DatabaseID field from the `SampledQuery` telemetry log due to the potential of indefinite blocking in the case of a lease acquisition failure. 85024: cloud/gcp: add custom retryer for gcs storage, retry on stream INTERNAL_ERROR r=rhu713 a=rhu713 Currently, errors like `stream error: stream ID <x>; INTERNAL_ERROR; received from peer` are not being retried. Create a custom retryer to retry these errors as suggested by: googleapis/google-cloud-go#3735 googleapis/google-cloud-go#784 Fixes: #85217, #85216, #85204, #84162 Release note: None 85069: optbuilder: handle unnest returning a tuple r=DrewKimball a=DrewKimball Currently, the return types of SRFs that return multiple columns are represented as tuples with labels. The tuple labels are used to decide whether or not to create a single output column for the SRF, or multiple. The `unnest` function can return a single column if it has a single argument, and the type of that column can be a tuple with labels. This could cause the old logic to mistakenly create multiple output columns for `unnest`, which could lead to panics down the line and incorrect behavior otherwise. This commit adds a special case for `unnest` in the `optbuilder` to only expand tuple return types if there is more than one argument (implying more than one output column). Other SRFs do not have the same problem because they either always return the same number of columns, cannot return tuples, or both. Fixes #58438 Release note (bug fix): Fixed a bug existing since release 20.1 that could cause a panic in rare cases when the unnest function was used with a tuple return type. 85100: opt: perf improvements for large queries r=DrewKimball a=DrewKimball **opt: add bench test for slow queries** This commit adds two slow-planning queries pulled from #64793 to be used in benchmarking the optimizer. In addition, the `ReorderJoinsLimit` has been set to the default 8 for benchmarking tests. **opt: add struct for tracking column equivalence sets** Previously, the `JoinOrderBuilder` would construct a `FuncDepSet` from scratch on each call to `addJoins` in order to eliminate redundant join filters. This led to unnecessary large allocations because `addJoins` is called an exponential number of times in query size. This commit adds a struct `EquivSet` that efficiently stores equivalence relations as `ColSets` in a slice. Rather than being constructed on each call to `addJoins`, a `Reset` method is called that maintains slice memory. In the future, `EquivSet` can be used to handle equivalencies within `FuncDepSet` structs as well. This well avoid a significant number of allocations in cases with many equivalent columns, as outlined in #83963. **opt: avoid usage of FastIntMap in optimizer hot paths** Previously, `computeHashJoinCost` would use a `FastIntMap` to represent join equality filters to pass to `computeFiltersCost`. In addition, `GenerateMergeJoins` used a `FastIntMap` to look up columns among its join equality columns. This lead to unnecessary allocations since column IDs are often large enough to exceed the small field of `FastIntMap`. This commit modifies `computeFiltersCost` to take an anonymous function that is used to decide whether to skip an equality condition, removing the need for a mapping between columns. This commit also refactors `GenerateMergeJoins` to simply perform a linear scan of its equality columns; this avoids the allocation issue, and should be fast in practice because the number of equalities will not generally be large. Release note: None 85146: [backupccl] Use Expr for backup's Detached and Revision History options r=benbardin a=benbardin This will allow us to set them to null, which will be helpful for ALTER commands. Release note: None 85234: dev: add rewritable paths for norm tests r=mgartner a=mgartner Tests in `pkg/sql/opt/norm` are similar to tests in `pkg/sql/opt/xform` and `pkg/sql/opt/memo` in that they rely on fixtures in `pkg/sql/opt/testutils/opttester/testfixtures`. This commit adds these fixtures as rewritable paths for norm tests so that `./dev test pkg/sql/opt/xform --rewrite` does not fail with errors like: open pkg/sql/opt/testutils/opttester/testfixtures/tpcc_schema: operation not permitted Release note: None 85325: sql: fix explain gist output to show number of scan span constraints r=cucaroach a=cucaroach If there were span constraints we would always print 1, need to actually append them to get the count right. Fixes: #85324 Release note: None 85327: sql: fix udf logic test r=chengxiong-ruan a=chengxiong-ruan Fixes: #85303 Release note: None 85329: colexec: fix recent concat fix r=yuzefovich a=yuzefovich The recent fix of the Concat operator in the vectorized engine doesn't handle the array concatenation correctly and this is now fixed. Fixes: #85295. Release note: None Co-authored-by: Erik Grinaker <[email protected]> Co-authored-by: Thomas Hardy <[email protected]> Co-authored-by: Rui Hu <[email protected]> Co-authored-by: DrewKimball <[email protected]> Co-authored-by: Andrew Kimball <[email protected]> Co-authored-by: Ben Bardin <[email protected]> Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Tommy Reilly <[email protected]> Co-authored-by: Chengxiong Ruan <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]>
Currently, the return types of SRFs that return multiple columns are represented as tuples with labels. The tuple labels are used to decide whether or not to create a single output column for the SRF, or multiple. The `unnest` function can return a single column if it has a single argument, and the type of that column can be a tuple with labels. This could cause the old logic to mistakenly create multiple output columns for `unnest`, which could lead to panics down the line and incorrect behavior otherwise. This commit adds a special case for `unnest` in the `optbuilder` to only expand tuple return types if there is more than one argument (implying more than one output column). Other SRFs do not have the same problem because they either always return the same number of columns, cannot return tuples, or both. Fixes #58438 Release note (bug fix): Fixed a bug existing since release 20.1 that could cause a panic in rare cases when the unnest function was used with a tuple return type.
Currently, the return types of SRFs that return multiple columns are represented as tuples with labels. The tuple labels are used to decide whether or not to create a single output column for the SRF, or multiple. The `unnest` function can return a single column if it has a single argument, and the type of that column can be a tuple with labels. This could cause the old logic to mistakenly create multiple output columns for `unnest`, which could lead to panics down the line and incorrect behavior otherwise. This commit adds a special case for `unnest` in the `optbuilder` to only expand tuple return types if there is more than one argument (implying more than one output column). Other SRFs do not have the same problem because they either always return the same number of columns, cannot return tuples, or both. Fixes #58438 Release note (bug fix): Fixed a bug existing since release 20.1 that could cause a panic in rare cases when the unnest function was used with a tuple return type.
This issue was autofiled by Sentry. It represents a crash or reported error on a live cluster with telemetry enabled.
Sentry link: https://sentry.io/organizations/cockroach-labs/issues/2131023460/?referrer=webhooks_plugin
Panic message:
Stacktrace (expand for inline code snippets):
cockroach/pkg/sql/opt/xform/optimizer.go
Lines 233 to 235 in 92d9495
cockroach/pkg/sql/plan_opt.go
Lines 515 to 517 in 92d9495
cockroach/pkg/sql/plan_opt.go
Lines 194 to 196 in 92d9495
cockroach/pkg/sql/conn_executor_exec.go
Lines 900 to 902 in 92d9495
cockroach/pkg/sql/conn_executor_exec.go
Lines 779 to 781 in 92d9495
cockroach/pkg/sql/conn_executor_exec.go
Lines 638 to 640 in 92d9495
cockroach/pkg/sql/conn_executor_exec.go
Lines 113 to 115 in 92d9495
cockroach/pkg/sql/conn_executor.go
Lines 1464 to 1466 in 92d9495
cockroach/pkg/sql/conn_executor.go
Lines 1466 to 1468 in 92d9495
cockroach/pkg/sql/conn_executor.go
Lines 1390 to 1392 in 92d9495
cockroach/pkg/sql/conn_executor.go
Lines 507 to 509 in 92d9495
cockroach/pkg/sql/pgwire/conn.go
Lines 625 to 627 in 92d9495
/usr/local/go/src/runtime/asm_amd64.s#L1356-L1358 in runtime.goexit
v20.2.2
Jira issue: CRDB-3393
The text was updated successfully, but these errors were encountered: