Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: internal error: apply join found an unexpected re-optimized right-hand-side with non-zero subqueries #35594

Closed
maddyblue opened this issue Mar 11, 2019 · 9 comments
Assignees
Labels
A-sql-optimizer SQL logical planning and optimizations. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-sqlsmith

Comments

@maddyblue
Copy link
Contributor

Setup:

CREATE TABLE t (i INT8, f FLOAT8, d DECIMAL, s STRING, b BYTES, z BOOL);
INSERT INTO t VALUES (1, 2, 3, 's', 'b', true);

SQL:

INSERT
INTO
    defaultdb.public.t AS tab_261 (f, d, z)
SELECT
    (
        SELECT
            1.070774147135095:::FLOAT8 AS col_475
        FROM
            defaultdb.public.t AS tab_263
            JOIN defaultdb.public.t AS tab_264
            ON
                '71.95.110.189/7':::INET
                << (
                        inet_client_addr()
                        & tab_262.i
                            + hostmask(
                                    (
                                        SELECT
                                            '14.246.95.30/18':::INET
                                                AS col_474
                                        FROM
                                            defaultdb.public.t AS tab_265
                                            LEFT JOIN
                                                defaultdb.public.t
                                                    AS tab_266
                                            ON
                                                NULL
                                                >> (
                                                        inet_client_addr()
                                                        + mod(
                                                                tab_262.i,
                                                                721531314898956833:::INT8
                                                            )
                                                        & NULL
                                                        & set_masklen(
                                                                inetrecv(
                                                                    tab_262.d
                                                                )
                                                                - CASE
                                                                    WHEN CASE
                                                                    WHEN tab_262.z
                                                                    THEN (
                                                                        CASE
                                                                        WHEN tab_262.z
                                                                        THEN NULL
                                                                        ELSE '73e2:fa5c:af28:f422:3064:9f92:7ea1:f753/19':::INET
                                                                        END
                                                                        >> CASE
                                                                            WHEN true
                                                                            THEN hostmask(
                                                                                COALESCE(
                                                                                    inetrecv(
                                                                                        3533986633:::OID
                                                                                    ),
                                                                                    NULL
                                                                                )
                                                                            )
                                                                            ELSE CASE
                                                                            WHEN tab_262.z
                                                                            THEN (
                                                                                NULL
                                                                                * 4694440156467956886:::INT8
                                                                                + CASE
                                                                                    WHEN has_server_privilege(
                                                                                        e'\'dY\x1c\x13F':::STRING,
                                                                                        '':::STRING,
                                                                                        'Z':::STRING
                                                                                    )
                                                                                    THEN 'aa0f:81ad:dd37:9c23:af6a:1fcb:1291:84a4/63':::INET
                                                                                    ELSE NULL
                                                                                    END
                                                                            )
                                                                            ELSE '63.159.112.159/9':::INET
                                                                            END
                                                                            END
                                                                    )
                                                                    ELSE true
                                                                    END
                                                                    THEN 3505075415403534432:::INT8
                                                                    ELSE (-225433402022489121):::INT8
                                                                    END,
                                                                (-3403353291897039391):::INT8
                                                            )
                                                    )
                                        ORDER BY
                                            tab_265.f ASC,
                                            tab_265.d DESC,
                                            tab_266.z DESC,
                                            tab_265.d ASC
                                        LIMIT
                                            1:::INT8
                                    )
                                )
                    )
                    - (-9007439590206391659):::INT8
                    + 4658748795740290371:::INT8
        WHERE
            true
        LIMIT
            1:::INT8
    )
        AS col_476,
    (-68269750.0197999559):::DECIMAL AS col_477,
    false AS col_478
FROM
    defaultdb.public.t AS tab_262;

Error:

pq: internal error: apply join found an unexpected re-optimized right-hand-side with non-zero subqueries:
project
 ├── columns: col_475:44(float!null)
 ├── cardinality: [0 - 1]
 ├── stats: [rows=0.111111111]
 ├── cost: 4.56944444
 ├── key: ()
 ├── fd: ()-->(44)
 ├── prune: (44)
 ├── limit
 │    ├── cardinality: [0 - 1]
 │    ├── stats: [rows=0.111111111]
 │    ├── cost: 4.55722222
 │    ├── key: ()
 │    ├── inner-join
 │    │    ├── stats: [rows=0.111111111]
 │    │    ├── cost: 4.54611111
 │    │    ├── select
 │    │    │    ├── stats: [rows=0.333333333]
 │    │    │    ├── cost: 2.2625
 │    │    │    ├── scan tab_263
 │    │    │    │    ├── stats: [rows=1]
 │    │    │    │    └── cost: 1.08
 │    │    │    └── filters
 │    │    │         └── l-shift [type=bool]
 │    │    │              ├── const: '71.95.110.189/7' [type=inet]
 │    │    │              └── plus [type=inet]
 │    │    │                   ├── minus [type=inet]
 │    │    │                   │    ├── bitand [type=inet]
 │    │    │                   │    │    ├── function: inet_client_addr [type=inet]
 │    │    │                   │    │    └── plus [type=inet]
 │    │    │                   │    │         ├── function: hostmask [type=inet]
 │    │    │                   │    │         │    └── subquery [type=inet]
 │    │    │                   │    │         │         └── project
 │    │    │                   │    │         │              ├── columns: col_474:43(inet!null)
 │    │    │                   │    │         │              ├── cardinality: [0 - 1]
 │    │    │                   │    │         │              ├── stats: [rows=0]
 │    │    │                   │    │         │              ├── cost: 1.1625
 │    │    │                   │    │         │              ├── key: ()
 │    │    │                   │    │         │              ├── fd: ()-->(43)
 │    │    │                   │    │         │              ├── limit
 │    │    │                   │    │         │              │    ├── columns: tab_265.f:30(float) tab_265.d:31(decimal) tab_266.z:41(bool)
 │    │    │                   │    │         │              │    ├── internal-ordering: +30,-31,-41
 │    │    │                   │    │         │              │    ├── cardinality: [0 - 1]
 │    │    │                   │    │         │              │    ├── stats: [rows=0]
 │    │    │                   │    │         │              │    ├── cost: 1.1525
 │    │    │                   │    │         │              │    ├── key: ()
 │    │    │                   │    │         │              │    ├── fd: ()-->(30,31,41)
 │    │    │                   │    │         │              │    ├── sort
 │    │    │                   │    │         │              │    │    ├── columns: tab_265.f:30(float) tab_265.d:31(decimal) tab_266.z:41(bool)
 │    │    │                   │    │         │              │    │    ├── stats: [rows=0]
 │    │    │                   │    │         │              │    │    ├── cost: 1.1425
 │    │    │                   │    │         │              │    │    ├── fd: ()~~>(41)
 │    │    │                   │    │         │              │    │    ├── ordering: +30,-31,-41
 │    │    │                   │    │         │              │    │    ├── prune: (30,31,41)
 │    │    │                   │    │         │              │    │    └── left-join
 │    │    │                   │    │         │              │    │         ├── columns: tab_265.f:30(float) tab_265.d:31(decimal) tab_266.z:41(bool)
 │    │    │                   │    │         │              │    │         ├── stats: [rows=0]
 │    │    │                   │    │         │              │    │         ├── cost: 1.1325
 │    │    │                   │    │         │              │    │         ├── fd: ()~~>(41)
 │    │    │                   │    │         │              │    │         ├── prune: (30,31,41)
 │    │    │                   │    │         │              │    │         ├── scan tab_265
 │    │    │                   │    │         │              │    │         │    ├── columns: tab_265.f:30(float) tab_265.d:31(decimal)
 │    │    │                   │    │         │              │    │         │    ├── stats: [rows=1]
 │    │    │                   │    │         │              │    │         │    ├── cost: 1.1
 │    │    │                   │    │         │              │    │         │    └── prune: (30,31)
 │    │    │                   │    │         │              │    │         ├── values
 │    │    │                   │    │         │              │    │         │    ├── columns: tab_266.z:41(bool)
 │    │    │                   │    │         │              │    │         │    ├── cardinality: [0 - 0]
 │    │    │                   │    │         │              │    │         │    ├── stats: [rows=0]
 │    │    │                   │    │         │              │    │         │    ├── cost: 0.01
 │    │    │                   │    │         │              │    │         │    ├── key: ()
 │    │    │                   │    │         │              │    │         │    ├── fd: ()-->(41)
 │    │    │                   │    │         │              │    │         │    └── prune: (41)
 │    │    │                   │    │         │              │    │         └── filters (true)
 │    │    │                   │    │         │              │    └── const: 1 [type=int]
 │    │    │                   │    │         │              └── projections
 │    │    │                   │    │         │                   └── const: '14.246.95.30/18' [type=inet]
 │    │    │                   │    │         └── const: 1 [type=int]
 │    │    │                   │    └── const: -9007439590206391659 [type=int]
 │    │    │                   └── const: 4658748795740290371 [type=int]
 │    │    ├── select
 │    │    │    ├── stats: [rows=0.333333333]
 │    │    │    ├── cost: 2.2625
 │    │    │    ├── scan tab_264
 │    │    │    │    ├── stats: [rows=1]
 │    │    │    │    └── cost: 1.08
 │    │    │    └── filters
 │    │    │         └── l-shift [type=bool]
 │    │    │              ├── const: '71.95.110.189/7' [type=inet]
 │    │    │              └── plus [type=inet]
 │    │    │                   ├── minus [type=inet]
 │    │    │                   │    ├── bitand [type=inet]
 │    │    │                   │    │    ├── function: inet_client_addr [type=inet]
 │    │    │                   │    │    └── plus [type=inet]
 │    │    │                   │    │         ├── function: hostmask [type=inet]
 │    │    │                   │    │         │    └── subquery [type=inet]
 │    │    │                   │    │         │         └── project
 │    │    │                   │    │         │              ├── columns: col_474:43(inet!null)
 │    │    │                   │    │         │              ├── cardinality: [0 - 1]
 │    │    │                   │    │         │              ├── stats: [rows=0]
 │    │    │                   │    │         │              ├── cost: 1.1625
 │    │    │                   │    │         │              ├── key: ()
 │    │    │                   │    │         │              ├── fd: ()-->(43)
 │    │    │                   │    │         │              ├── limit
 │    │    │                   │    │         │              │    ├── columns: tab_265.f:30(float) tab_265.d:31(decimal) tab_266.z:41(bool)
 │    │    │                   │    │         │              │    ├── internal-ordering: +30,-31,-41
 │    │    │                   │    │         │              │    ├── cardinality: [0 - 1]
 │    │    │                   │    │         │              │    ├── stats: [rows=0]
 │    │    │                   │    │         │              │    ├── cost: 1.1525
 │    │    │                   │    │         │              │    ├── key: ()
 │    │    │                   │    │         │              │    ├── fd: ()-->(30,31,41)
 │    │    │                   │    │         │              │    ├── sort
 │    │    │                   │    │         │              │    │    ├── columns: tab_265.f:30(float) tab_265.d:31(decimal) tab_266.z:41(bool)
 │    │    │                   │    │         │              │    │    ├── stats: [rows=0]
 │    │    │                   │    │         │              │    │    ├── cost: 1.1425
 │    │    │                   │    │         │              │    │    ├── fd: ()~~>(41)
 │    │    │                   │    │         │              │    │    ├── ordering: +30,-31,-41
 │    │    │                   │    │         │              │    │    ├── prune: (30,31,41)
 │    │    │                   │    │         │              │    │    └── left-join
 │    │    │                   │    │         │              │    │         ├── columns: tab_265.f:30(float) tab_265.d:31(decimal) tab_266.z:41(bool)
 │    │    │                   │    │         │              │    │         ├── stats: [rows=0]
 │    │    │                   │    │         │              │    │         ├── cost: 1.1325
 │    │    │                   │    │         │              │    │         ├── fd: ()~~>(41)
 │    │    │                   │    │         │              │    │         ├── prune: (30,31,41)
 │    │    │                   │    │         │              │    │         ├── scan tab_265
 │    │    │                   │    │         │              │    │         │    ├── columns: tab_265.f:30(float) tab_265.d:31(decimal)
 │    │    │                   │    │         │              │    │         │    ├── stats: [rows=1]
 │    │    │                   │    │         │              │    │         │    ├── cost: 1.1
 │    │    │                   │    │         │              │    │         │    └── prune: (30,31)
 │    │    │                   │    │         │              │    │         ├── values
 │    │    │                   │    │         │              │    │         │    ├── columns: tab_266.z:41(bool)
 │    │    │                   │    │         │              │    │         │    ├── cardinality: [0 - 0]
 │    │    │                   │    │         │              │    │         │    ├── stats: [rows=0]
 │    │    │                   │    │         │              │    │         │    ├── cost: 0.01
 │    │    │                   │    │         │              │    │         │    ├── key: ()
 │    │    │                   │    │         │              │    │         │    ├── fd: ()-->(41)
 │    │    │                   │    │         │              │    │         │    └── prune: (41)
 │    │    │                   │    │         │              │    │         └── filters (true)
 │    │    │                   │    │         │              │    └── const: 1 [type=int]
 │    │    │                   │    │         │              └── projections
 │    │    │                   │    │         │                   └── const: '14.246.95.30/18' [type=inet]
 │    │    │                   │    │         └── const: 1 [type=int]
 │    │    │                   │    └── const: -9007439590206391659 [type=int]
 │    │    │                   └── const: 4658748795740290371 [type=int]
 │    │    └── filters (true)
 │    └── const: 1 [type=int]
 └── projections
      └── const: 1.070774147135095 [type=float]

DETAIL: apply_join.go:274,plan_node_to_row_source.go:178,row_source_to_plan_node.go:78,insert.go:471,plan_batch.go:173,plan.go:496,walk.go:145,walk.go:615,walk.go:112,walk.go:76,plan.go:499,plan_node_to_row_source.go:124,processors.go:800,flow.go:624,distsql_running.go:252,distsql_running.go:839,conn_executor_exec.go:1100,conn_executor_exec.go:945,conn_executor_exec.go:457,conn_executor_exec.go:103,conn_executor.go:1178,conn_executor.go:429,conn.go:332
HINT: You have encountered an unexpected error inside CockroachDB.

Found with sqlsmith.

@maddyblue maddyblue added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Mar 11, 2019
@knz knz added the A-sql-optimizer SQL logical planning and optimizations. label Mar 11, 2019
@andy-kimball
Copy link
Contributor

@jordanlewis, will you take first look at this?

@jordanlewis
Copy link
Member

Yes, I already started to take a look this morning. Justin poked over and noticed that it's weird that the produced plan has 2 exact duplicate groups. I haven't discovered anything else yet.

Fundamentally though, the apply join code expects that there will not be any RHS expressions with actual subqueries - if that assumption is violated then I think it means that the optimizer is doing something unexpected up front.

@andy-kimball
Copy link
Contributor

But the optimizer never hoists uncorrelated subqueries; only correlated subqueries.

@jordanlewis
Copy link
Member

Right, but I thought that all uncorrelated subqueries (no matter how deep within the query tree) get transformed during the very first execbuild phase into top-level exec subqueries. If that were the case, then we wouldn't see any more top-level exec subqueries during the re-optimization phases.

@justinj
Copy link
Contributor

justinj commented Mar 11, 2019

I think Jordan is right that this isn't an execution thing—the fact that there's two identical groups (not just semantically, even the column ids are the same) indicates something has gone wrong in the optimizer. I suspect there's an incorrect transformation rule somewhere. I can investigate.

@justinj
Copy link
Contributor

justinj commented Mar 11, 2019

Smaller repro:

SELECT
	(SELECT * FROM (VALUES ((SELECT x FROM (VALUES (1)) AS s (x)) + y)))
FROM
	(VALUES (1), (2), (3)) AS t (y)

Jordan and I talked it over and we think this actually happens whenever the RHS of an apply join has an uncorrelated subquery. Subsequent execbuilds for the apply-join case need to know not to re-raise the subquery to the top-level and instead return the prior reference.

@andy-kimball
Copy link
Contributor

@jordanlewis and I talked about this some more today. I think for 19.1 it's OK to re-execute any uncorrelated subquery on each iteration of the apply loop, rather than trying to execute once. In a later release, we will most likely have the optimizer hoist uncorrelated subqueries into top-level With clauses, which would avoid this issue altogether.

Jordan, have you thought more about this? Would it be relatively easy to just re-execute each time, or are there other problems with that?

@jordanlewis
Copy link
Member

It'll be easy to re-execute each time. I went down a rabbit hole of trying to make the top level subqueries just get executed once. It was almost easy, but not quite. The solution I wanted to do was have a unique ID associated with each subquery that's passed along during the CopyAndReplace step, so that when we see a subquery during execbuild, we can replace it with the old subquery reference based on that unique id instead of execbuilding a new one.

But that was a little bit challenging because of this unique id generation stuff. I worked with Justin on it a bit and gave up. He said that it'll be easy to solve this particular issue when with support is available in opt. So once that's done, we'll be able to more easily remove the duplicate subquery evaluation - but for now we'll just do it on every row.

@andy-kimball
Copy link
Contributor

Sounds right to me.

craig bot pushed a commit that referenced this issue Mar 15, 2019
35711: sql: permit subqueries within RHS of applyjoin r=jordanlewis a=jordanlewis

Previously, the code assumed that subqueries would have already been
promoted to top-level subqueries and run exactly once by the
re-optimization phase of apply join. However, that promotion happens in
execbuild, not in the optimizer, and therefore won't have already
happened by the time that we go to execbuild the re-optimized RHS in
apply join.

For now, the solution is to re-run the subqueries in the RHS every time
the RHS is run. This is suboptimal because subqueries need only be run
once per query regardless of their position within the apply join tree.
However, setting this up currently is difficult at the moment and would
have required a more invasive change.

When WITH support is available, the optimizer will promote subqueries
into WITH clauses up front, at which point we will be able to safely
remove the requirement that apply join rerun subqueries on the RHS.

Closes #35594.

Release note: None

Co-authored-by: Jordan Lewis <[email protected]>
@craig craig bot closed this as completed in #35711 Mar 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-optimizer SQL logical planning and optimizations. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-sqlsmith
Projects
None yet
Development

No branches or pull requests

5 participants