Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/tests: TestRandomSyntaxSQLSmith failed [inner-join RelExpr children have intersecting columns] #114703

Closed
cockroach-teamcity opened this issue Nov 19, 2023 · 4 comments · Fixed by #115142
Assignees
Labels
branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 C-test-failure Broken test (automatically or manually discovered). GA-blocker O-robot Originated from a bot. O-rsg Random Syntax Generator T-sql-queries SQL Queries Team
Milestone

Comments

@cockroach-teamcity
Copy link
Member

cockroach-teamcity commented Nov 19, 2023

sql/tests.TestRandomSyntaxSQLSmith failed with artifacts on release-23.2 @ c9f6f30496fe34af8d17410e5307f359458aaf52:

Random syntax error:

    rsg_test.go:911: Crash detected: server panic: pq: internal error: inner-join RelExpr children have intersecting columns: (42,50,63,64)

Query:

        SELECT
        	'1976-11-23':::DATE AS " col885550",
        	var_pop(tab391036.crdb_internal_mvcc_timestamp::DECIMAL)::DECIMAL AS col885551,
        	stddev((-0.07257373532581902993):::DECIMAL::INT8)::DECIMAL AS col͇885552,
        	tab391036."co l1_0" AS col885553
        FROM
        	defaultdb.public.seed@[0] AS "%qtab391035"
        	JOIN defaultdb.public."tab'le1"@"tab'le1_expr_idx" AS tab391036 ON
        			("%qtab391035".tableoid) = (tab391036.tableoid)
        			AND ("%qtab391035".crdb_internal_mvcc_timestamp) = (tab391036.crdb_internal_mvcc_timestamp)
        WHERE
        	(SELECT 884958838:::OID AS " col885549" FROM defaultdb.public."tab'le1"@"tab'le1_expr_co l1_0_idx" AS "_�tab391040", defaultdb.public.seed@[0] AS "ta""b391041" ORDER BY "ta""b391041"._int2 DESC NULLS FIRST, "ta""b391041"._interval NULLS FIRST LIMIT 1:::INT8)::OID
        	IN (
        			WITH
        				with159481 (col885545)
        					AS (
        						SELECT * FROM (VALUES (e'\U00074537' COLLATE de_DE), (NULL)) AS tab391037 (col885545)
        						INTERSECT ALL
        							SELECT
        								*
        							FROM
        								(
        									VALUES
        										(text((-876764555.4284667560):::DECIMAL::DECIMAL)::STRING COLLATE de_DE),
        										(e'\U0008831A' COLLATE de_DE),
        										(e'\U0005AFE8\U0008DCBA\U000D391A' COLLATE de_DE)
        								)
        									AS tab391038 (col885546)
        					),
        				with159482 ("coL885547")
        					AS (
        						SELECT * FROM (VALUES ('1988-09-26':::DATE), (NULL), (NULL)) AS "tab\\uDC40391039" ("coL885547")
        					)
        			SELECT
        				tab391036.tableoid::OID AS c̃ol̼😜885548
        		)
        GROUP BY
        	"%qtab391035"._decimal, "%qtab391035"._date, tab391036.crdb_internal_mvcc_timestamp, tab391036."co l1_0"
        HAVING
        	bool_and("%qtab391035"._bool::BOOL)::BOOL
        ORDER BY
        	"%qtab391035"._decimal ASC, "%qtab391035"._date DESC NULLS FIRST
        LIMIT
        	26:::INT8;

Schema:

    rsg_test.go:720: To reproduce, use schema:
    rsg_test.go:722: SET sql_safe_updates = false;;
    rsg_test.go:722: SET CLUSTER SETTING sql.stats.automatic_collection.enabled = false;;
    rsg_test.go:722: SET CLUSTER SETTING sql.stats.histogram_collection.enabled = false;;
    rsg_test.go:722: CREATE TABLE "tab'le1" ("co l1_0" NAME, col1_1 TIMETZ NOT NULL, "c ol1_2" INT2 NOT NULL, col1_3 STRING NOT NULL AS (lower("co l1_0")) VIRTUAL, PRIMARY KEY (col1_3 ASC, "c ol1_2" ASC), INDEX (("c ol1_2" + 17774:::INT8) ASC, "co l1_0" DESC), UNIQUE (col1_1 DESC, lower(CAST(col1_1 AS STRING)) ASC, col1_3) WHERE "tab'le1"."co l1_0" < '':::STRING:::NAME, INDEX (col1_1 ASC, ("c ol1_2" + (-26953):::INT8), col1_3 ASC) WHERE (("tab'le1".col1_3 != 'X':::STRING) AND ("tab'le1"."c ol1_2" = 1:::INT8)) OR ("tab'le1".col1_1 < '00:00:00+15:59':::TIMETZ) VISIBILITY 0.30, INDEX (lower("co l1_0") DESC) STORING (col1_1) WHERE ((("tab'le1".col1_1 != '24:00:00-15:59':::TIMETZ) OR ("tab'le1"."co l1_0" <= 'X':::STRING:::NAME)) OR ("tab'le1".col1_3 < e'\U00002603':::STRING)) OR ("tab'le1"."c ol1_2" >= (-32768):::INT8), UNIQUE ("c ol1_2" DESC, "co l1_0" ASC) WHERE ("tab'le1".col1_1 > '24:00:00-15:59':::TIMETZ) AND ("tab'le1"."co l1_0" = e'\x00':::STRING:::NAME), UNIQUE ("c ol1_2" ASC, col1_3 ASC, col1_1 ASC), FAMILY ("c ol1_2", col1_1, "co l1_0"));
    rsg_test.go:722: ALTER TABLE "tab'le1" INJECT STATISTICS e'[{"avg_size": 23, "columns": ["\\"co l1_0\\""], "created_at": "2000-01-01 00:00:00+00:00", "distinct_count": 881177514894877516, "histo_col_type": "", "name": "__auto__", "null_count": 1382871461899841653, "row_count": 2482386356606809074}, {"avg_size": 25, "columns": ["col1_1"], "created_at": "2000-01-01 00:00:00+00:00", "distinct_count": 1325107802613261019, "histo_buckets": [{"distinct_range": 0, "num_eq": 0, "num_range": 0, "upper_bound": "09:32:44.209818+06:54"}, {"distinct_range": 455092794166317200, "num_eq": 300, "num_range": 4025135702079162861, "upper_bound": "05:28:20.158276+00:32"}, {"distinct_range": 0, "num_eq": 3015553724450640402, "num_range": 7561892057817281519, "upper_bound": "18:44:36.635038+12:41"}, {"distinct_range": 2303590125519348500, "num_eq": 5664224463233182992, "num_range": 2303590125519348549, "upper_bound": "06:49:58.228979+00:11"}, {"distinct_range": 3000000000, "num_eq": 3000000000, "num_range": 3000000000, "upper_bound": "22:15:53.024518-07:15"}, {"distinct_range": 0, "num_eq": 523283071033843517, "num_range": 9019577961766724534, "upper_bound": "24:00:00-15:59"}], "histo_col_type": "TIMETZ", "name": "__auto__", "null_count": 0, "row_count": 2482386356606809074}, {"avg_size": 12, "columns": ["\\"c ol1_2\\""], "created_at": "2000-01-01 00:00:00+00:00", "distinct_count": 703602587645114629, "histo_col_type": "INT2", "name": "__auto__", "null_count": 0, "row_count": 2482386356606809074}, {"avg_size": 7, "columns": ["col1_3"], "created_at": "2000-01-01 00:00:00+00:00", "distinct_count": 325547840088071858, "histo_buckets": [{"distinct_range": 0, "num_eq": 4794843170231017593, "num_range": 0, "upper_bound": ""}, {"distinct_range": 458.1811842957447, "num_eq": 1333789801729865774, "num_range": 600, "upper_bound": "\\u0000"}, {"distinct_range": 25146.153969309224, "num_eq": 2792342344167939272, "num_range": 40000, "upper_bound": "XN"}], "histo_col_type": "STRING", "name": "__auto__", "null_count": 0, "row_count": 2482386356606809074}]':::JSONB;
    rsg_test.go:722: CREATE TYPE rand_typ_0 AS ENUM ('bjsaf', 'xxbcdn', 'mobyq', 'b', 'zvzwzh');
    rsg_test.go:722: CREATE TYPE rand_typ_1 AS ENUM ('w', 'ahy', 'lnc');
    rsg_test.go:722: SET sql_safe_updates = false;;
    rsg_test.go:722: 
        BEGIN; CREATE TYPE greeting AS ENUM ('hello', 'howdy', 'hi', 'good day', 'morning'); COMMIT;
        BEGIN;
        CREATE TABLE IF NOT EXISTS seed AS
        	SELECT
        		g::INT2 AS _int2,
        		g::INT4 AS _int4,
        		g::INT8 AS _int8,
        		g::FLOAT4 AS _float4,
        		g::FLOAT8 AS _float8,
        		'2001-01-01'::DATE + g AS _date,
        		'2001-01-01'::TIMESTAMP + g * '1 day'::INTERVAL AS _timestamp,
        		'2001-01-01'::TIMESTAMPTZ + g * '1 day'::INTERVAL AS _timestamptz,
        		g * '1 day'::INTERVAL AS _interval,
        		g % 2 = 1 AS _bool,
        		g::DECIMAL AS _decimal,
        		g::STRING AS _string,
        		g::STRING::BYTES AS _bytes,
        		substring('00000000-0000-0000-0000-' || g::STRING || '00000000000', 1, 36)::UUID AS _uuid,
        		'0.0.0.0'::INET + g AS _inet,
        		g::STRING::JSONB AS _jsonb,
        		enum_range('hello'::greeting)[g] as _enum
        	FROM
        		generate_series(1, 5) AS g;
        COMMIT;
        
        INSERT INTO seed DEFAULT VALUES;
        CREATE INDEX on seed (_int8, _float8, _date);
        CREATE INVERTED INDEX on seed (_jsonb);
        ;
    rsg_test.go:722: ALTER TABLE seed SPLIT AT VALUES (8594355245580624359);
    rsg_test.go:722: ALTER TABLE seed SPLIT AT VALUES (7814313793381395637);
    rsg_test.go:722: ALTER TABLE seed SPLIT AT VALUES (452728068839437155);
    rsg_test.go:722: ALTER TABLE seed SPLIT AT VALUES (8289981896376721186);
    rsg_test.go:722: ALTER TABLE seed SPLIT AT VALUES (1102571324373106807);
    rsg_test.go:722: ALTER TABLE seed SPLIT AT VALUES (9120260909951863919);
    rsg_test.go:722: ALTER TABLE seed SPLIT AT VALUES (1368567861377401943);
    rsg_test.go:722: ALTER TABLE seed SPLIT AT VALUES (505761125082940084);
    rsg_test.go:722: ALTER TABLE seed SPLIT AT VALUES (3764695915680452516);
    rsg_test.go:722: ALTER TABLE seed SCATTER;
    rsg_test.go:724: 
    rsg_test.go:725: -- test log scope end --
test logs left over in: /artifacts/tmp/_tmp/d437d2c847dfedbc4972f231c3331c8e/logTestRandomSyntaxSQLSmith513922758
--- FAIL: TestRandomSyntaxSQLSmith (377.67s)
Help

See also: How To Investigate a Go Test Failure (internal)

/cc @cockroachdb/sql-foundations

This test on roachdash | Improve this report!

Jira issue: CRDB-33641

@cockroach-teamcity cockroach-teamcity added branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 C-test-failure Broken test (automatically or manually discovered). O-robot Originated from a bot. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) labels Nov 19, 2023
@cockroach-teamcity cockroach-teamcity added this to the 23.2 milestone Nov 19, 2023
@rafiss rafiss changed the title sql/tests: TestRandomSyntaxSQLSmith failed sql/tests: TestRandomSyntaxSQLSmith failed [inner-join RelExpr children have intersecting columns] Nov 20, 2023
@rafiss rafiss added O-rsg Random Syntax Generator T-sql-queries SQL Queries Team and removed T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) labels Nov 20, 2023
@github-project-automation github-project-automation bot moved this to Triage in SQL Queries Nov 20, 2023
@mgartner
Copy link
Collaborator

Reduced reproduction:

CREATE TABLE t (
  k INT PRIMARY KEY,
  a INT
);

SELECT NULL
FROM t AS t1
JOIN t AS t2 ON t1.a = t2.a
WHERE t2.a = (SELECT 0:::OID FROM t);

@mgartner
Copy link
Collaborator

This reproduces on the release-23.2 branch, but not on release-23.1. I'll continue to investigate.

@mgartner
Copy link
Collaborator

This was introduced in #100881. It can be reproduced in v23.1.x versions by setting optimizer_hoist_uncorrelated_equality_subqueries=true. This setting is disabled by default in v23.1 and enabled by default in v23.2.

The problem is that the filter with the subquery is pushed into both sides of the join, then, later, both pushed subqueries are hoisted and after several more normalization rules apply, the children of a join produce columns with the same ID. It is similar to the problem described in #100915, but this is a new path to achieve the same issue.

I'm not yet sure what the implications of this are, but its not uncommon for overlapping column IDs to cause incorrect query plans:

// An operator is allowed to reuse some or all of the column ids of an input if:
//
// 1. For every output row, there exists at least one input row having identical
// values for those columns.
// 2. OR if no such input row exists, there is at least one output row having
// NULL values for all those columns (e.g. when outer join NULL-extends).
//
// For example, is it safe for a Select to use its input's column ids because it
// only filters rows. Likewise, pass-through column ids of a Project can be
// reused.
//
// For an example where columns cannot be reused, consider the query:
//
// SELECT * FROM a AS l JOIN a AS r ON (l.x = r.y)
//
// In this query, `l.x` is not equivalent to `r.x` and `l.y` is not equivalent
// to `r.y`. Therefore, we need to give these columns different ids.

So it's definitely not something to take lightly. I have a few possible mitigations in mind:

  1. Assign new column IDs to entire subquery expressions when they are pushed into two sides of a join. This is likely difficult.
  2. Disallow hoisting a subquery that has been duplicated, or allow hoisting a logical subquery once and only once.
  3. When hoisting a subquery that has been duplicated, create a new column to project the subquery result, and prevent the new column from being normalized away. This might actually prevent the very optimizations that hoisting the subquery is intended to allow, so this might not work.

@mgartner
Copy link
Collaborator

I don't think this should block the beta, but I do think there is some risk for the GA release here, so I'm labelling a GA blocker.

@mgartner mgartner added GA-blocker and removed release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Nov 20, 2023
@mgartner mgartner self-assigned this Nov 20, 2023
@mgartner mgartner moved this from Triage to Active in SQL Queries Nov 20, 2023
craig bot pushed a commit that referenced this issue Nov 29, 2023
115142: opt: hoist uncorrelated subqueries at most once r=mgartner a=mgartner

Since #100881, the optimizer has hoisted uncorrelated subqueries used in
an equality expression (only when the
`optimizer_hoist_uncorrelated_equality_subqueries` session setting is
enabled). This can cause problems when the hoisted subquery has been
duplicated in the expression tree, e.g., when pushing a filter into both
sides of a join.

A subquery is a scalar expression, so the columns of its child
expression are never emitted from the subquery. This makes it safe
duplicate a subquery in an expression tree. However, when a subquery is
hoisted, it is transformed into a join which can produce the columns of
the child expression. Hoisting the same subquery multiple times can
produce query plans with duplicate column IDs in two logically different
expressions. This can lead to incorrect query plans (see the comment for
`opt.Metadata`), as well as produce expressions with children that have
intersecting column IDs (after additional normalization rules fire).

To avoid these dangers, this commit ensures that each unique subquery is
hoisted at most once. This will prevent bad plans, but it may not
inhibit the optimizer from finding optimal plans. In the future, it may
be possible to lift this restriction by generating new column IDs for
uncorrelated subqueries each time they are hoisted.

Fixes #114703

There is no release note because the session setting enabling this bug
is disabled by default, and because the possible correctness bug is
theoretical - we have not found a reproduction of a correctness bug, but
it could exist in theory.

Release note: None


Co-authored-by: Marcus Gartner <[email protected]>
@craig craig bot closed this as completed in 4c4e5a4 Nov 29, 2023
@github-project-automation github-project-automation bot moved this from Active to Done in SQL Queries Nov 29, 2023
blathers-crl bot pushed a commit that referenced this issue Nov 29, 2023
Since #100881, the optimizer has hoisted uncorrelated subqueries used in
an equality expression (only when the
`optimizer_hoist_uncorrelated_equality_subqueries` session setting is
enabled). This can cause problems when the hoisted subquery has been
duplicated in the expression tree, e.g., when pushing a filter into both
sides of a join.

A subquery is a scalar expression, so the columns of its child
expression are never emitted from the subquery. This makes it safe
duplicate a subquery in an expression tree. However, when a subquery is
hoisted, it is transformed into a join which can produce the columns of
the child expression. Hoisting the same subquery multiple times can
produce query plans with duplicate column IDs in two logically different
expressions. This can lead to incorrect query plans (see the comment for
`opt.Metadata`), as well as produce expressions with children that have
intersecting column IDs (after additional normalization rules fire).

To avoid these dangers, this commit ensures that each unique subquery is
hoisted at most once. This will prevent bad plans, but it may not
inhibit the optimizer from finding optimal plans. In the future, it may
be possible to lift this restriction by generating new column IDs for
uncorrelated subqueries each time they are hoisted.

Fixes #114703

There is no release note because the session setting enabling this bug
is disabled by default, and because the possible correctness bug is
theoretical - we have not found a reproduction of a correctness bug, but
it could exist in theory.

Release note: None
blathers-crl bot pushed a commit that referenced this issue Nov 29, 2023
Since #100881, the optimizer has hoisted uncorrelated subqueries used in
an equality expression (only when the
`optimizer_hoist_uncorrelated_equality_subqueries` session setting is
enabled). This can cause problems when the hoisted subquery has been
duplicated in the expression tree, e.g., when pushing a filter into both
sides of a join.

A subquery is a scalar expression, so the columns of its child
expression are never emitted from the subquery. This makes it safe
duplicate a subquery in an expression tree. However, when a subquery is
hoisted, it is transformed into a join which can produce the columns of
the child expression. Hoisting the same subquery multiple times can
produce query plans with duplicate column IDs in two logically different
expressions. This can lead to incorrect query plans (see the comment for
`opt.Metadata`), as well as produce expressions with children that have
intersecting column IDs (after additional normalization rules fire).

To avoid these dangers, this commit ensures that each unique subquery is
hoisted at most once. This will prevent bad plans, but it may not
inhibit the optimizer from finding optimal plans. In the future, it may
be possible to lift this restriction by generating new column IDs for
uncorrelated subqueries each time they are hoisted.

Fixes #114703

There is no release note because the session setting enabling this bug
is disabled by default, and because the possible correctness bug is
theoretical - we have not found a reproduction of a correctness bug, but
it could exist in theory.

Release note: None
mgartner added a commit that referenced this issue Nov 30, 2023
Since #100881, the optimizer has hoisted uncorrelated subqueries used in
an equality expression (only when the
`optimizer_hoist_uncorrelated_equality_subqueries` session setting is
enabled). This can cause problems when the hoisted subquery has been
duplicated in the expression tree, e.g., when pushing a filter into both
sides of a join.

A subquery is a scalar expression, so the columns of its child
expression are never emitted from the subquery. This makes it safe
duplicate a subquery in an expression tree. However, when a subquery is
hoisted, it is transformed into a join which can produce the columns of
the child expression. Hoisting the same subquery multiple times can
produce query plans with duplicate column IDs in two logically different
expressions. This can lead to incorrect query plans (see the comment for
`opt.Metadata`), as well as produce expressions with children that have
intersecting column IDs (after additional normalization rules fire).

To avoid these dangers, this commit ensures that each unique subquery is
hoisted at most once. This will prevent bad plans, but it may not
inhibit the optimizer from finding optimal plans. In the future, it may
be possible to lift this restriction by generating new column IDs for
uncorrelated subqueries each time they are hoisted.

Fixes #114703

There is no release note because the session setting enabling this bug
is disabled by default, and because the possible correctness bug is
theoretical - we have not found a reproduction of a correctness bug, but
it could exist in theory.

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-release-23.2 Used to mark GA and release blockers, technical advisories, and bugs for 23.2 C-test-failure Broken test (automatically or manually discovered). GA-blocker O-robot Originated from a bot. O-rsg Random Syntax Generator T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

3 participants