Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: when grouping RBR table by primary key, non-grouping columns must be in aggregate functions #99444

Closed
michae2 opened this issue Mar 23, 2023 · 0 comments · Fixed by #101675
Assignees
Labels
A-multiregion Related to multi-region A-sql-optimizer SQL logical planning and optimizations. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-qa T-sql-queries SQL Queries Team

Comments

@michae2
Copy link
Collaborator

michae2 commented Mar 23, 2023

When grouping a table by its primary key, we usually allow non-grouping columns in the SELECT clause without aggregation functions (because there is only a single row per group).

(The following uses cockroach demo --global --nodes 9 --multitenant=false --insecure.)

CREATE TABLE a (a INT PRIMARY KEY, b INT);
INSERT INTO a VALUES (0, 0), (1, 1), (2, 2);
-- we allow b without an aggregation function
SELECT a, b FROM a GROUP BY a;

This is still true when the table has LOCALITY REGIONAL BY TABLE:

SHOW REGIONS;
ALTER DATABASE defaultdb PRIMARY REGION "us-east1";
ALTER DATABASE defaultdb ADD REGION "europe-west1";
ALTER DATABASE defaultdb ADD REGION "us-west1";
SHOW CREATE DATABASE defaultdb;
-- should now have LOCALITY REGIONAL BY TABLE
SHOW CREATE TABLE a;
-- still works
SELECT a, b FROM a GROUP BY a;

But if we change the table to LOCALITY REGIONAL BY ROW then b is no longer allowed outside of an aggregation function, even if we constrain the query to a single region:

ALTER TABLE a ADD COLUMN region crdb_internal_region AS (
  CASE WHEN a % 3 = 0 THEN 'europe-west1'
       WHEN a % 3 = 1 THEN 'us-east1'
       WHEN a % 3 = 2 THEN 'us-west1'
  END
) STORED;
ALTER TABLE a ALTER COLUMN region SET NOT NULL;
ALTER TABLE a SET LOCALITY REGIONAL BY ROW AS region;
-- should now have LOCALITY REGIONAL BY ROW
SHOW CREATE TABLE a;
-- these all fail
SELECT a, b FROM a GROUP BY a;
SELECT a, b FROM a WHERE a = 0 GROUP BY a;
SELECT a, b FROM a WHERE a % 3 = 0 GROUP BY a;

The failures:

[email protected]:26257/defaultdb> SELECT a, b FROM a GROUP BY a;
ERROR: column "b" must appear in the GROUP BY clause or be used in an aggregate function
SQLSTATE: 42803
[email protected]:26257/defaultdb> SELECT a, b FROM a WHERE a = 0 GROUP BY a;
ERROR: column "b" must appear in the GROUP BY clause or be used in an aggregate function
SQLSTATE: 42803
[email protected]:26257/defaultdb> SELECT a, b FROM a WHERE a % 3 = 0 GROUP BY a;
ERROR: column "b" must appear in the GROUP BY clause or be used in an aggregate function
SQLSTATE: 42803

Jira issue: CRDB-25879

@michae2 michae2 added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-qa A-sql-optimizer SQL logical planning and optimizations. A-multiregion Related to multi-region T-sql-queries SQL Queries Team labels Mar 23, 2023
@msirek msirek self-assigned this Apr 17, 2023
msirek pushed a commit to msirek/cockroach that referenced this issue Apr 17, 2023
…ique

This extends support for queries such as this pseudo-SQL:
```
SELECT * FROM t1 GROUP BY pk_cols;
```
To also support:
```
SELECT * FROM t1 GROUP BY unique_index_non_nullable_key_cols;
SELECT * FROM t1 GROUP BY unique_without_index_non_nullable_key_cols;
```
All unique index or unique constraint columns must be not nullable for
the query to not error out.

Fixes: cockroachdb#99444

Release note (sql change): This patch adds support for non-aggregate
expressions involving columns outside of grouping columns when the
grouping column include all key columns of a unique index and those key
columns are not nullable.
msirek pushed a commit to msirek/cockroach that referenced this issue Apr 18, 2023
…ique

This extends support for queries such as this pseudo-SQL:
```
SELECT * FROM t1 GROUP BY pk_cols;
```
To also support:
```
SELECT * FROM t1 GROUP BY unique_index_non_nullable_key_cols;
SELECT * FROM t1 GROUP BY unique_without_index_non_nullable_key_cols;
```
All unique index or unique constraint columns must be not nullable for
the query to not error out.

Fixes: cockroachdb#99444

Release note (sql change): This patch adds support for non-aggregate
expressions involving columns outside of the grouping columns when the
grouping columns include all key columns of a unique index and those key
columns are not nullable.
craig bot pushed a commit that referenced this issue Apr 18, 2023
101675: optbuilder: allow non-aggregate expressions when GROUP BY cols are unique r=michae2 a=msirek

This extends support for queries such as this pseudo-SQL: 
```
SELECT * FROM t1 GROUP BY pk_cols;
```
To also support:
```
SELECT * FROM t1 GROUP BY unique_index_non_nullable_key_cols;
SELECT * FROM t1 GROUP BY unique_without_index_non_nullable_key_cols;
```
All unique index or unique constraint columns must be not nullable for  
the query to not error out.

Fixes: #99444 

Release note (sql change): This patch adds support for non-aggregate
expressions involving columns outside of the grouping columns when the
grouping columns include all key columns of a unique index and those key
columns are not nullable.

Co-authored-by: Mark Sirek <[email protected]>
@craig craig bot closed this as completed in 080fd89 Apr 19, 2023
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multiregion Related to multi-region A-sql-optimizer SQL logical planning and optimizations. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-qa T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants