Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DISTINCT ON query fails when specifying nulls order #90763

Closed
soupi opened this issue Oct 27, 2022 · 1 comment · Fixed by #93567
Closed

DISTINCT ON query fails when specifying nulls order #90763

soupi opened this issue Oct 27, 2022 · 1 comment · Fixed by #93567
Assignees
Labels
A-tools-hasura C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. E-quick-win Likely to be a quick win for someone experienced. O-community Originated from the community T-sql-queries SQL Queries Team

Comments

@soupi
Copy link

soupi commented Oct 27, 2022

Describe the problem

When trying to run a query with a DISTINCT ON clause and ORDER BY containing ASC NULLS LAST or DESC NULLS FIRST, the query fails with the following error:

SELECT DISTINCT ON expressions must match initial ORDER BY expressions

Note that the query works when specifying ASC, DESC, ASC NULLS FIRST or DESC NULLS LAST.

To Reproduce

DDL:

create table author (id int primary key, name text, genre text);

 insert into author values 
   (1, 'Alice', 'Action'),
   (2, 'Bob', 'Biography'),
   (3, 'Carol', 'Crime'),
   (4, 'Dave', 'Action'),
   (5, 'Eve', 'Crime'),
   (6, 'Bart', null);

SQL:

SELECT
  DISTINCT ON ("genre") *
FROM
  "public"."author"
ORDER BY
  "genre" ASC NULLS LAST,
  "id" ASC NULLS LAST;

Note that the query doesn't work with DESC NULLS FIRST either, but does work when specifying ASC, DESC, ASC NULLS FIRST or DESC NULLS LAST.

Expected behavior

Query succeeds with the following results:

 id | name  |   genre
----+-------+-----------
  1 | Alice | Action
  2 | Bob   | Biography
  3 | Carol | Crime
  6 | Bart  |
(4 rows)

Environment:

  • CockroachDB version: arm64-v22.2.0-beta.4-106-g153c346d06
  • Server OS: MacOS arm64
  • Client app: psql

Jira issue: CRDB-20934
Epic CRDB-11916

@soupi soupi added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Oct 27, 2022
@blathers-crl
Copy link

blathers-crl bot commented Oct 27, 2022

Hello, I am Blathers. I am here to help you get the issue triaged.

Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.

I was unable to automatically find someone to ping.

If we have not gotten back to your issue within a few business days, you can try the following:

  • Join our community slack channel and ask on #cockroachdb.
  • Try find someone from here if you know they worked closely on the area and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-untriaged blathers was unable to find an owner labels Oct 27, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Oct 27, 2022
@rytaft rytaft added the E-quick-win Likely to be a quick win for someone experienced. label Oct 27, 2022
@yuzefovich yuzefovich removed the X-blathers-untriaged blathers was unable to find an owner label Oct 28, 2022
@rytaft rytaft self-assigned this Dec 14, 2022
craig bot pushed a commit that referenced this issue Dec 14, 2022
93567: opt: add support for NULLS LAST with DISTINCT ON r=rytaft a=rytaft

This commit adds an exception for the requirement that the `ORDER BY` clause of a `DISTINCT ON` query must contain only columns in the `ON` clause. Specifically, it allows expressions of the form `col IS NULL`, where `col` is one of the `ON` columns. This is needed to support `NULLS LAST`, since we implement `NULLS LAST` by synthesizing a `col IS NULL` ordering column.

The approach feels a bit hacky, but it seems like the smallest, lowest risk option available.

Fixes #90763

Release note (bug fix): Fixed an issue where `DISTINCT ON` queries would fail with the error "SELECT DISTINCT ON expressions must match initial ORDER BY expressions" when the query included an `ORDER BY` clause containing `ASC NULLS LAST` or `DESC NULLS FIRST`.

Co-authored-by: Rebecca Taft <[email protected]>
@craig craig bot closed this as completed in 3366389 Dec 14, 2022
blathers-crl bot pushed a commit that referenced this issue Dec 14, 2022
This commit adds an exception for the requirement that the ORDER
BY clause of a DISTINCT ON query must contain only columns in the
ON clause. Specifically, it allows expressions of the form
`col IS NULL`, where col is one of the ON columns. This is needed
to support NULLS LAST, since we implement NULLS LAST by synthesizing
a `col IS NULL` ordering column.

The approach feels a bit hacky, but it seems like the smallest,
lowest risk option available.

Fixes #90763

Release note (bug fix): Fixed an issue where DISTINCT ON queries
would fail with the error "SELECT DISTINCT ON expressions must
match initial ORDER BY expressions" when the query included an
ORDER BY clause containing ASC NULLS LAST or DESC NULLS FIRST.
blathers-crl bot pushed a commit that referenced this issue Dec 14, 2022
This commit adds an exception for the requirement that the ORDER
BY clause of a DISTINCT ON query must contain only columns in the
ON clause. Specifically, it allows expressions of the form
`col IS NULL`, where col is one of the ON columns. This is needed
to support NULLS LAST, since we implement NULLS LAST by synthesizing
a `col IS NULL` ordering column.

The approach feels a bit hacky, but it seems like the smallest,
lowest risk option available.

Fixes #90763

Release note (bug fix): Fixed an issue where DISTINCT ON queries
would fail with the error "SELECT DISTINCT ON expressions must
match initial ORDER BY expressions" when the query included an
ORDER BY clause containing ASC NULLS LAST or DESC NULLS FIRST.
rytaft added a commit that referenced this issue Dec 15, 2022
This commit adds an exception for the requirement that the ORDER
BY clause of a DISTINCT ON query must contain only columns in the
ON clause. Specifically, it allows expressions of the form
`col IS NULL`, where col is one of the ON columns. This is needed
to support NULLS LAST, since we implement NULLS LAST by synthesizing
a `col IS NULL` ordering column.

The approach feels a bit hacky, but it seems like the smallest,
lowest risk option available.

Fixes #90763

Release note (bug fix): Fixed an issue where DISTINCT ON queries
would fail with the error "SELECT DISTINCT ON expressions must
match initial ORDER BY expressions" when the query included an
ORDER BY clause containing ASC NULLS LAST or DESC NULLS FIRST.

sql/logictest: fix flake in TestLogic_distinct_on

Make sure the query results are deterministic.

Fixes #93719

Release note: None
rytaft added a commit that referenced this issue Dec 15, 2022
This commit adds an exception for the requirement that the ORDER
BY clause of a DISTINCT ON query must contain only columns in the
ON clause. Specifically, it allows expressions of the form
`col IS NULL`, where col is one of the ON columns. This is needed
to support NULLS LAST, since we implement NULLS LAST by synthesizing
a `col IS NULL` ordering column.

The approach feels a bit hacky, but it seems like the smallest,
lowest risk option available.

Fixes #90763

Release note (bug fix): Fixed an issue where DISTINCT ON queries
would fail with the error "SELECT DISTINCT ON expressions must
match initial ORDER BY expressions" when the query included an
ORDER BY clause containing ASC NULLS LAST or DESC NULLS FIRST.
rytaft added a commit that referenced this issue Dec 15, 2022
This commit adds an exception for the requirement that the ORDER
BY clause of a DISTINCT ON query must contain only columns in the
ON clause. Specifically, it allows expressions of the form
`col IS NULL`, where col is one of the ON columns. This is needed
to support NULLS LAST, since we implement NULLS LAST by synthesizing
a `col IS NULL` ordering column.

The approach feels a bit hacky, but it seems like the smallest,
lowest risk option available.

Fixes #90763

Release note (bug fix): Fixed an issue where DISTINCT ON queries
would fail with the error "SELECT DISTINCT ON expressions must
match initial ORDER BY expressions" when the query included an
ORDER BY clause containing ASC NULLS LAST or DESC NULLS FIRST.
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
rytaft added a commit to rytaft/cockroach that referenced this issue Jul 29, 2023
This commit fixes an error that could occur in the optbuilder when planning a
query with DISTINCT ON and a non-standard nulls ordering.

The optimizer supports queries with a non-standard nulls ordering by projecting
a column with the expression (col IS NULL) and adding it to the ordering. Since
we require that DISTINCT ON columns must be the prefix of any ordering columns,
we must account for the new ordering column when building DISTINCT ON. A
previous bug fix for cockroachdb#90763 caused the new column to be simply ignored when
building DISTINCT ON, but this was insufficient. We need to actually include
the new column among the DISTINCT ON columns. This commit makes that change.

Fixes cockroachdb#107839

Release note (bug fix): Fixed a spurious error "no data source matches prefix"
that could occur during planning for a query with DISTINCT ON and ORDER BY ASC
NULLS LAST or ORDER BY DESC NULLS FIRST.
rytaft added a commit to rytaft/cockroach that referenced this issue Aug 7, 2023
This commit fixes an error that could occur in the optbuilder when planning a
query with DISTINCT ON and a non-standard nulls ordering.

The optimizer supports queries with a non-standard nulls ordering by projecting
a column with the expression (col IS NULL) and adding it to the ordering. Since
we require that DISTINCT ON columns must be the prefix of any ordering columns,
we must account for the new ordering column when building DISTINCT ON. A
previous bug fix for cockroachdb#90763 caused the new column to be simply ignored when
building DISTINCT ON, but this was insufficient. We need to actually include
the new column among the DISTINCT ON columns. This commit makes that change.

Fixes cockroachdb#107839

Release note (bug fix): Fixed a spurious error "no data source matches prefix"
that could occur during planning for a query with DISTINCT ON and ORDER BY ASC
NULLS LAST or ORDER BY DESC NULLS FIRST.
craig bot pushed a commit that referenced this issue Aug 7, 2023
107842: opt: fix error with DISTINCT ON and ORDER BY ASC NULLS LAST r=rytaft a=rytaft

This commit fixes an error that could occur in the optbuilder when planning a query with `DISTINCT ON` and a non-standard nulls ordering.

The optimizer supports queries with a non-standard nulls ordering by projecting a column with the expression `(col IS NULL)` and adding it to the ordering. Since we require that `DISTINCT ON` columns must be the prefix of any ordering columns, we must account for the new ordering column when building `DISTINCT ON`. A previous bug fix for #90763 caused the new column to be simply ignored when building `DISTINCT ON`, but this was insufficient. We need to actually include the new column among the `DISTINCT ON` columns. This commit makes that change.

Fixes #107839

Release note (bug fix): Fixed a spurious error "no data source matches prefix" that could occur during planning for a query with `DISTINCT ON` and `ORDER BY ASC NULLS LAST` or `ORDER BY DESC NULLS FIRST`.

108281: batcheval: remove stray log line r=dt a=dt

Release note: none.
Epic: none.

Fixes #108275.

Co-authored-by: Rebecca Taft <[email protected]>
Co-authored-by: David Taylor <[email protected]>
blathers-crl bot pushed a commit that referenced this issue Aug 7, 2023
This commit fixes an error that could occur in the optbuilder when planning a
query with DISTINCT ON and a non-standard nulls ordering.

The optimizer supports queries with a non-standard nulls ordering by projecting
a column with the expression (col IS NULL) and adding it to the ordering. Since
we require that DISTINCT ON columns must be the prefix of any ordering columns,
we must account for the new ordering column when building DISTINCT ON. A
previous bug fix for #90763 caused the new column to be simply ignored when
building DISTINCT ON, but this was insufficient. We need to actually include
the new column among the DISTINCT ON columns. This commit makes that change.

Fixes #107839

Release note (bug fix): Fixed a spurious error "no data source matches prefix"
that could occur during planning for a query with DISTINCT ON and ORDER BY ASC
NULLS LAST or ORDER BY DESC NULLS FIRST.
blathers-crl bot pushed a commit that referenced this issue Aug 7, 2023
This commit fixes an error that could occur in the optbuilder when planning a
query with DISTINCT ON and a non-standard nulls ordering.

The optimizer supports queries with a non-standard nulls ordering by projecting
a column with the expression (col IS NULL) and adding it to the ordering. Since
we require that DISTINCT ON columns must be the prefix of any ordering columns,
we must account for the new ordering column when building DISTINCT ON. A
previous bug fix for #90763 caused the new column to be simply ignored when
building DISTINCT ON, but this was insufficient. We need to actually include
the new column among the DISTINCT ON columns. This commit makes that change.

Fixes #107839

Release note (bug fix): Fixed a spurious error "no data source matches prefix"
that could occur during planning for a query with DISTINCT ON and ORDER BY ASC
NULLS LAST or ORDER BY DESC NULLS FIRST.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tools-hasura C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. E-quick-win Likely to be a quick win for someone experienced. O-community Originated from the community T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants