Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Support SELECT DISTINCT in new SQL engine #833

Conversation

dai-chen
Copy link
Member

@dai-chen dai-chen commented Nov 17, 2020

Issue #, if available: #582

Description of changes: Change SQL grammar and build AST for DISTINCT keyword. Currently distinct is treated as an equivalent and special form of GROUP BY clause. So there is no change required in core engine.

TODOs:

  1. Investigate how to support DISTINCT * in Elasticsearch (add note in doc)
  2. Migrate aggregate function with distinct to new engine as well, ex. COUNT(DISTINCT field).
  3. For the bug fix, will open a separate PR since I found CASE statement has problem with pushdown optimization.

Documentation: Add distinct * limitation and new support for distinct expression in https://github.com/dai-chen/sql/blob/support-distinct-in-new-engine/docs/user/dql/basics.rst#example-4-selecting-distinct-fields

Testing: Add UT and comparison test.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@dai-chen dai-chen self-assigned this Nov 17, 2020
@dai-chen dai-chen added the SQL label Nov 17, 2020
@dai-chen dai-chen changed the title Support distinct in new engine Support SELECT DISTINCT in new SQL engine Nov 17, 2020
@codecov
Copy link

codecov bot commented Nov 21, 2020

Codecov Report

Merging #833 (1b8a2f9) into develop (9f65c67) will increase coverage by 0.00%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             develop     #833   +/-   ##
==========================================
  Coverage      99.85%   99.85%           
  Complexity      2148     2148           
==========================================
  Files            216      216           
  Lines           4845     4850    +5     
  Branches         318      320    +2     
==========================================
+ Hits            4838     4843    +5     
  Misses             5        5           
  Partials           2        2           
Impacted Files Coverage Δ Complexity Δ
...icsearch/sql/sql/parser/AstAggregationBuilder.java 100.00% <100.00%> (ø) 15.00 <3.00> (ø)
...rch/sql/sql/parser/context/QuerySpecification.java 100.00% <100.00%> (ø) 15.00 <0.00> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9f65c67...1b8a2f9. Read the comment docs.

@dai-chen dai-chen marked this pull request as ready for review November 26, 2020 05:28
@dai-chen dai-chen requested review from chloe-zh and penghuo November 26, 2020 05:35
@dai-chen dai-chen merged commit 0cb3240 into opendistro-for-elasticsearch:develop Dec 8, 2020
@dai-chen dai-chen deleted the support-distinct-in-new-engine branch December 8, 2020 17:13
joshuali925 pushed a commit that referenced this pull request Dec 9, 2020
* Change grammar

* Add UT and AST builder

* Pass jacoco

* Pass jacoco

* Add comparison test

* Add doctest

* Change doc

* Prepare PR

* Add doc for distinct * limitation
penghuo pushed a commit that referenced this pull request Dec 15, 2020
* Change grammar

* Add UT and AST builder

* Pass jacoco

* Pass jacoco

* Add comparison test

* Add doctest

* Change doc

* Prepare PR

* Add doc for distinct * limitation
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants