Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Support select fields and alias in new query engine #636

Conversation

dai-chen
Copy link
Member

@dai-chen dai-chen commented Jul 29, 2020

Issue #, if available:

  1. Support field name with colon #306: Support field name with colon
  2. Special alias name cannot pass the old Druid and new ANTLR parser #209: Special alias name starting with number
  3. [New Engine] Function names in schema should keep consistent with input in letter cases #606: Preserve original name in query

Description of changes:

  1. New expression Alias (unresolved) and NamedExpression correspondent is added: Both wrap an ordinary expression with original text in query and its alias preserved.
  2. Unresolved, logical and physical project are changed accordingly: After changed, logical and physical project always expects named expression. Unresolved project operator accepts either Alias or not because it is shared by PPL which has different project (include/exclude).
  3. A data import bug is fixed for comparison test: Previously test data were loaded from CSV file and each field is added to bulk request with double quotes enclosed. However, Elasticsearch doesn't complain even if, for example age is integer declared in mapping, "30" is still valid though "30" is returned in response DSL rather than 30. Now when bulk request generated, test data mapping is used to remove double quotes for certain data types.

TODO items:

  1. Qualified name is not supported: For example, SELECT index.field, index_alias.field, object/nested.inner_field, text.keyword. In this case, syntax exception is thrown in ExpressionAnalyzer to fall back to old engine. => Will send a separate PR for index alias support.
  2. Object/nested field is not supported: Similarly, SELECT object, nested_field is not supported, in this case syntax exception is thrown in ExpressionAnalyzer too.
  3. Two integration tests are disabled: QueryIT.backticksQuotedIndexNameTest and PrettyFormatResponseIT.fieldsWithAlias are disabled temporarily, because new engine response is slightly different from the old, ex. no alias field in schema, string type is returned instead of text etc. => Will enable all after a new JDBC formatter is added to new engine.

Testing:

  1. Add new UT for Alias and named expression.
  2. Comparison test.
  3. New IT for bug fixes.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@dai-chen dai-chen added enhancement New feature or request SQL labels Jul 29, 2020
@dai-chen dai-chen self-assigned this Jul 29, 2020
@dai-chen dai-chen marked this pull request as ready for review July 30, 2020 20:10
@dai-chen dai-chen requested review from chloe-zh and penghuo July 30, 2020 20:10
Copy link
Contributor

@penghuo penghuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Member

@chloe-zh chloe-zh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the changes!

@dai-chen dai-chen merged commit dee4aec into opendistro-for-elasticsearch:develop Jul 31, 2020
@dai-chen dai-chen deleted the support-select-specific-fields-merge branch July 31, 2020 19:38
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request SQL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants