Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for wildcard_query function to the new engine (#156) #1108

Merged
merged 4 commits into from
Dec 7, 2022

Conversation

GumpacG
Copy link
Collaborator

@GumpacG GumpacG commented Nov 25, 2022

Description

Add support for wildcard_query function:

Add support for LIKE function:

  • LIKE function uses wildcard_query
  • LIKE in the legacy engine converts text fields to keyword
  • Added further testing for LIKE for different field types

Issues Resolved

#1032

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

* Implemented wildcard_query and added tests in core

Signed-off-by: Guian Gumpac <[email protected]>

* Implemented and added tests for sql

Signed-off-by: Guian Gumpac <[email protected]>

* Implemented and added tests for ppl

Signed-off-by: Guian Gumpac <[email protected]>

* Implemented and added tests for lucene

Signed-off-by: Guian Gumpac <[email protected]>

* Fixed test for like expression

Signed-off-by: Guian Gumpac <[email protected]>

* Added parameters to wildcard_query

Signed-off-by: Guian Gumpac <[email protected]>

* Added integration tests for ppl and sql

Signed-off-by: Guian Gumpac <[email protected]>

* Added docs for doctests

Signed-off-by: Guian Gumpac <[email protected]>

* Fixed issues introduced during merging

Signed-off-by: Guian Gumpac <[email protected]>

* Addressed PR comment

Signed-off-by: Guian Gumpac <[email protected]>

* Added annotation that was deleted from merging

Signed-off-by: Guian Gumpac <[email protected]>

* Fixed merge conflict issues

Signed-off-by: Guian Gumpac <[email protected]>

* Addressed some PR comments and handled escaping wildcards

Signed-off-by: Guian Gumpac <[email protected]>

* Added tests for wildcard conversion and created data for testing

Signed-off-by: Guian Gumpac <[email protected]>

* Added javadoc

Signed-off-by: Guian Gumpac <[email protected]>

* Changed index name

Signed-off-by: Guian Gumpac <[email protected]>

* Temporarily changed jackson_version to run GH actions

Signed-off-by: Guian Gumpac <[email protected]>

* Added comparison test for wildcard conversion

Signed-off-by: Guian Gumpac <[email protected]>

* Removed PPL implementation of wildcard_query

Signed-off-by: Guian Gumpac <[email protected]>

* Reverted ppl docs change

Signed-off-by: Guian Gumpac <[email protected]>

* Made namedArgument a static function

Signed-off-by: Guian Gumpac <[email protected]>

* Removed extra space

Signed-off-by: Guian Gumpac <[email protected]>

* Fixed LIKE query

Signed-off-by: Guian Gumpac <[email protected]>

* Fixed LIKE tests and added more tests

Signed-off-by: Guian Gumpac <[email protected]>

* Addressed PR comments

Signed-off-by: Guian Gumpac <[email protected]>

* Implemented converting text field to keyword. Still needs testing

Signed-off-by: Guian Gumpac <[email protected]>

* Added test cases for LIKE in sql and ppl

Signed-off-by: Guian Gumpac <[email protected]>

* Addressed PR comments regarding docs

Signed-off-by: Guian Gumpac <[email protected]>

* Fixed backslashes in docs

Signed-off-by: Guian Gumpac <[email protected]>

* Added missed backticks in docs

Signed-off-by: Guian Gumpac <[email protected]>

* Moved escaping wildcard test to common/utils

Signed-off-by: Guian Gumpac <[email protected]>

* Fixed checkstyle error

Signed-off-by: Guian Gumpac <[email protected]>

Signed-off-by: Guian Gumpac <[email protected]>
@GumpacG GumpacG requested a review from a team as a code owner November 25, 2022 19:12
@codecov-commenter
Copy link

codecov-commenter commented Nov 25, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.83%. Comparing base (354e843) to head (c8be912).
Report is 399 commits behind head on 2.x.

Additional details and impacted files
@@             Coverage Diff              @@
##                2.x    #1108      +/-   ##
============================================
- Coverage     98.31%   95.83%   -2.48%     
- Complexity     3485     3500      +15     
============================================
  Files           348      360      +12     
  Lines          8707     9407     +700     
  Branches        555      678     +123     
============================================
+ Hits           8560     9015     +455     
- Misses          142      334     +192     
- Partials          5       58      +53     
Flag Coverage Δ
query-workbench 62.76% <ø> (?)
sql-engine 98.31% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -57,7 +58,7 @@ public class FilterQueryBuilder extends ExpressionNodeVisitor<QueryBuilder, Obje
.put(BuiltinFunctionName.GREATER.getName(), new RangeQuery(Comparison.GT))
.put(BuiltinFunctionName.LTE.getName(), new RangeQuery(Comparison.LTE))
.put(BuiltinFunctionName.GTE.getName(), new RangeQuery(Comparison.GTE))
.put(BuiltinFunctionName.LIKE.getName(), new WildcardQuery())
.put(BuiltinFunctionName.LIKE.getName(), new LikeQuery())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use WildcardQuery?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently LIKE uses WildcardQuery.java in V2 while wildcard_query uses WildcardQuery.java in legacy. We want to classify wildcard_query as a relevance function however, since LIKE in V2 already uses WildcardQuery.java then it had to be separated as it uses ReferenceExpression instead of NamedArgumentExpression which is used in relevance functions. Making LIKE use wildcard_query results in a ClassCastException.

LIKE also converts text fields to keyword while wildcard_query in the legacy does not. Thus, LIKE using WildcardQuery as a relevance function would be a breaking change. If we want wildcard_query to convert text fields to keyword, then we would want to make it so for all relevance functions and can be a separate issue to enhance relevance functions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. In SQL engine, like is simplified version of wildcard_query, right? If so, like and wildcard_query should rewrite as OpenSearch wildcard_query. The differences are, (1) rewrite Like to wildcard query and use default parameter (2) rewrite wildcard_query to wildcard query based on user provided parameters.
  2. I think breaking change maybe make sense. Only field of keyword type could use LIKE operator. If it is multitype field, user need to explict specifiy it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaking change is tracked with #1143 and should be implemented in 3.x. This can be merged in 2.x to allow for legacy deprecation.

* @param text string to be converted
* @return converted string
*/
public static String convertSqlWildcardToLucene(String text) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why put it in common module instead of openseach module?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will move this. Thanks!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in a31c970

penghuo
penghuo previously approved these changes Dec 6, 2022
dai-chen
dai-chen previously approved these changes Dec 6, 2022
Signed-off-by: Guian Gumpac <[email protected]>
@GumpacG GumpacG dismissed stale reviews from dai-chen and penghuo via c8be912 December 7, 2022 03:51
@GumpacG GumpacG requested review from penghuo and dai-chen and removed request for penghuo and dai-chen December 7, 2022 16:21
@penghuo penghuo merged commit 2af7321 into opensearch-project:2.x Dec 7, 2022
@dai-chen dai-chen added the enhancement New feature or request label Dec 8, 2022
@GumpacG GumpacG deleted the integ-add-wildcardquery branch January 21, 2023 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants