This repository has been archived by the owner on Aug 2, 2022. It is now read-only.
Lucene query pushdown optimization #671
Merged: dai-chen merged 15 commits into opendistro-for-elasticsearch:develop from dai-chen:lucene-pushdown-optimization on Aug 17, 2020
dai-chen force-pushed the lucene-pushdown-optimization branch from da47f8b to d478810 on August 11, 2020 16:00
penghuo
reviewed
Aug 15, 2020
...zon/opendistroforelasticsearch/sql/elasticsearch/storage/script/filter/lucene/TermQuery.java
penghuo
approved these changes
Aug 15, 2020
Thanks for the change!
chloe-zh
approved these changes
Aug 17, 2020
LGTM, thanks!
penghuo pushed a commit to penghuo/sql that referenced this pull request on Aug 21, 2020
* Add lucene builder interface and term query impl
* Add lucene query interface and term query impl
* Add range query impl
* Add more UT for range query
* Add wildcard query impl
* Add exists query impl
* Pass jacoco test
* Prepare PR
* Only push down filter close to relation
* Prepare PR
* Add limitation doc
* Add limitation doc
* Add limitation doc
Issue #, if available:
Description of changes: The major changes in this PR are the newly added `LuceneQuery` abstraction and its subclasses, which implement different Lucene query APIs. To clarify, Lucene query here means a Lucene query issued via the Elasticsearch DSL. We don't bypass the DSL and call Lucene directly.

Problem Statement: In PR #663, we registered our expression as a new script language and optimized expression evaluation by pushing it down to a script query in the Elasticsearch DSL. However, what was not yet covered is how we can leverage the Lucene API for expressions that can be optimized. For certain expressions, we want to optimize the filtering expression further by pushing it down to a Lucene query, fully or partially.
Solution: The core logic is in the `FilterQueryBuilder.visitFunction` method, which is executed in a top-down way:

- `AND`, `OR`, `NOT`: Translate to a bool query and visit the left/right side recursively.

Here is an example to help understand:

- `AND` is translated to a bool filter query directly (case #1).
- `name = 'John'` is eligible because `=` can be translated to a Lucene term query, and its left side (first argument) is a reference and its right side is a literal (case #2).
- `ABS(age) = 30` is translated to a script query (case #3).

Testing: New UT and PPL IT pass. Since we don't have an explain API yet, SQL IT and doctest will be added later.
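The three cases in the Solution section can be illustrated with a minimal, self-contained sketch. It is not the plugin's code: `PushdownSketch`, `Expr`, `Ref`, `Literal`, `Func` and `sampleFilter` are hypothetical names, the DSL is built as a plain JSON string instead of with Elasticsearch query builders, and the example filter is assumed to be `name = 'John' AND ABS(age) = 30`. Mapping `OR` to `should` and `NOT` to `must_not` is also an assumption, since the description only states that `AND` becomes a bool filter. The script query omits the custom script language from PR #663.

```java
import java.util.Arrays;
import java.util.List;

public class PushdownSketch {

    /** Hypothetical stand-in for the plugin's expression type. */
    interface Expr {
        String text(); // source-like rendering, used as the script body
    }

    static final class Ref implements Expr {
        final String name;
        Ref(String name) { this.name = name; }
        public String text() { return name; }
    }

    static final class Literal implements Expr {
        final Object value;
        Literal(Object value) { this.value = value; }
        public String text() {
            return value instanceof String ? "'" + value + "'" : String.valueOf(value);
        }
        String json() { // JSON escaping is omitted to keep the sketch short
            return value instanceof String ? "\"" + value + "\"" : String.valueOf(value);
        }
    }

    static final class Func implements Expr {
        final String name;
        final List<Expr> args;
        Func(String name, Expr... args) {
            this.name = name;
            this.args = Arrays.asList(args);
        }
        public String text() {
            boolean infix = args.size() == 2 && Arrays.asList("=", "AND", "OR").contains(name);
            if (infix) {
                return args.get(0).text() + " " + name + " " + args.get(1).text();
            }
            StringBuilder sb = new StringBuilder(name).append("(");
            for (int i = 0; i < args.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(args.get(i).text());
            }
            return sb.append(")").toString();
        }
    }

    /** Top-down translation of a filter expression into a DSL string. */
    static String translate(Expr expr) {
        if (expr instanceof Func) {
            Func f = (Func) expr;
            switch (f.name) {
                case "AND": return bool("filter", f);   // case #1
                case "OR":  return bool("should", f);   // assumed clause
                case "NOT": return bool("must_not", f); // assumed clause
                case "=":
                    // case #2: only "reference = literal" is eligible for a term query
                    if (f.args.get(0) instanceof Ref && f.args.get(1) instanceof Literal) {
                        Ref ref = (Ref) f.args.get(0);
                        Literal lit = (Literal) f.args.get(1);
                        return "{\"term\":{\"" + ref.name + "\":{\"value\":" + lit.json() + "}}}";
                    }
                    break;
                default:
                    break;
            }
        }
        // case #3: everything else falls back to a script query
        return "{\"script\":{\"script\":{\"source\":\"" + expr.text() + "\"}}}";
    }

    /** Wraps the recursively translated arguments in one bool clause. */
    private static String bool(String clause, Func f) {
        StringBuilder sb = new StringBuilder("{\"bool\":{\"" + clause + "\":[");
        for (int i = 0; i < f.args.size(); i++) {
            if (i > 0) sb.append(",");
            sb.append(translate(f.args.get(i)));
        }
        return sb.append("]}}").toString();
    }

    /** The assumed example filter: name = 'John' AND ABS(age) = 30. */
    static Expr sampleFilter() {
        return new Func("AND",
            new Func("=", new Ref("name"), new Literal("John")),
            new Func("=", new Func("ABS", new Ref("age")), new Literal(30)));
    }

    public static void main(String[] args) {
        System.out.println(translate(sampleFilter()));
    }
}
```

In this sketch the left branch of the `AND` becomes a term query while the right branch stays a script query, which is what pushing down "partially" means in the Problem Statement.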
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.