EQL: Convert wildcards to LIKE in analyzer #51901
Pinging @elastic/es-search (:Search/EQL)
First pass.
The PR will be easier to complete after #51929.
private static class ReplaceWildcards extends AnalyzeRule<Filter> {

    private static boolean isWildcard(Expression expr) {
Can a wildcard appear in any type of string, e.g. some*glob?
I wonder whether the parser could detect it, so that instead of a Literal that might be a string, the wildcard could have its own expression rule.
        if (expr instanceof Literal) {
            Literal l = (Literal) expr;
            if (l.value() instanceof String) {
                String s = (String) l.value();
A potential improvement is to check whether an expression is foldable instead of being a literal.
Thus if string concatenation were to be added, the rule would still be applied:
if (e.foldable() && e.fold() instanceof String) {
return e.fold().toString().contains("*");
}
which can be transformed into a one-liner:
return e.foldable() && e.fold() instanceof String && e.fold().toString().contains("*");
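To make the suggestion concrete, here is a minimal, self-contained sketch of the foldable-based check. `Expr`, `literal`, `field` and `concat` are hypothetical stand-ins written for this example; they only mimic the `foldable()`/`fold()` contract and are not the actual ql classes. The sketch folds once and reuses the result instead of calling `fold()` twice.

```java
// Hypothetical stand-in for the ql Expression type: only foldable() and fold() are modeled.
interface Expr {
    boolean foldable();
    Object fold();
}

final class FoldableWildcardCheck {

    // A constant value: always foldable.
    static Expr literal(Object value) {
        return new Expr() {
            public boolean foldable() { return true; }
            public Object fold() { return value; }
        };
    }

    // A field reference: its value is unknown at plan time, so it cannot be folded.
    static Expr field(String name) {
        return new Expr() {
            public boolean foldable() { return false; }
            public Object fold() { throw new IllegalStateException("cannot fold field " + name); }
        };
    }

    // String concatenation: foldable only when both operands are foldable.
    static Expr concat(Expr left, Expr right) {
        return new Expr() {
            public boolean foldable() { return left.foldable() && right.foldable(); }
            public Object fold() { return String.valueOf(left.fold()) + String.valueOf(right.fold()); }
        };
    }

    // True when the expression folds to a String that contains '*'.
    static boolean isWildcard(Expr e) {
        if (!e.foldable()) {
            return false;
        }
        Object value = e.fold(); // fold once and reuse the result
        return value instanceof String && ((String) value).contains("*");
    }

    public static void main(String[] args) {
        System.out.println(isWildcard(literal("wild*card")));                     // true
        System.out.println(isWildcard(concat(literal("wild"), literal("*card")))); // true
        System.out.println(isWildcard(field("process_name")));                    // false
    }
}
```

With a literal-only check, the concatenation case would be missed; with the foldable check it is detected without any change to the rule.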
@Override
protected LogicalPlan rule(Filter filter) {
    return filter.transformExpressionsUp(e -> {
        // expr == "wildcard*phrase"
What about foo > "wild*card" or other value comparisons?
If that's valid grammar, the verifier should pick up the pattern and fail the query.
yeah, it is valid grammar, but since it's not == or !=, this will just be a lexicographical comparison
That would mean the operators are not consistent since == would expand the wildcard while > & co would compare against it...
Equals eq = (Equals) e;

if (isWildcard(eq.right())) {
    String wcString = (String) ((Literal) eq.right()).value();
isWildcard already does the checks, so simply do: eq.right().fold().toString()
x-pack/plugin/eql/src/main/java/org/elasticsearch/xpack/eql/analysis/Analyzer.java
Left a comment regarding tests.
Literal l = (Literal) expr;
if (l.value() instanceof String) {
    String s = (String) l.value();
    return s.contains("*");
Would it be possible to add more tests that also cover scenarios involving escape characters and all types of strings that EQL supports?
Can the * be escaped? If so, we should have a test covering this case.
yeah, I can add more tests. And there's no escape for *. But if you want to perform an exact (+/- case sensitivity) comparison, you can put the wildcard on the left (#51901 (comment)). Whether this functionality is good or not is a fair question, and I think it's fair to change this because I doubt any users are aware of the workaround.
field == "*wildcard*" <==> field LIKE "%wildcard%"
"*wildcard*" == field <==> field == '*wildcard*'
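As a rough illustration of the first mapping above, the wildcard-to-LIKE pattern conversion could look something like the sketch below. The class and method names are hypothetical, and escaping literal `%`, `_` and the escape character with a backslash is an assumption made for this example, not something the PR is confirmed to do.

```java
final class WildcardToLike {

    // Assumed escape character for the LIKE pattern (not confirmed by the PR).
    static final char ESCAPE = '\\';

    // Converts an EQL-style wildcard string into a LIKE pattern:
    // '*' becomes '%', and characters that are special in LIKE are escaped.
    static String toLikePattern(String wildcard) {
        StringBuilder sb = new StringBuilder(wildcard.length() + 4);
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                sb.append('%');
            } else if (c == '%' || c == '_' || c == ESCAPE) {
                sb.append(ESCAPE).append(c);
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toLikePattern("*wildcard*")); // %wildcard%
        System.out.println(toLikePattern("wild*card"));  // wild%card
    }
}
```

Without the escaping step, a value such as `100%*` would turn its literal `%` into a second wildcard, which is the kind of case the requested escape tests would need to pin down.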
To be honest, I find this more like a bug than a feature (one that implies a workaround).
A == B and B == A should be equivalent imo. Equality is not a predicate (like LIKE) where a certain element sits on the right and another one sits on the left and they are not interchangeable.
I think it's worth discussing the feature and then figure out the implementation for it.
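For discussion purposes, a commutative version of the rewrite might be sketched as follows. All types here (`Node`, `FieldRef`, `StrLiteral`, `EqualsNode`, `LikeNode`) are hypothetical stand-ins invented for this example, not the real ql tree classes, and the simple `*` to `%` replacement ignores escaping.

```java
// Hypothetical minimal expression tree, for illustration only.
interface Node {}

final class FieldRef implements Node {
    final String name;
    FieldRef(String name) { this.name = name; }
}

final class StrLiteral implements Node {
    final String value;
    StrLiteral(String value) { this.value = value; }
}

final class EqualsNode implements Node {
    final Node left;
    final Node right;
    EqualsNode(Node left, Node right) { this.left = left; this.right = right; }
}

final class LikeNode implements Node {
    final Node target;
    final String pattern;
    LikeNode(Node target, String pattern) { this.target = target; this.pattern = pattern; }
}

final class CommutativeWildcards {

    static boolean isWildcard(Node n) {
        return n instanceof StrLiteral && ((StrLiteral) n).value.contains("*");
    }

    // Rewrites `x == "wild*card"` and `"wild*card" == x` to the same LIKE node.
    static Node rewrite(Node n) {
        if (!(n instanceof EqualsNode)) {
            return n;
        }
        EqualsNode eq = (EqualsNode) n;
        Node left = eq.left;
        Node right = eq.right;
        // Normalize: move a wildcard literal found on the left over to the right.
        if (isWildcard(left) && !(right instanceof StrLiteral)) {
            Node tmp = left;
            left = right;
            right = tmp;
        }
        if (isWildcard(right) && !(left instanceof StrLiteral)) {
            return new LikeNode(left, ((StrLiteral) right).value.replace('*', '%'));
        }
        return n; // not a wildcard comparison: leave untouched
    }
}
```

With this normalization both operand orders produce the same LIKE node, so the "wildcard on the left" workaround for exact matching would no longer exist and an explicit escape would be needed instead.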
Agreed that wildcards should be made commutative. I don't think we should add a new construct like LIKE just yet. In the future, I think it could make sense, after we go through the usual deprecation steps. I think the main question I have is -- should this be a separate rule in the grammar, or should it be handled in the analyzer?
If the wildcard can be picked up by the parser through a dedicated rule, I would opt for that. @andrei how close are you to moving the optimizer rules in?
LGTM
@@ -33,6 +43,7 @@ public LogicalPlan optimize(LogicalPlan verified) {
new BooleanSimplification(),
new BooleanLiteralsOnTheRight(),
// needs to occur before BinaryComparison combinations
The comment seems incorrect, as it refers to PropagateEquals.
where should ReplaceWildcards() be moved to?
x-pack/plugin/eql/src/main/java/org/elasticsearch/xpack/eql/optimizer/Optimizer.java
elasticsearch-ci/2 is failing due to a checkstyle violation.
* EQL: Convert wildcard comparisons to Like
* EQL: Simplify wildcard handling, update tests
* EQL: Lint fixes for Optimizer.java
Addressing the comment thread from #51558 (comment).
Added ReplaceWildcards to the optimizer, which detects the == "wild*card*" or != "wild*card*" patterns and replaces them with LIKE.
This is branched from #51886, so only the last commit is relevant.
Update: Resolves #53104