Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ES|QL] Add CombineBinaryComparisons rule #110548

Merged
merged 13 commits into from
Jul 30, 2024

Conversation

fang-xing-esql
Copy link
Member

@fang-xing-esql fang-xing-esql commented Jul 5, 2024

Resolves #108525

This rule exists in sql, but not esql. After adding this rule the OR predicate in the test can be transformed into a boolean term. The sql rule is modified as ES|QL doesn't support range yet.

@elasticsearchmachine elasticsearchmachine added v8.16.0 Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Jul 5, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @fang-xing-esql, I've created a changelog YAML for you.

@fang-xing-esql fang-xing-esql requested a review from astefan July 6, 2024 02:04
Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe this change is enough.
This rule you are adding (from SQL) had tests in OptimizerRulesTests. We must add as many as possible back in ESQL.

@astefan
Copy link
Contributor

astefan commented Jul 9, 2024

Thinking about this some more, I think we didn't add this rule because of multi-values fields. EQL doesn't use this rule and EQL is multi-values aware. SQL, on the other hand, is single-value enforcer and uses it.

Please, look into the multi-values issue and see if this rule makes sense in this scenario.

@fang-xing-esql
Copy link
Member Author

Thinking about this some more, I think we didn't add this rule because of multi-values fields. EQL doesn't use this rule and EQL is multi-values aware. SQL, on the other hand, is single-value enforcer and uses it.

Please, look into the multi-values issue and see if this rule makes sense in this scenario.

I'll look into multi-valued fields and migrating more tests, there must be a reason that this rule was not added for ES|QL in the first place.

@fang-xing-esql
Copy link
Member Author

I don't believe this change is enough. This rule you are adding (from SQL) had tests in OptimizerRulesTests. We must add as many as possible back in ESQL.

Thanks for reviewing @astefan ! More tests(from OptimizerRulesTests) related to the CombineBinaryComparisons rule are added into esql in CombineBinaryComparisonsTests. The tests related to Range are not added, as Range is not supported in esql yet. The BinaryComparison operators the rule affects are ==, !=, >, >=, <, <=, they work with single value fields, they throw warnings if the fields are multi-valued. I did some tests on multi-valued fields, they seem work as expected.

@astefan astefan requested a review from bpintea July 18, 2024 11:28
Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be ok with multi-value fields, indeed. Thank you for checking.
I've left two more comments related to a better testing coverage.

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks
Please, wait also for @bpintea's approval before merging.

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why discard the Range formation? We're not having language support for smth. like BETWEEN, but I think Ranges could very well be used internally.

Right now an ESQL query like from employees | where salary > 30000 and salary < 50000
is translated into a DSL bool query with two distinctive range queries:

query[{"bool":{"must":[{"esql_single_value":{"field":"salary","next":{"range"
:{"salary":{"gt":30000,"boost":1.0}}},"source":"salary > 30000@5:38"}},{"esql_single_value":{"field":"salary","next":{"range":{"salary":{"lt":50000,"boost":1.0}}},"source":"salary  < 50000@5:57"}}],"boost":1.
0}}

We can optimise the filter output already and potentially enable other Range-based optimisation rules.

@fang-xing-esql
Copy link
Member Author

fang-xing-esql commented Jul 22, 2024

Why discard the Range formation? We're not having language support for smth. like BETWEEN, but I think Ranges could very well be used internally.

Right now an ESQL query like from employees | where salary > 30000 and salary < 50000 is translated into a DSL bool query with two distinctive range queries:

query[{"bool":{"must":[{"esql_single_value":{"field":"salary","next":{"range"
:{"salary":{"gt":30000,"boost":1.0}}},"source":"salary > 30000@5:38"}},{"esql_single_value":{"field":"salary","next":{"range":{"salary":{"lt":50000,"boost":1.0}}},"source":"salary  < 50000@5:57"}}],"boost":1.
0}}

We can optimise the filter output already and potentially enable other Range-based optimisation rules.

That's a good point! Although BETWEEN is not supported in ES|QL yet, Range can be useful for pushing down predicates in physical plan, I'll look more into it.

@fang-xing-esql
Copy link
Member Author

Thank you for reviewing @bpintea! As suggested, I created a separate PR #111255 to add both CombineBinaryComparisons and Range, it is a more complete solution, more complicated also :). I don't have a strong opinion to either including or not including Range, if we are going to support BETWEEN in ES|QL in the future, we will need Range at some point.

@elasticsearchmachine
Copy link
Collaborator

Hi @fang-xing-esql, I've updated the changelog YAML for you.

@fang-xing-esql
Copy link
Member Author

Thank you for reviewing @bpintea @astefan ! It was a long discussion, but hopefully it worths it :-). I'll merge this once CI is clear.

@fang-xing-esql fang-xing-esql merged commit ac94a6c into elastic:main Jul 30, 2024
15 checks passed
@fang-xing-esql fang-xing-esql deleted the CombineBinaryComparisons branch July 30, 2024 20:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.16.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CI] LogicalPlanOptimizerTests testSimplifyComparisonArithmeticWithDisjunction failing
4 participants