-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL analyzer slow on many fields #104240
Comments
Pinging @elastic/es-analytics-geo (Team:Analytics) |
I have opened a quick fix in #104246. Feel free to take over this issue after that PR. |
We accidentally have O(N^3) when resolving the keep command. The first loop: https://github.com/elastic/elasticsearch/blob/8e3efae03df386dc80ab1a4ac3e620d16dc9828c/x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java#L466 The second loop: https://github.com/elastic/elasticsearch/blob/8e3efae03df386dc80ab1a4ac3e620d16dc9828c/x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java#L467 The third loop: https://github.com/elastic/elasticsearch/blob/8e3efae03df386dc80ab1a4ac3e620d16dc9828c/x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/analyzer/AnalyzerRules.java#L160 Resolving 2000 fields can take 35 seconds on a fast machine and more than 5 minutes on a slow machine. Although I think we should try to make this linear if possible, this quick fix only changes the resolution to O(N^2). This reduces the resolution time from 35s to 170ms (200 times faster) for 2000 fields. This is good enough to re-enable the HeapAttack tests. Relates #104240
I've made this O(N^2) from O(N^3) in #104246. I think we should make this linear (at least for such cases). |
@dnhatn thank you for spotting this one.
and the recorded times, can we do a bit more digging into this? For one, I am curious on what was the previous performance (before PR #103316) for the same amount of fields. Going from 35 seconds to 170 ms is amazing, but there was a third non-ideal solution (before #103316 which is a bug fix, but I don't remember the details of the bug) that we don't have any execution times for. |
ESQL Analyzer seems to be slow on analyzing many fields. It took more than 30 seconds to analyze 2000 fields in the
testHugeManyConcat
. In our CI this sometimes this sometimes takes more than 5 minutes.query.txt
The text was updated successfully, but these errors were encountered: