ESQL: Visit stored fields once #101322
Comments
Pinging @elastic/es-ql (Team:QL)
Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)
The speedup we get from this varies a lot depending on how many stored fields you load. If you load none, it obviously won't do anything. If you load two, this'll make ESQL about twice as fast. Or, like, 1.8 times as fast. Visiting stored fields is, compared to most of what we do, super duper slow!
Some complexities here: our field loading infrastructure is capable of loading fields from several indices at once. It has a "fast path" that it can follow when we load from a single index in ascending order. It has a "slow path" that it uses when it loads from more than one index or when the fields aren't in order. The tricky bit is the slow path. I think the simplest way to deal with this may be to separate the code for loading on the slow path from the code for the fast path. Or, rather, I've tried not doing that and it is a snarly mess. I'll try the separation the next chance I get.
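For readers following along, the fast-path/slow-path split described in this comment can be sketched roughly as below. This is only an illustration, not the code in Elasticsearch: `LoadPathChooser`, `Path`, and `choose` are hypothetical names, and it assumes "in order" means the document ids within the single index never decrease.

```java
public class LoadPathChooser {
    /** Hypothetical labels for the two loading strategies discussed above. */
    enum Path { FAST, SLOW }

    /**
     * Picks a path for a batch of (index, doc) positions.
     * Assumption: FAST is only valid when every position comes from the same
     * index and the doc ids are non-decreasing; anything else falls back to SLOW.
     */
    static Path choose(int[] indexIds, int[] docIds) {
        if (indexIds.length != docIds.length) {
            throw new IllegalArgumentException("indexIds and docIds must have the same length");
        }
        for (int i = 1; i < docIds.length; i++) {
            if (indexIds[i] != indexIds[0]) {
                return Path.SLOW; // more than one index in the batch
            }
            if (docIds[i] < docIds[i - 1]) {
                return Path.SLOW; // docs are out of order
            }
        }
        return Path.FAST;
    }

    public static void main(String[] args) {
        System.out.println(choose(new int[] {0, 0, 0}, new int[] {1, 4, 9}));
        System.out.println(choose(new int[] {0, 1, 0}, new int[] {1, 4, 9}));
        System.out.println(choose(new int[] {0, 0, 0}, new int[] {4, 1, 9}));
    }
}
```

Keeping the decision in one small function like this is one way to let the two paths live in separate code, which is the separation proposed above.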
I've made a PR (#102192) that teaches the field loading infrastructure to load many fields at once, which we can use to bunch up our visits to stored fields. I'll need some help on the planner side to modify the plan to do the actual bunching.
Happy to help on the planner side. Let's take it offline.
This modifies ESQL to load a list of fields at one time, which is especially effective when loading from stored fields or _source because it allows visiting the stored fields only once. Part of elastic#101322
#102408 got it.
Description
ESQL loads stored fields like they are doc values. The interface assumes that it can load fields "column-wise". But stored fields are a "row-wise" store! So ESQL loads stored fields by visiting each row once per value. That's terribly slow because the "visiting" involves decompressing whole blocks of values with a dictionary.
We could make this so, so much faster if we loaded stored fields in their own operator. With their own interface we could load them row-wise. Like, visit each document one time! Just like the fetch phase. We could do similar things for synthetic _source too one day!
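To make the cost difference concrete, here is a toy sketch of column-wise versus row-wise loading from a row-oriented store. It is not ESQL or Lucene code: `RowStore`, `loadColumnWise`, `loadRowWise`, and `sampleStore` are made-up names, and the `visits` counter stands in for the expensive decompression that happens on each real visit.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StoredFieldVisits {
    /** Hypothetical stand-in for a row-wise store; counts how often a row is visited. */
    static class RowStore {
        private final List<Map<String, String>> rows = new ArrayList<>();
        int visits = 0;

        void add(Map<String, String> row) { rows.add(row); }
        int size() { return rows.size(); }

        /** Each call models one expensive read of a whole document. */
        Map<String, String> visit(int doc) {
            visits++;
            return rows.get(doc);
        }
    }

    /** Column-wise: one pass over all docs per field, so docs * fields visits. */
    static Map<String, List<String>> loadColumnWise(RowStore store, List<String> fields) {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        for (String field : fields) {
            List<String> column = new ArrayList<>();
            for (int doc = 0; doc < store.size(); doc++) {
                column.add(store.visit(doc).get(field));
            }
            columns.put(field, column);
        }
        return columns;
    }

    /** Row-wise: visit each doc once and pull every requested field from it. */
    static Map<String, List<String>> loadRowWise(RowStore store, List<String> fields) {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        for (String field : fields) {
            columns.put(field, new ArrayList<>());
        }
        for (int doc = 0; doc < store.size(); doc++) {
            Map<String, String> row = store.visit(doc);
            for (String field : fields) {
                columns.get(field).add(row.get(field));
            }
        }
        return columns;
    }

    /** Three made-up documents with two stored fields each. */
    static RowStore sampleStore() {
        RowStore store = new RowStore();
        store.add(Map.of("host", "a", "message", "m0"));
        store.add(Map.of("host", "b", "message", "m1"));
        store.add(Map.of("host", "c", "message", "m2"));
        return store;
    }

    public static void main(String[] args) {
        List<String> fields = List.of("host", "message");
        RowStore columnWise = sampleStore();
        RowStore rowWise = sampleStore();
        loadColumnWise(columnWise, fields);
        loadRowWise(rowWise, fields);
        System.out.println("column-wise visits: " + columnWise.visits);
        System.out.println("row-wise visits: " + rowWise.visits);
    }
}
```

Both loaders return the same columns; only the number of visits differs, and it grows with the number of fields in the column-wise case.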