Improve block loader fallback to source when source mode is synthetic. #115394
Comments
Pinging @elastic/es-analytical-engine (Team:Analytics)
Pinging @elastic/es-storage-engine (Team:StorageEngine)
This POC (#114886) shows how falling back to ignore source can work.
Does this also cover cases where source filtering is used? In other words, when you only need to retrieve a specific field from _source, can we avoid synthesizing the full _source, which means fetching all fields?
This issue is in the context of synthetic source, but the idea is to avoid synthesizing the full source when only a subset of fields is required. This isn't the case today.
My comment was also in the context of synthetic source. Basically I was asking whether we could also add an optimization to not reconstruct the full _source when source filtering is used, and instead fetch only the fields that are required per the source filtering configuration. So if a doc has 100 fields, but the search request contains
I am currently focused on #114618 and was planning to resume and finish work for #113827 after that. @felixbarny are you interested in this change for the get or the search API? I am asking because get is much simpler to achieve than search, which is why we started there.
I'm interested in _search. It's not something urgent. The question came up in the context of refactoring the APM UI to use fields. But some places still use _source with source filtering. Before the refactoring, there was a lot of usage of _source with filtering in the context of search. So I suspected that there may be other places that use _search with filtering and that would see a performance regression when using synthetic _source.
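To make the source filtering idea from the comments above concrete, here is a minimal sketch in plain Java. It is not Elasticsearch code: the class `FilteredSyntheticSourceSketch`, the method `synthesize`, and the per-field `Supplier` loaders are all hypothetical names chosen for illustration. It only shows the general shape of the optimization, assuming each field's value can be loaded independently: load the fields named by the filter and skip the rest.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch, not an Elasticsearch API.
public class FilteredSyntheticSourceSketch {

    // Builds a source map from per-field loaders, invoking only the loaders
    // whose field name is in `includes`. An empty `includes` means "no filter".
    static Map<String, Object> synthesize(Map<String, Supplier<Object>> loaders,
                                          Set<String> includes) {
        Map<String, Object> source = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<Object>> entry : loaders.entrySet()) {
            if (includes.isEmpty() || includes.contains(entry.getKey())) {
                // Only here do we pay the cost of loading the field's value.
                source.put(entry.getKey(), entry.getValue().get());
            }
        }
        return source;
    }

    public static void main(String[] args) {
        // A document with 100 fields, each backed by its own (fake) loader.
        Map<String, Supplier<Object>> loaders = new LinkedHashMap<>();
        for (int i = 0; i < 100; i++) {
            final String name = "field_" + i;
            loaders.put(name, () -> "value_of_" + name);
        }
        // Only one loader runs; the other 99 fields are never read.
        System.out.println(synthesize(loaders, Set.of("field_7")));
    }
}
```

In this sketch the cost of a filtered request scales with the number of requested fields rather than with the number of fields in the document, which is the property asked about in the thread.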
Sometimes MappedFieldType#blockLoader(...) implementations fall back to an implementation that uses source, for example when a field has doc values or stored fields disabled, or when ignore above has been configured. In that case the block loader reads the _source field, extracts the relevant field from it, and returns that as the value.

When synthetic source is enabled, the source is instead computed from many doc values or stored fields, and only then is the relevant field extracted. This is very slow and should be improved. The interesting part with synthetic source is that we don't need to compute the source in order to provide fallback values as part of the block loaders returned by MappedFieldType#blockLoader(...).

Synthetic source details relevant to block loader fallback logic:
_original
._ignored_source
stored field.

In case of synthetic source, the block loaders returned by MappedFieldType#blockLoader(...) can be made aware of these details and, instead of returning a BlockSourceReader based implementation, return an implementation that uses the right stored field or uses ignored source.

Tasks:
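As a rough illustration of the proposal in this issue, the fallback choice could be modeled as below. This is a sketch in plain Java, not the actual Elasticsearch implementation: the class `BlockLoaderFallbackSketch`, the `Strategy` enum, and the three boolean inputs are hypothetical simplifications. It assumes that a field either can be served natively (doc values or a stored field) or needs a fallback, and that under synthetic source the fallback value lives either in a dedicated stored field or in ignored source.

```java
// Hypothetical sketch, not an Elasticsearch API.
public class BlockLoaderFallbackSketch {

    enum Strategy {
        NATIVE,                // doc values or the field's own stored field
        SOURCE_READER,         // read stored _source and extract the field
        ORIGINAL_STORED_FIELD, // read the dedicated stored field directly
        IGNORED_SOURCE         // read the value from ignored source
    }

    // needsFallback:          the field cannot be served natively
    // syntheticSource:        the index uses synthetic source
    // hasOriginalStoredField: a dedicated stored field holds the value
    static Strategy choose(boolean needsFallback,
                           boolean syntheticSource,
                           boolean hasOriginalStoredField) {
        if (!needsFallback) {
            return Strategy.NATIVE;
        }
        if (!syntheticSource) {
            // With stored source, extracting from _source is the fallback.
            return Strategy.SOURCE_READER;
        }
        // With synthetic source, skip rebuilding _source and go straight
        // to wherever the fallback value is actually kept.
        return hasOriginalStoredField
            ? Strategy.ORIGINAL_STORED_FIELD
            : Strategy.IGNORED_SOURCE;
    }

    public static void main(String[] args) {
        System.out.println(choose(true, false, false));
        System.out.println(choose(true, true, true));
        System.out.println(choose(true, true, false));
    }
}
```

The point of the sketch is the third branch: under synthetic source, no path goes through a source-reader strategy, so the full source never has to be computed just to serve one field.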