Highlighters shouldn't error on big documents #52155
Pinging @elastic/es-search (:Search/Highlighting)
I am also wondering: what if the first N characters don't contain any query terms to highlight? Is it OK to present a user with empty highlights in this case?
For the Kibana use case, I imagine having the Discover page show some content (maybe with a "..." to indicate there's more) would be better than a blank space or an error.
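A first-N-characters fallback along these lines could be sketched as follows (a hypothetical helper for illustration only, not part of Elasticsearch or Kibana; the length 200 is an arbitrary choice):

```python
def fallback_snippet(text: str, n: int = 200) -> str:
    """Cheap fallback for unhighlightable fields: return the first n
    characters, appending an ellipsis when the text was truncated."""
    return text if len(text) <= n else text[:n] + "..."
```

A UI like Discover could render this snippet whenever highlighting is skipped, instead of showing a blank cell or surfacing the error.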
We discussed this today and had a proposal for how users can avoid this error without resorting to any of the painful workarounds listed above.
The size parameter specifies the maximum field size (where field size is the length in bytes of a value or, if an array, the sum of all element value lengths).
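That definition of field size could be sketched in a few lines (an illustrative helper, not Elasticsearch code; it assumes "length in bytes" means UTF-8 byte length):

```python
def field_size(value) -> int:
    """Field size per the definition above: the byte length of a value,
    or, for an array, the sum of all element value lengths."""
    if isinstance(value, list):
        return sum(field_size(v) for v in value)
    return len(str(value).encode("utf-8"))
```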
@markharwood: Apparently this error is also raised for fields that are not even indexed. Taking a look at this mapping:
It also generates errors as Kibana uses highlighting by default:
So, we have keywords with this problem. Just for your consideration for improving the highlighter: I don't really know whether the fix should be on the Kibana side (selecting better options when using the highlighter) or in the highlighter engine itself, to avoid failing search requests because of this.
@eedugon Thanks for letting us know; we will try to address your use case as well.
We discussed this today and came up with two additional proposals:
Add a query parameter `limit_to_max_analyzed_offset` to allow users to limit the highlighting of text fields to the value of `index.highlight.max_analyzed_offset`, thus preventing an exception from being thrown when the length of the text field exceeds the limit. The highlighting still takes place, but only up to the length set by the setting. Relates to: elastic#52155
Add a `max_analyzed_offset` query parameter to allow users to limit the highlighting of text fields to a value less than or equal to the `index.highlight.max_analyzed_offset`, thus avoiding an exception when the length of the text field exceeds the limit. The highlighting still takes place, but stops at the length defined by the new parameter. Closes: #52155
…69016) Add a `max_analyzed_offset` query parameter to allow users to limit the highlighting of text fields to a value less than or equal to the `index.highlight.max_analyzed_offset`, thus avoiding an exception when the length of the text field exceeds the limit. The highlighting still takes place, but stops at the length defined by the new parameter. Closes: #52155 (cherry picked from commit f9af60b)
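Under the parameter described in these commits, a search request could guard against oversized fields roughly like this (a sketch only; the field name `body`, the query text, and the offset value are illustrative, and the body is built but not sent to any cluster):

```python
# Hypothetical search body using the max_analyzed_offset highlight
# parameter from the commits above. Highlighting stops at this offset
# instead of failing when a field exceeds index.highlight.max_analyzed_offset.
search_body = {
    "query": {"match": {"body": "error"}},
    "highlight": {
        "fields": {"body": {}},
        "max_analyzed_offset": 999999,
    },
}
```

The value must be less than or equal to the index-level `index.highlight.max_analyzed_offset` setting for the guard to take effect.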
For the casual searcher, it is not particularly helpful to have Elasticsearch return an error if they are unlucky enough to match a big document.
The user gets a 400 error with this sort of message:
At this point the only workarounds the user has are:
a) User rewrites query with a NOT clause to exclude IDs of rogue docs (not ideal)
b) User reindexes content with offsets (a pain)
c) User reindexes content and truncates long strings e.g. with an ingest processor (not ideal)
d) User increases the max_analyzed_offset setting (not ideal)
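Workaround (d), for reference, amounts to a dynamic index settings update. A minimal sketch (the index name `my-index` and the value 2000000 are illustrative; this only builds the request payload and makes no cluster call):

```python
import json

# Raise the per-index analyzed-offset limit for highlighting.
settings_body = {"index": {"highlight": {"max_analyzed_offset": 2000000}}}
payload = json.dumps(settings_body)
# e.g. PUT /my-index/_settings with the payload above
```

Note this trades the error for more work per highlighted document, which is exactly why it is listed as "not ideal".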
None of these are great options, so the proposal is that highlighters should not throw an error and should instead use a cheaper "fallback" approach to highlighting, e.g. returning the first N characters of a large string field. The open questions are: