Highlighting Error with span_field_masking Requires Indexing Offsets Unexpectedly #101804
Pinging @elastic/es-search (Team:Search) |
This is due to But, to get around this bug,
I still need to dig into the correct fix here. |
error-trace:
|
@benwtrent Thank you for your suggestion. When I run your suggested command, the error no longer occurs. However, the generated highlight doesn't match my expected output. With your command:
I was expecting the highlight to look like this:
Is there a way to achieve this expected result while avoiding the error? |
@ahoogol turn on offsets for the fields and use |
Thank you for your suggestion, @benwtrent. Yes, it highlights correctly when offsets are enabled. However, I'm still concerned about the increase in index size. I'm exploring alternative approaches that produce the desired highlight without turning on offsets, so the index size stays manageable. Any further insights or suggestions would be greatly appreciated. |
@ahoogol If you use "highlight": {
"require_field_match" : false,
"pre_tags": "(",
"post_tags": ")",
"fields": {
"*": {}
},
"type": "unified"
} It breaks because internally we check that the field we highlight on ("text") is the same as the field that has matches ("stem"), but in this case they are different. That's the failure. |
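A minimal sketch of how the full request body might look with the highlight settings above, written as a Python dict so it can be serialized and sent with any client. The field names "text" and "stem" come from this thread; the `span_near` wrapper, the `slop` value, and the search terms are assumptions for illustration only, since the original reproduction query is not shown here.

```python
import json


def build_search_body(term_on_text: str, term_on_stem: str) -> dict:
    """Build a search body that pairs span_field_masking with a highlighter
    that does not require the highlighted field to match the queried field."""
    return {
        "query": {
            "span_near": {
                "clauses": [
                    # Plain span query on the field we want highlighted.
                    {"span_term": {"text": term_on_text}},
                    # Span query on "stem", masked so it appears to target "text".
                    {
                        "span_field_masking": {
                            "query": {"span_term": {"stem": term_on_stem}},
                            "field": "text",
                        }
                    },
                ],
                "slop": 5,          # assumed value, not from the issue
                "in_order": False,  # assumed value, not from the issue
            }
        },
        # Highlight settings copied from the comment above.
        "highlight": {
            "require_field_match": False,
            "pre_tags": "(",
            "post_tags": ")",
            "fields": {"*": {}},
            "type": "unified",
        },
    }


if __name__ == "__main__":
    # Hypothetical terms; replace with the terms from your own query.
    print(json.dumps(build_search_body("quick", "fox"), indent=2))
```

With `require_field_match` set to false, the highlighter no longer insists that the field with matches is the field being highlighted, which is the check described above as the source of the failure.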
I will add this to documentation for |
Improvement includes: 1. Remove reference to Lucene queries (this information is not necessary for Elastic users, and can be outdated) 2. For `span_field_masking` include a note to use "require_field_match" : false parameter for highlighters to work. Closes elastic#101804
@mayya-sharipova I included Your suggestion output: (i tested it in 8.10.0 and 8.11.3)
Expected output:
|
@ahoogol Indeed you are right about the expected behaviour, but it is not supported on The highlighting behaviour that you expect is based on Matches and was added from 8.10. But it relies on the fact that the highlighted field contains query terms, which is not your case. I have added a documentation clarifying that I also modified the type of this issue as a "feature", that we may tackle sometime in the future. |
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
Elasticsearch Version
8.10.4
Installed Plugins
No response
Java Version
bundled
OS Version
Elastic Cloud - GCP - Iowa (us-central1)
Problem Description
I encountered an issue when using the span_field_masking feature in Elasticsearch. When attempting to use the highlighter with this feature, the following error is thrown:
If I set "index_options": "offsets" in the mapping of the masked field 'stem', highlighting works as expected. However, I'm puzzled as to why the highlighter requires indexing offsets. I'd like to understand why the highlighter doesn't re-analyze the text to calculate offsets dynamically. My concern is that indexing offsets increases the index size, which I want to avoid.
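For reference, the mapping change described above can be sketched as a Python dict. This is only an illustration: the actual mapping from the reproduction steps is not shown here, so the top-level layout of the "text" and "stem" fields is an assumption, and any analyzer settings are left out.

```python
import json


def build_mapping(enable_offsets: bool) -> dict:
    """Build an index mapping with a 'text' field and a 'stem' field.

    When enable_offsets is True, the 'stem' field stores offsets in the
    postings, which avoids the highlighting error at the cost of index size.
    """
    stem_field = {"type": "text"}  # analyzer omitted; not known from the issue
    if enable_offsets:
        stem_field["index_options"] = "offsets"
    return {
        "mappings": {
            "properties": {
                "text": {"type": "text"},
                "stem": stem_field,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_mapping(enable_offsets=True), indent=2))
```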
Steps to Reproduce
Expected result
I was expecting the highlight to look like this: