Use the Weight#matches mode for highlighting by default #96068

Merged: 21 commits merged on Aug 9, 2023

Contributor

@jimczi jimczi commented May 12, 2023

This PR adapts the unified highlighter to use the Weight#matches mode by default when possible. This has been the default mode in Lucene for some time now. For cases where the matches mode won't work (nested and parent-child queries), it is disabled automatically.
I didn't expose an option to explicitly disable this mode because it should be seen as an internal implementation detail. With this change, matches that span multiple terms are highlighted together (something users have requested for years) and clauses that don't match the document are ignored.
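
To illustrate what "highlighted together" means, here is a minimal sketch that renders the same text with per-term spans and with a single phrase span. It is not Elasticsearch or Lucene code: the class `HighlightSketch`, the record `Span`, the sample text, and the `<em>` tags are assumptions chosen for illustration only.

```java
import java.util.List;

// Illustration only: a toy renderer, not the unified highlighter itself.
final class HighlightSketch {
    // A match is a [start, end) character range in the source text.
    record Span(int start, int end) {}

    // Wraps each span in <em>...</em>. Spans must be sorted and non-overlapping.
    static String render(String text, List<Span> spans) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (Span s : spans) {
            out.append(text, pos, s.start());
            out.append("<em>").append(text, s.start(), s.end()).append("</em>");
            pos = s.end();
        }
        out.append(text, pos, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "the quick brown fox";
        // Term-by-term highlighting: each term of the phrase "quick brown" is its own span.
        System.out.println(render(text, List.of(new Span(4, 9), new Span(10, 15))));
        // Matches-style highlighting: the whole phrase match is one span.
        System.out.println(render(text, List.of(new Span(4, 15))));
    }
}
```

The first call wraps "quick" and "brown" separately, while the second wraps the phrase as one unit, which is the behavior change the description refers to.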

Note that this new mode is enabled only when require_field_match is true and the query doesn't contain:

  • A nested query.
  • A parent-join query.
  • A query that targets a runtime field.
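
The rule above can be sketched as a small predicate. This is a hypothetical restatement of the conditions listed in this description, not the code added by the PR: `MatchesModeRule`, `QueryFeature`, and `useWeightMatches` are made-up names, and how the query is inspected for these features is left out.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch: names are illustrative and do not come from the Elasticsearch codebase.
final class MatchesModeRule {
    // Query features that, per the description, prevent the matches mode from being used.
    enum QueryFeature { NESTED, PARENT_JOIN, RUNTIME_FIELD }

    // Returns true only when require_field_match is true and the query
    // contains none of the disqualifying features.
    static boolean useWeightMatches(boolean requireFieldMatch, Set<QueryFeature> found) {
        return requireFieldMatch && found.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(useWeightMatches(true, EnumSet.noneOf(QueryFeature.class)));
        System.out.println(useWeightMatches(true, EnumSet.of(QueryFeature.NESTED)));
        System.out.println(useWeightMatches(false, EnumSet.noneOf(QueryFeature.class)));
    }
}
```

When the predicate is false, the highlighter would fall back to its previous behavior automatically, so no user-facing option is needed.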

Closes #29561

@jimczi added the >enhancement, :Search Relevance/Highlighting, Team:Search, and v8.9.0 labels on May 12, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

Contributor

@romseygeek romseygeek left a comment

This is great! Do you think we need to worry about backwards compatibility, or can we just count this as an improvement?

@jimczi
Contributor Author

jimczi commented May 12, 2023

Do you think we need to worry about backwards compatibility, or can we just count this as an improvement?

That would be my preference yep, happy to discuss (and adapt) if we think it should be considered differently.

@jimczi
Contributor Author

jimczi commented May 12, 2023

@romseygeek I disabled the matches mode if a runtime field is queried or if require_field_match is set to false. In both scenarios we're unable to extract the offsets from the original weight. I don't see how we could handle these cases with the current implementation, but that shouldn't prevent this change imo.

@jimczi
Contributor Author

jimczi commented Aug 8, 2023

@romseygeek, as discussed, I added an undocumented index setting to disable the weight matches mode.
Can you take another look?

Contributor

@romseygeek romseygeek left a comment

LGTM! I left one comment and a question, but no need for another review.

@jimczi changed the title from "Use the Weight#matches API for highlighting by default" to "Use the Weight#matches mode for highlighting by default" on Aug 8, 2023
@jimczi jimczi merged commit 28a504d into elastic:main Aug 9, 2023
@jimczi jimczi deleted the unified_highlighter_matches branch August 9, 2023 01:44
jimczi added a commit to jimczi/elasticsearch that referenced this pull request Aug 9, 2023
jimczi added a commit that referenced this pull request Aug 9, 2023
jimczi added a commit to jimczi/elasticsearch that referenced this pull request Aug 9, 2023
@TristanMa

@jimczi I know the likelihood is low but is there any chance we can get this ER backported to 8.4? I have a customer who runs a multi-tenant cluster and cannot perform the upgrade to 8.10 in the near term and they have an escalated customer situation on their end.

I've set expectations that the chance of a backport is low but since you mentioned it as a possibility in an earlier comment just wanted to follow up.

@jimczi
Contributor Author

jimczi commented Feb 13, 2024

I know the likelihood is low but is there any chance we can get this ER backported to 8.4?

Unfortunately not, since 8.4.x no longer receives releases.

@TristanMa

@jimczi thanks for the quick response!

gustavolimav added a commit to gustavolimav/liferay-portal that referenced this pull request Feb 27, 2024
…asticsearch version

In Elasticsearch 8.10.2 and higher versions, highlight behaviour is different
more info here: elastic/elasticsearch#96068

https://liferay.atlassian.net/browse/LPD-2141
liferay-continuous-integration pushed a commit to liferay-continuous-integration/liferay-portal that referenced this pull request Mar 1, 2024
…asticsearch version

In Elasticsearch 8.10.2 and higher versions, highlight behaviour is different
more info here: elastic/elasticsearch#96068

https://liferay.atlassian.net/browse/LPD-2141
brianchandotcom pushed a commit to brianchandotcom/liferay-portal that referenced this pull request Mar 5, 2024
…asticsearch version

In Elasticsearch 8.10.2 and higher versions, highlight behaviour is different
more info here: elastic/elasticsearch#96068

https://liferay.atlassian.net/browse/LPD-2141
Labels: >enhancement, :Search Relevance/Highlighting, Team:Search, v8.10.0
Linked issue closed by this pull request: Highlighter breaks phrases
5 participants