Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify what query components cross_fields supports #68795

Merged
merged 3 commits into from
Feb 18, 2021

Conversation

jtibshirani
Copy link
Contributor

The parsing logic for the cross_fields mode was very general, making it hard
to tell what query components it actually supports. This PR clarifies that only
'bag of words' queries are supported, it does not accept phrase or prefix
queries. It also renames BlendedQueryBuilder -> CrossFieldsQueryBuilder for
clarity.

@jtibshirani
Copy link
Contributor Author

jtibshirani commented Feb 10, 2021

This refactor isn't high priority on its own, since we don't touch cross_fields much these days. But it helps prepare for adding a new combined-field query mode (proposed in #41106 (comment)). The new mode will follow a very similar structure to cross_fields and support the same components.

@@ -184,15 +205,7 @@ protected Query newTermQuery(Term term, float boost) {

@Override
protected Query newPrefixQuery(Term term) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple notes:

  • Although the cross_fields query doesn't allow phrases, the analyzePhrase method must be implemented to support multi-word synonyms with auto_generate_synonyms_phrase_query: true.
  • I think we can remove analyzeMultiPhrase. This method supports phrase queries that contain a term with synonyms, which should never occur with cross_fields. I opted not to remove it since the logic is tricky and I didn't want to risk a regression when we aim to deprecate cross_fields anyways.

@jtibshirani jtibshirani marked this pull request as ready for review February 10, 2021 01:23
Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice cleanup

@jtibshirani jtibshirani merged commit 164d991 into elastic:master Feb 18, 2021
@jtibshirani jtibshirani deleted the cross-fields-query branch February 18, 2021 18:20
@jtibshirani jtibshirani added :Search/Search Search-related issues that do not fall into other categories >refactoring v7.12.0 v8.0.0 labels Feb 24, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Feb 24, 2021
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

jtibshirani added a commit that referenced this pull request Aug 23, 2021
When backporting #68795, we introduced a regression around `match` queries on
field aliases. If the `match` query parses to a `synonym` query (because its
analyzer can produce multiple tokens at a single position), then we don't
resolve the field alias to its concrete field. This means the query will not
have any results.

We already had a test for this case but it was too narrow. Note this regression
only affects 7.x, there is no bug on master.
jtibshirani added a commit that referenced this pull request Aug 23, 2021
When backporting #68795, we introduced a regression around `match` queries on
field aliases. If the `match` query parses to a `synonym` query (because its
analyzer can produce multiple tokens at a single position), then we don't
resolve the field alias to its concrete field. This means the query will not
have any results.

We already had a test for this case but it was too narrow. Note this regression
only affects 7.x, there is no bug on master.
jtibshirani added a commit that referenced this pull request Aug 23, 2021
When backporting #68795, we introduced a regression around `match` queries on
field aliases. If the `match` query parses to a `synonym` query (because its
analyzer can produce multiple tokens at a single position), then we don't
resolve the field alias to its concrete field. This means the query will not
have any results.

We already had a test for this case but it was too narrow. Note this regression
only affects 7.x, there is no bug on master.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>refactoring :Search/Search Search-related issues that do not fall into other categories Team:Search Meta label for search team v7.13.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants