Remove the distinction between query and filter context in QueryBuilders #35354
When building a query, Lucene distinguishes two cases: queries that need to produce a score and queries that only need to match. We cloned this mechanism in the QueryBuilders so that they can produce different queries depending on whether a score is required. However, the only case in Elasticsearch that requires this distinction is the BoolQueryBuilder, which sets a different minimum_should_match when a `bool` query is built in a filter context. This behavior doesn't seem right because it makes the matching of `should` clauses different when the score is not required. Closes elastic#35293
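To make the difference concrete, here is a minimal, self-contained sketch of the two defaulting rules. It is not the Elasticsearch implementation: the class `MinimumShouldMatchSketch` and its methods are hypothetical names, clause counts stand in for real clauses, and the context-free rule assumes the documented default (1 when a `bool` query has only `should` clauses, 0 otherwise).

```java
// Hypothetical model of how the default minimum_should_match is resolved.
// None of these names exist in Elasticsearch; they only illustrate the rule.
final class MinimumShouldMatchSketch {

    /** Behavior before this change: a filter context forces 1 whenever should clauses exist. */
    static int legacyDefault(boolean filterContext, Integer explicit, int shouldClauses, int requiredClauses) {
        if (explicit != null) {
            return explicit; // an explicit minimum_should_match always wins
        }
        if (filterContext && shouldClauses > 0) {
            return 1; // mirrors the condition in the diff for doToQuery
        }
        return contextFreeDefault(null, shouldClauses, requiredClauses);
    }

    /** Behavior after this change: the same rule whether or not a score is needed. */
    static int contextFreeDefault(Integer explicit, int shouldClauses, int requiredClauses) {
        if (explicit != null) {
            return explicit;
        }
        // Assumed default: should clauses are mandatory only when nothing else is required.
        return (shouldClauses > 0 && requiredClauses == 0) ? 1 : 0;
    }

    /** A document matches if the required clauses match and enough should clauses match. */
    static boolean matches(boolean requiredClausesMatch, int matchedShould, int minimumShouldMatch) {
        return requiredClausesMatch && matchedShould >= minimumShouldMatch;
    }

    public static void main(String[] args) {
        // A bool query with one filter clause and one should clause, no explicit minimum_should_match.
        int legacy = legacyDefault(true, null, 1, 1);
        int contextFree = contextFreeDefault(null, 1, 1);

        // A document that satisfies the filter clause but none of the should clauses.
        System.out.println("legacy filter context matches: " + matches(true, 0, legacy));      // false
        System.out.println("context-free matches:          " + matches(true, 0, contextFree)); // true
    }
}
```

In this model the same document is rejected when the query sits in a filter context and accepted otherwise, which is the inconsistency the PR removes.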
Pinging @elastic/es-search-aggs
@@ -384,30 +384,14 @@ protected Query doToQuery(QueryShardContext context) throws IOException {
            return new MatchAllDocsQuery();
        }

        final String minimumShouldMatch;
        if (context.isFilter() && this.minimumShouldMatch == null && shouldClauses.size() > 0) {
            minimumShouldMatch = "1";
This is the part that confuses me. Why do we need to set `minimum_should_match` in this case? I cannot think of a single example where this fixes something. @martijnvg can you confirm?
Heya @jimczi, I was wondering: where is this change breaking?
I forgot to add the breaking change ;). Today we set
When put in a filter context the |
I wonder if we also need to modify or completely remove query_filter_context.asciidoc
https://github.com/elastic/elasticsearch/blob/master/docs/reference/query-dsl/query_filter_context.asciidoc
As I understand it, with your PR we will no longer have query and filter contexts. Also, what this document says about caching doesn't seem to be right: as I understand it, we don't cache based on query and filter context, but based on the query type.
must match a document for it to match the `bool` query. This behavior may be
explicitly controlled by settings the
<<query-dsl-minimum-should-match,`minimum_should_match`>> parameter.
|`should` |The clause (query) should appear in the matching document.
Should we also add "optionally" here, as in "should optionally appear"?
I wonder if we also need to modify or completely remove query_filter_context.asciidoc https://github.com/elastic/elasticsearch/blob/master/docs/reference/query-dsl/query_filter_context.asciidoc
It's still relevant. The explanation implies that the query is sorted by relevancy, so the overall score of the query is taken into account. In this case the differentiation between a `filter` and a `should` clause is important. The same goes for the `constant_score` query.
I think the main question here is whether we want to make this breaking change in 7 or not.
In that case we'd also need to add deprecation warnings in 6 for queries with a mix of should and required clauses in a filter context. If we feel that it's too late for 7, we could only deprecate now with the goal of removing it in 8.
+1, this distinction was mostly useful for the `terms` query initially, which was parsed as a BooleanQuery when scoring was enabled and as a TermsQuery in a filter context. We should remove it now. I agree that the current behavior on `bool` queries is quite confusing.
Thanks @jpountz, we discussed this some time ago and I forgot to update the PR. We've agreed that this change should be done in ES 7 and that we'll deprecate the behavior in 6.x. I'll update the PR with the missing breaking change and prepare the deprecation.
This change deprecates Elasticsearch's filter context and adds a deprecation warning to bool queries that automatically set their minimum should match to 1. Relates #35354
With this change, is this the shortest way to perform an unscored "OR"?
|
@matthuhiggins Yes, this is a proper way of performing "unscored disjunction (OR)". It's not the shortest way though as |
Got it, thanks. The release notes and deprecation warning lead one to think that "minimum_should_match" always defaults to 0, even when there are only "should" clauses.