-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Distinguish between simple matches with and without the terms index #63945
Distinguish between simple matches with and without the terms index #63945
Conversation
Pinging @elastic/es-search (:Search/Mapping) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! I stumbled into that oddity when working up tests for SigText after the VS refactoring, but didn't really have an idea how to fix it cleanly at the time :)
Probably best to wait on Mark's review since he knows the agg better, but 👍 from me
Do we need to add a note to the breaking-changes doc page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a couple of minor comments/questions, LGTM otherwise
} else { | ||
r = client().prepareSearch("cache_test_idx").setSize(0).addAggregation(significantTerms("foo").field("s")).get(); | ||
} | ||
r = client().prepareSearch("cache_test_idx").setSize(0).addAggregation(significantTerms("foo").field("s")).get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we lose coverage with these changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, you're right, I had thought this was just testing a useless agg but actually I should change it instead to also have some text and to run the sigText agg over the text field instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
@@ -42,7 +42,7 @@ | |||
public static final VersionFieldType INSTANCE = new VersionFieldType(); | |||
|
|||
private VersionFieldType() { | |||
super(NAME, false, false, true, TextSearchInfo.SIMPLE_MATCH_ONLY, Collections.emptyMap()); | |||
super(NAME, false, false, true, TextSearchInfo.NONE, Collections.emptyMap()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to have slipped when I double checked all the mappers, maybe it does not have a test so we don't enforce its consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there's no test for VersionFieldMapper - I'll add one as a follow up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you
if (ft.getTextSearchInfo() == TextSearchInfo.NONE) { | ||
return false; | ||
} | ||
return ft.getTextSearchInfo() != TextSearchInfo.SIMPLE_MATCH_WITHOUT_TERMS; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it be return ft.getTextSearchInfo != NONE && gt.getTextSearchIngo != SIMPLE_MATCH_WITHOUT_TERMS ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
…ndexes longs" This reverts commit 807bae1.
We do, but I think I can only do that on backport as we don't have per-7.x-version changes in master? |
Looks good but I'm not sure about the Number fields support a
I recall they actually do a range query behind the scenes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a test that checks annotated_text is supported by significant_text?
Doh, yes, ignore me. :) |
I'm less worried about this, because it's an internal name and I think the javadoc is clear enough?
There isn't, I'll see if I can add one. |
Adding aggs tests to annotated text turns out to not be trivial. In the interest of not blocking this PR, I've opened a separate issue to deal with that. |
@elasticmachine run elasticsearch-ci/packaging-sample-unix (gradle download failure) |
…63945) We currently use TextSearchInfo to let query parsers know when a field will support match queries. Some field types (numeric, constant, range) can produce simple match queries that don't use the terms index, and it is useful to distinguish between these fields on the one hand and keyword/text-type fields on the other. In particular, the SignificantTextAggregation only works on fields that have indexed terms, but there is currently no efficient way to see this at search time and so the factory falls back on checking to see if an index analyzer has been defined, with the result that some nonsensical field types are permitted. This commit adds a new static TextSearchInfo implementation called SIMPLE_MATCH_WITHOUT_TERMS that can be returned by field types with no corresponding terms index. It changes significant text to check for this rather than for the presence of an index analyzer. This is a breaking change, in that the significant text agg will now throw an error up-front if you try and apply it to a numeric field, whereas before you would get an empty result.
We currently use TextSearchInfo to let query parsers know when a field
will support match queries. Some field types (numeric, constant, range)
can produce simple match queries that don't use the terms index, and
it is useful to distinguish between these fields on the one hand and
keyword/text-type fields on the other. In particular, the
SignificantTextAggregation only works on fields that have indexed terms,
but there is currently no efficient way to see this at search time and so
the factory falls back on checking to see if an index analyzer has been
defined, with the result that some nonsensical field types are permitted.
This commit adds a new static TextSearchInfo implementation called
SIMPLE_MATCH_WITHOUT_TERMS
that can be returned by fieldtypes with no corresponding terms index. It changes significant text
to check for this rather than for the presence of an index analyzer.
This is a breaking change, in that the significant text agg will now throw
an error up-front if you try and apply it to a numeric field, whereas before
you would get an empty result.