Disable optimization if we aren't sure it's faster #74260

Merged
rest-api-spec/build.gradle (2 additions, 0 deletions)
@@ -100,7 +100,9 @@ tasks.named("yamlRestCompatTest").configure {
'search.aggregation/10_histogram/Deprecated _time order',
'search.aggregation/200_top_hits_metric/top_hits aggregation with sequence numbers',
'search.aggregation/20_terms/Deprecated _term order',
+ 'search.aggregation/20_terms/*profiler*', // The profiler results aren't backwards compatible.
'search.aggregation/51_filter_with_types/Filter aggs with terms lookup and ensure it\'s cached',
+ 'search.aggregation/370_doc_count_field/Test filters agg with doc_count', // Uses profiler for assertions which is not backwards compatible
'mtermvectors/11_basic_with_types/Basic tests for multi termvector get',
'mtermvectors/21_deprecated_with_types/Deprecated camel case and _ parameters should fail in Term Vectors query',
'mtermvectors/30_mix_typeless_typeful/mtermvectors without types on an index that has types',
@@ -756,8 +756,8 @@ setup:
---
"string profiler via global ordinals filters implementation":
- skip:
-      version: " - 7.12.99"
-      reason: filters implementation first supported with sub-aggregators in 7.13.0
+      version: " - 7.99.99"
+      reason: profile info changed in 8.0.0 to be backported to 7.14.0
- do:
bulk:
index: test_1
@@ -797,7 +797,7 @@ setup:
- match: { profile.shards.0.aggregations.0.type: StringTermsAggregatorFromFilters }
- match: { profile.shards.0.aggregations.0.description: str_terms }
- match: { profile.shards.0.aggregations.0.breakdown.collect_count: 0 }
- - match: { profile.shards.0.aggregations.0.debug.delegate: FiltersAggregator.FilterByFilter }
+ - match: { profile.shards.0.aggregations.0.debug.delegate: FilterByFilterAggregator }
- match: { profile.shards.0.aggregations.0.debug.delegate_debug.filters.0.query: "str:cow" }
- match: { profile.shards.0.aggregations.0.debug.delegate_debug.filters.1.query: "str:pig" }
- match: { profile.shards.0.aggregations.0.debug.delegate_debug.filters.2.query: "str:sheep" }
@@ -150,9 +150,10 @@ setup:
---
"Test filters agg with doc_count":
- skip:
-      version: " - 7.12.99"
+      version: " - 7.99.99"
+      reason: profile info changed in 8.0.0 to be backported to 7.14.0
      features: default_shards
-      reason: "name changed in 7.13"

- do:
search:
body:
@@ -177,7 +178,7 @@ setup:
- match: { aggregations.f.buckets.abc.doc_count: 11 }
- match: { aggregations.f.buckets.foo.doc_count: 8 }
- match: { aggregations.f.buckets.xyz.doc_count: 5 }
- - match: { profile.shards.0.aggregations.0.type: FiltersAggregator.FilterByFilter }
+ - match: { profile.shards.0.aggregations.0.type: FilterByFilterAggregator }
# We can't assert that segments_with_doc_count_field is > 0 because we might
# end up with two shards and all of the documents with the _doc_count field
# may be on one shard. We have a test for this in AggregationProfilerIT.
@@ -47,7 +47,6 @@
import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertSearchResponse;
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.greaterThan;
- import static org.hamcrest.Matchers.greaterThanOrEqualTo;
import static org.hamcrest.Matchers.notNullValue;

@ESIntegTestCase.SuiteScopeTestCase
@@ -643,21 +642,17 @@ public void testFilterByFilter() throws InterruptedException, IOException {
"delegate_debug",
matchesMap().entry("average_docs_per_range", equalTo(RangeAggregator.DOCS_PER_RANGE_TO_USE_FILTERS * 2))
.entry("ranges", 1)
.entry("delegate", "FiltersAggregator.FilterByFilter")
.entry("delegate", "FilterByFilterAggregator")
.entry(
"delegate_debug",
matchesMap().entry("segments_with_deleted_docs", 0)
.entry("segments_with_doc_count_field", 0)
.entry("max_cost", (long) RangeAggregator.DOCS_PER_RANGE_TO_USE_FILTERS * 2)
.entry("estimated_cost", (long) RangeAggregator.DOCS_PER_RANGE_TO_USE_FILTERS * 2)
.entry("estimate_cost_time", greaterThanOrEqualTo(0L)) // ~1,276,734 nanos is normal
.entry("segments_counted", 0)
.entry("segments_collected", greaterThan(0))
.entry(
"filters",
matchesList().item(
matchesMap().entry("scorers_prepared_while_estimating_cost", greaterThan(0))
.entry("query", "DocValuesFieldExistsQuery [field=date]")
matchesMap().entry("query", "DocValuesFieldExistsQuery [field=date]")
.entry("specialized_for", "docvalues_field_exists")
.entry("results_from_metadata", 0)
)
@@ -53,7 +53,7 @@ public KeywordScriptFieldType(String name) {
this(name, StringFieldScript.PARSE_FROM_SOURCE, null, Collections.emptyMap(), (builder, params) -> builder);
}

- KeywordScriptFieldType(
+ public KeywordScriptFieldType(
Member Author: I'd have to go through a ton of mocking without this, and I'm explicitly trying to test runtime fields, so I think it's probably worth it to just make it public. There is a private builder factory thing, but it goes through a lot of parsing infrastructure we don't have in the tests.

Contributor: This may be a dumb question, but can you not just use the public constructor above? Particularly as it looks like your use of this is just doing a source lookup, which is what the name-only constructor gives you.

Member Author: No, it's a good question. I could use it if I built the _source, but all the rest of those tests operate on doc values, so I don't really have the infrastructure in them to build the source. Maybe that's simpler than I thought and I should just do it. I'll have a look.

Member Author: I gave this a shot and it didn't work. I think those public ctors don't currently make useful fields. That's worth fixing, I think, but probably not in this PR.

String name,
StringFieldScript.Factory scriptFactory,
Script script,
@@ -46,7 +46,7 @@ public LongScriptFieldType(String name) {
this(name, LongFieldScript.PARSE_FROM_SOURCE, null, Collections.emptyMap(), (builder, params) -> builder);
}

- LongScriptFieldType(
+ public LongScriptFieldType(
String name,
LongFieldScript.Factory scriptFactory,
Script script,
@@ -30,7 +30,7 @@ public abstract class AdaptingAggregator extends Aggregator {
public AdaptingAggregator(
Aggregator parent,
AggregatorFactories subAggregators,
CheckedFunction<AggregatorFactories, Aggregator, IOException> delegate
CheckedFunction<AggregatorFactories, ? extends Aggregator, IOException> delegate
Member Author: This delegate thing is required so we can "pin" the parent references in the sub-aggregators. I wish we didn't have to do that stuff, but here we are.

) throws IOException {
// It's important we set parent first or else when we build the sub-aggregators they can fail because they'll call this.parent.
this.parent = parent;
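The two points above (the function-valued delegate and the "set parent first" comment in the constructor body) boil down to an ordering constraint. Below is a minimal, hypothetical sketch of that pattern with invented names (AdaptingSketch, SubFactories, buildDelegate), not the Elasticsearch classes: the parent reference is assigned before the delegate-building function runs, because building the delegate also builds sub-aggregators that immediately call back into parent(). The wildcard mirrors the `? extends Aggregator` widening in this diff.

```java
import java.io.IOException;

// Hypothetical illustration only; these names are not from the Elasticsearch codebase.
class AdaptingSketch {

    @FunctionalInterface
    interface CheckedFunction<T, R> {
        R apply(T input) throws IOException;
    }

    /** Stand-in for AggregatorFactories. */
    static class SubFactories {
    }

    private final AdaptingSketch parent;
    private final Object delegate;

    // The `?` stands in for the widened `? extends Aggregator` bound:
    // the factory may return any subtype of the delegate's base type.
    AdaptingSketch(AdaptingSketch parent, SubFactories subFactories,
                   CheckedFunction<SubFactories, ?> buildDelegate) throws IOException {
        this.parent = parent;                              // 1. pin the parent reference first...
        this.delegate = buildDelegate.apply(subFactories); // 2. ...only then build the delegate, whose
                                                           //    sub-aggregators can safely read parent()
    }

    AdaptingSketch parent() {
        return parent;
    }

    Object delegate() {
        return delegate;
    }
}
```

Passing a function instead of an already-built delegate is what gives the constructor control over this ordering.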
@@ -14,7 +14,6 @@
import org.apache.lucene.search.DocValuesFieldExistsQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.util.Bits;
- import org.elasticsearch.common.CheckedSupplier;

import java.io.IOException;
import java.util.function.BiConsumer;
@@ -43,14 +42,6 @@ long count(LeafReaderContext ctx, FiltersAggregator.Counter counter, Bits live)
return super.count(ctx, counter, live);
}

- @Override
- long estimateCountCost(LeafReaderContext ctx, CheckedSupplier<Boolean, IOException> canUseMetadata) throws IOException {
Member Author: No more estimate*Cost methods because we use heuristics on query type.

Member Author: The problem with the estimate*Cost methods was that they couldn't detect when a query was slow to prepare, because you had to prepare it to get the cost. And now that we disable the optimization when the top-level query is not match_all, there aren't really any complex heuristics to do. We have a lot more control over which queries come in on the filters, at least in the "optimize" route.

-     if (canUseMetadata.get() && canCountFromMetadata(ctx)) {
-         return 0;
-     }
-     return super.estimateCountCost(ctx, canUseMetadata);
- }

private boolean canCountFromMetadata(LeafReaderContext ctx) throws IOException {
FieldInfo info = ctx.reader().getFieldInfos().fieldInfo(query().getField());
if (info == null) {
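To make the two comments above concrete, here is a hedged sketch of what "heuristics on query type" can look like; the class and method names are assumptions for illustration, not the PR's actual code. Instead of preparing scorers to estimate a cost, the filter-by-filter path is only taken when the top-level query is effectively match_all, which is the "disable the optimization if we aren't sure it's faster" behavior from the PR title.

```java
import org.apache.lucene.search.MatchAllDocsQuery;
import org.apache.lucene.search.Query;

// Hypothetical helper, for illustration only.
final class FilterByFilterHeuristics {
    private FilterByFilterHeuristics() {}

    /**
     * Cheap, query-shape check used instead of a scorer-based cost estimate:
     * if the top-level query is anything other than an (effectively) match-all
     * query, we aren't sure the optimization is faster, so the aggregation
     * falls back to the normal collect path.
     */
    static boolean topLevelQueryAllowsFilterByFilter(Query topLevelQuery) {
        return topLevelQuery == null || topLevelQuery instanceof MatchAllDocsQuery;
    }
}
```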