Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR upgrade to ES 7.9.0 and also fix #17 by inheriting DeferableBucketAggregator.
Breaking change on ES side:
Using DeferableBucketAggregator
See #17 for more context.
High cardinality aggregation consumes memory and time, and ES had an internal limit around the number of buckets you can build (prior to 7.8 during the collection phase, now during the reduce phase).
To prevent hitting this limit, we added the TermsAggregator inspired
shard_size
parameter, but we were still building all buckets (we need to sort them, and you had to build the bucket and its aggregations, in case you are sorting by an aggregation).This PR make our aggregator inherit DeferableBucketAggregator, and defer the collection of aggregation not used for sorting. We then order the buckets using a partiallyBuiltBucketComparator just like TermsAggregator. And finally we select the top
shard_size
buckets and fully build them. It should be faster. But it needs more tests.The next step will be to introduce the parameter CollectMode to allow disabling this behavior.
It would have been better to inherit TermsAggregator instead of duplicating this logic, but I learned the hard way that depending on ES class is not a good idea. TLDR, you cannot inherit InternalTerms because of a package-private abstract class. I don't know if it was a typo or on purpose.
Misc
A new task validateNebulaPom was failing for missing licenses and developers' information in generated pom. We aren't publishing, but instead of disabling it, I just added the missing info.
YamlRestTest build fails without log4j, I suspect it is a dependency of elasticsearch:test-framework that is not satisfied by the plugin
IntegTest are not deleted yet in 7.9.0, so I had to disable them until it is done upstream
Upgraded gradle to 6.5 (should have done 6.6 now that I think about it) because 6.1 was failing with Java 14
TODO
Tests, accuracy tests, perf tests
Release note
Use github action for CI and maybe release automation