Avoid bucket copies in Aggregations #110261
Motivated by heap dumps for aggregations often containing mostly duplicate bucket instances. Note that this is only a start; there's a lot more duplication we can remove in follow-ups. For this one, there are simply a lot of straightforward spots where we can avoid copying a bucket, a list of buckets, or derived instances.
Pinging @elastic/es-analytical-engine (Team:Analytics)
);
if (internalAggregation.equals(agg) == false) {
if (internalAggregations == null) {
internalAggregations = new ArrayList<>(reducedInternalAggs);
No need to be tricky here IMO: copying the full list is likely similar in cost to anything more elaborate, such as copying only the first i unchanged elements and then switching to a different loop, given how short these lists are.
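For readers following along, here is a minimal, self-contained sketch of the copy-on-first-change pattern discussed in this comment. It is not the code from this PR: `LazyCopy` and `mapIfChanged` are hypothetical names, and the sketch assumes short lists and a reliable `equals` on the elements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

final class LazyCopy {
    /**
     * Applies fn to every element. Returns the original list instance when
     * no element changed, so the common case allocates nothing.
     */
    static <T> List<T> mapIfChanged(List<T> input, UnaryOperator<T> fn) {
        List<T> copy = null; // allocated only once a changed element is seen
        for (int i = 0; i < input.size(); i++) {
            T original = input.get(i);
            T mapped = fn.apply(original);
            if (mapped.equals(original) == false) {
                if (copy == null) {
                    // Full copy rather than a partial one: simpler, and the
                    // lists are assumed to be short.
                    copy = new ArrayList<>(input);
                }
                copy.set(i, mapped);
            }
        }
        return copy == null ? input : copy;
    }
}
```

Calling `LazyCopy.mapIfChanged(list, x -> x)` hands back `list` itself, while a function that changes at least one element yields a fresh `ArrayList` and leaves the input untouched.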
lgtm
@@ -84,6 +84,9 @@ public InternalAggregation[] buildAggregations(long[] owningBucketOrds) throws I
double key = roundKey * interval + offset;
return new InternalHistogram.Bucket(key, docCount, keyed, formatter, subAggregationResults);
}, (owningBucketOrd, buckets) -> {
if (buckets.isEmpty()) {
return buildEmptyAggregation();
👍
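A rough sketch of the idea behind this hunk, under stated assumptions: what `buildEmptyAggregation()` actually returns is not shown in this thread, so the example below uses a hypothetical `Result` type with a shared immutable `EMPTY` instance to show how an early return for an empty bucket list can avoid a per-call allocation.

```java
import java.util.List;

final class EmptyShortCircuit {
    /** Hypothetical immutable result type standing in for an aggregation result. */
    static final class Result {
        static final Result EMPTY = new Result(List.of());
        final List<Double> keys;

        private Result(List<Double> keys) {
            this.keys = keys;
        }
    }

    /** Builds a result, reusing the shared empty instance when there are no buckets. */
    static Result build(List<Double> keys) {
        if (keys.isEmpty()) {
            return Result.EMPTY; // no new object for the empty case
        }
        return new Result(List.copyOf(keys));
    }
}
```

Sharing an instance like this is only safe because `Result` is immutable in the sketch; a mutable result would need a fresh object each time.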
@@ -340,6 +340,9 @@ public InternalAggregation[] buildAggregations(long[] owningBucketOrds) throws I
return buildAggregationsForVariableBuckets(owningBucketOrds, bucketOrds, (bucketValue, docCount, subAggregationResults) -> {
return new InternalDateHistogram.Bucket(bucketValue, docCount, keyed, formatter, subAggregationResults);
}, (owningBucketOrd, buckets) -> {
if (buckets.isEmpty()) {
return buildEmptyAggregation();
👍
Thanks Ignacio!
Same as elastic#110261 but more globally applied. Removes copying of aggregation instances as well as needless wrapping of the bucket list (to make `equals` free-ish in most cases).
@original-brownbear @iverase do we know if this has any measurable improvement in memory usage or CPU? This change is causing some nasty bugs, and we need to either back it out or fix the individual bugs.
Yes, it should. If I remember correctly, this was motivated by a heap dump that showed O(100M) potential savings here. I'll try to look into this and see if we can easily make it safer, sorry for the bugs :/ Reverting would be a reasonable plan B here though IMO; there are so many issues in aggs and this is just one of them :)
I'm going to revert this change now and we can tie it to a rework on
@original-brownbear the only way to make this change work is to enforce that ^ Yeah, what @nik9000 said.
This reverts elastic#110261 which we can't land until elastic#111757 - we need to be sure that the `equals` implementations on subclasses of `InternalAggregations` are correct before this optimization is safe. Closes elastic#111679
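To illustrate why the optimization depends on correct `equals` implementations, here is a small hypothetical sketch. `BrokenBucket` and `reuseIfEqual` are made-up names, not classes from this codebase; the point is only that an `equals` which ignores a field lets an "unchanged, so reuse the original" check silently discard a real change.

```java
final class EqualsPitfall {
    /** Hypothetical bucket whose equals() forgets to compare docCount. */
    static final class BrokenBucket {
        final String key;
        final long docCount;

        BrokenBucket(String key, long docCount) {
            this.key = key;
            this.docCount = docCount;
        }

        @Override
        public boolean equals(Object o) {
            // Bug: docCount is not part of the comparison.
            return o instanceof BrokenBucket && key.equals(((BrokenBucket) o).key);
        }

        @Override
        public int hashCode() {
            return key.hashCode();
        }
    }

    /** Keeps the original instance whenever the reduced one compares equal to it. */
    static <T> T reuseIfEqual(T original, T reduced) {
        return reduced.equals(original) ? original : reduced;
    }
}
```

With `new BrokenBucket("a", 1)` as the original and `new BrokenBucket("a", 5)` as the reduced value, `reuseIfEqual` returns the original, so the updated `docCount` of 5 is lost.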