Fix aggregation memory leak for CCS #78404
When a CCS search is proxied, the memory for the aggregations on the proxy node would not be freed. This change adds ref-counting to the `QuerySearchResult` object (and friends) to ensure that the memory for aggregations is eventually freed.
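To make the ref-counting idea concrete, here is a minimal sketch of a reference-counted buffer. The class `RefCountedBytes` and its methods are hypothetical names chosen for illustration; this is not the Elasticsearch implementation, only the general pattern of releasing memory when the last holder lets go.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the actual Elasticsearch code.
final class RefCountedBytes {
    // The creator holds the first reference.
    private final AtomicInteger refCount = new AtomicInteger(1);
    // Stands in for pooled network buffer memory.
    private volatile byte[] bytes;

    RefCountedBytes(byte[] bytes) {
        this.bytes = bytes;
    }

    /** Takes an additional reference; fails if the buffer was already released. */
    void incRef() {
        int current;
        do {
            current = refCount.get();
            if (current <= 0) {
                throw new IllegalStateException("already released");
            }
        } while (!refCount.compareAndSet(current, current + 1));
    }

    /** Drops one reference; returns true when this call released the memory. */
    boolean decRef() {
        int remaining = refCount.decrementAndGet();
        if (remaining < 0) {
            throw new IllegalStateException("released too many times");
        }
        if (remaining == 0) {
            bytes = null; // last holder is gone, so the memory can be reclaimed
            return true;
        }
        return false;
    }

    boolean isReleased() {
        return refCount.get() <= 0;
    }
}
```

In this pattern a leak appears whenever some holder never calls `decRef()`, which is the kind of bookkeeping the review discussion weighs against a simpler alternative.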
server/src/main/java/org/elasticsearch/search/query/QuerySearchResult.java
@elasticmachine update branch
Pinging @elastic/es-analytics-geo (Team:Analytics)
Pinging @elastic/es-distributed (Team:Distributed)
Great catch! Although I think we can avoid the ref counting here. Delaying the loading of the aggregations is relevant only on the coordinating node when reducing lots of shards. Scrolls don't expose aggregations, so I'd prefer that we use an instance that doesn't use the bytes reference at all. So can we instead add an option in the constructor of `QuerySearchResult` to decide whether the reference should be kept or not?
The default should be to fully read the aggs, and we can opt in to the reference in `SearchTransportService#sendExecuteQuery`?
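The suggestion above can be sketched roughly as follows. Everything here is an assumption made for illustration: the class `SketchQueryResult`, the `delayedAggregations` flag name, and the string-based "parsing" are stand-ins, not the real `QuerySearchResult` API. The point is only that the default path keeps no reference to the wire bytes, while the opt-in path keeps them until `consumeAggs` is called.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a query result; names and fields are illustrative only.
final class SketchQueryResult {
    private byte[] serializedAggs; // retained wire bytes (delayed mode only)
    private String aggs;           // "parsed" aggregations (eager mode only)

    SketchQueryResult(byte[] wireBytes, boolean delayedAggregations) {
        if (delayedAggregations) {
            // Opt-in: keep the bytes without copying; the caller must consume them.
            this.serializedAggs = wireBytes;
        } else {
            // Default: read the aggregations fully and keep no reference to the bytes.
            this.aggs = parse(wireBytes);
        }
    }

    /** Returns the aggregations once and drops every reference held for them. */
    String consumeAggs() {
        if (serializedAggs == null && aggs == null) {
            throw new IllegalStateException("aggs already consumed");
        }
        String result = aggs != null ? aggs : parse(serializedAggs);
        serializedAggs = null;
        aggs = null;
        return result;
    }

    boolean holdsWireBytes() {
        return serializedAggs != null;
    }

    // Placeholder for real deserialization.
    private static String parse(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

With this shape, a node that merely forwards results would use the default constructor path and never hold on to the buffer, while the reducing node opts in and releases the bytes when it consumes the aggregations.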
LGTM, can you edit the description to reflect the new approach?
Thanks Jim!
💔 Backport failed
You can use sqren/backport to manually backport by running
When a CCS search is proxied, the memory for the aggregations on the proxy node would not be freed. The non-copying, byte-referencing version is now used only on the coordinating node, which itself ensures that the memory is freed by calling `consumeAggs`.
When a CCS search is proxied, the memory for the aggregations on the proxy node would not be freed. The non-copying, byte-referencing version is now used only on the coordinating node, which itself ensures that the memory is freed by calling `consumeAggs`. Relates #72309
Adding comment from @tbrooks8 to clarify the impact:
Adding this as a known issue (#78404) to the release notes for 7.15.0 (fix will be in 7.15.1). Thx!
Adds #78404 as a known issue to the 7.15.0 and 7.14.n release notes. Co-authored-by: James Rodewig <[email protected]>
…) (#78788) Adds #78404 as a known issue to the 7.15.0 and 7.14.n release notes. Co-authored-by: James Rodewig <[email protected]> Co-authored-by: Pius <[email protected]>
…) (#78789) Adds #78404 as a known issue to the 7.15.0 and 7.14.n release notes. Co-authored-by: James Rodewig <[email protected]> Co-authored-by: Pius <[email protected]>
Why not backport to 7.14 and 7.15?
@jgq2008303393 it was backported to 7.15: see a5cc08f
How about 7.14? Is this patch incompatible with 7.14, or is it not critical enough? @imotov
It can be backported, but there will be no more releases of 7.14, so it would be a wasted effort. Please see https://www.elastic.co/support/eol. If you have any questions about this policy, let's continue this discussion on our discussion forum. We use GitHub for bug reports and enhancement requests, and this discussion is neither.
Thank you very much for the reply.