Read Aggregations Directly from Pooled Buffers #72309
Conversation
These aggregations can be of considerable size, and we must not allocate a single `byte[]` for them. Especially nowadays, using G1GC, contiguous allocations of this size are problematic. This commit makes it so that we take the aggregation bytes as a slice out of the network buffers in a zero-copy fashion, effectively cutting the peak memory use for reading large aggregations in half.
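The zero-copy idea can be sketched outside Elasticsearch: instead of copying bytes into a fresh array, hand out a view that shares the pooled backing array and pins it with a reference count. The following is a hypothetical model with illustrative names, not the actual `ReleasableBytesReference` implementation:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of a ref-counted, zero-copy slice over a pooled buffer.
// Not the real Elasticsearch class; names are illustrative only.
final class PooledSlice {
    private final byte[] backing;          // shared pooled array, never copied
    private final int offset;
    private final int length;
    private final AtomicInteger refCount;  // shared by all views of this page

    PooledSlice(byte[] backing, int offset, int length, AtomicInteger refCount) {
        this.backing = backing;
        this.offset = offset;
        this.length = length;
        this.refCount = refCount;
        refCount.incrementAndGet();        // pin the page for this view
    }

    // Zero-copy: a sub-slice shares the same backing array and counter.
    PooledSlice slice(int from, int len) {
        if (from < 0 || len < 0 || from + len > length) {
            throw new IndexOutOfBoundsException();
        }
        return new PooledSlice(backing, offset + from, len, refCount);
    }

    byte get(int index) {
        return backing[offset + index];
    }

    int length() {
        return length;
    }

    // Returns true when the last view is gone and the pool may reclaim the page.
    boolean release() {
        return refCount.decrementAndGet() == 0;
    }
}
```

Reading the aggregation bytes then amounts to slicing the network page rather than calling something like `readBytesReference()`, which would copy them into a fresh contiguous `byte[]`.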
Pinging @elastic/es-analytics-geo (Team:Analytics)
@@ -46,7 +47,7 @@
      * when {@link #expand()} is called.
      */
     public static <T extends Writeable> DelayableWriteable<T> delayed(Writeable.Reader<T> reader, StreamInput in) throws IOException {
-        return new Serialized<>(reader, in.getVersion(), in.namedWriteableRegistry(), in.readBytesReference());
+        return new Serialized<>(reader, in.getVersion(), in.namedWriteableRegistry(), in.readReleasableBytesReference());
In a recent user issue, this would literally create byte arrays of size O(100M), completely locking up GC on the coordinating node.
I didn't know we had a way to grab a slice of the buffer! That's quite nice. Is it safe to hold onto that reference for longer than the request? For, like, a lot longer?
Same here, this is great!
100M is indeed problematic, but I wonder if this is not an abusive case that we should limit somewhere else. If a single shard response in binary form has this size, I don't want to see the final JSON response.
Yea, as long as you eventually release the buffers and it's not a situation where you only care about a small piece of the buffer (which this isn't, I'd say), it's completely safe. We're already using the same mechanism for recovery chunks, CCR chunks and the cluster state.
Yea, the linked issue is definitely an outlier I guess, but even if we're talking about O(1M), it's not great to allocate a bunch of byte arrays of that size all the time when working with G1GC (even after our fixes to settings).
The logic in the code looks good to me.
Do we check that buffers are released in our test suite? Or do we need explicit tests to ensure that we're not leaking bytes? I really like this change, but it makes me a little bit nervous considering the impact of a bug.
@@ -98,7 +99,8 @@ public T expand() {
         } catch (IOException e) {
             throw new RuntimeException("unexpected error writing writeable to buffer", e);
         }
-        return new Serialized<>(reader, Version.CURRENT, registry, buffer.bytes());
+        // TODO: this path is currently not used in production code, if it ever is this should start using pooled buffers
+        return new Serialized<>(reader, Version.CURRENT, registry, ReleasableBytesReference.wrap(buffer.bytes()));
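The `wrap` call in this hunk suggests a common pattern: give plain heap bytes a no-op release so that pooled and unpooled bytes can flow through the same release-aware code path. A hypothetical sketch of that pattern, with illustrative names rather than the actual Elasticsearch API:

```java
// Hypothetical sketch of the wrap-with-no-op-release pattern.
// Names are illustrative, not the real Elasticsearch types.
interface Releasable {
    void close();
}

final class ReleasableBytes implements Releasable {
    private final byte[] bytes;
    private final Releasable onClose;

    private ReleasableBytes(byte[] bytes, Releasable onClose) {
        this.bytes = bytes;
        this.onClose = onClose;
    }

    // Pooled bytes: closing returns the page to the pool.
    static ReleasableBytes pooled(byte[] bytes, Releasable returnToPool) {
        return new ReleasableBytes(bytes, returnToPool);
    }

    // Plain heap bytes: closing is a no-op, the GC reclaims the array.
    static ReleasableBytes wrap(byte[] bytes) {
        return new ReleasableBytes(bytes, () -> {});
    }

    byte[] bytes() {
        return bytes;
    }

    @Override
    public void close() {
        onClose.close();
    }
}
```

With such a wrapper, callers can unconditionally close whatever they received, which is presumably why the non-pooled path here can wrap `buffer.bytes()` and still satisfy the releasable contract.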
production mode activated!
+1 to what Jim said. If the test suite auto-enforces that we close this
like we do with BigArrays then I'm good.
LGTM then!
@nik9000 @jimczi yes, we have leak tracking infrastructure for all kinds of buffers. For our own buffer pool it was added in #67688, and we have similar infrastructure for Netty-based buffers running in all tests, including the REST tests. So any leak in test-covered code paths will show up for sure.
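The leak-tracking infrastructure referred to above can be modeled with a small sketch: record every live tracked resource at allocation time, and have tests assert that nothing is still live at the end. This is a hypothetical illustration, not the actual Elasticsearch or Netty leak detector:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of test-time leak tracking. The real infrastructure
// lives in Elasticsearch's buffer pool and Netty's leak detection.
final class LeakTracker {
    private static final Map<Object, Throwable> LIVE = new ConcurrentHashMap<>();

    // Called when a tracked buffer is acquired; records the allocation site.
    static void track(Object resource) {
        LIVE.put(resource, new Throwable("allocated here"));
    }

    // Called when the buffer is released.
    static void untrack(Object resource) {
        LIVE.remove(resource);
    }

    // Called at the end of a test: any still-live resource is a leak,
    // reported together with its recorded allocation stack trace.
    static void assertNoLeaks() {
        if (!LIVE.isEmpty()) {
            throw new AssertionError("leaked " + LIVE.size() + " buffer(s)",
                    LIVE.values().iterator().next());
        }
    }
}
```

The key design point is capturing the allocation stack trace up front, so that when a test trips the end-of-test check, the report points at where the leaked buffer was acquired rather than merely stating that a leak happened.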
LGTM too!
Thanks Nik + Jim!
If consuming a query result was disrupted by the circuit breaker, we would leak memory for aggregations in buffered query results; fixed. Relates elastic#62439 and elastic#72309
When a CCS search is proxied, the memory for the aggregations on the proxy node would not be freed. Now only use the non-copying byte referencing version on the coordinating node, which itself ensures that memory is freed by calling `consumeAggs`. Relates #72309