Avoid materializing uncompressed HTTP response before compression #73719

Open
DaveCTurner opened this issue Jun 3, 2021 · 8 comments
Labels
:Distributed Coordination/Network Http and internode communication implementations >enhancement Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination.

Comments

@DaveCTurner
Contributor

DaveCTurner commented Jun 3, 2021

Today when handling an HTTP request with Accept-Encoding: gzip we buffer the uncompressed response and then pass it over to Netty, which takes responsibility for compressing it as it goes out over the wire. If the response is large and compressible then this means we consume substantially more memory than we would theoretically need if we materialised the compressed bytes directly during serialisation.

I found this when looking for performance differences between internal monitoring vs monitoring with Metricbeat. Internal monitoring sends bulk requests to another cluster over HTTP, having streamed the body of each request directly into a compressed buffer (which it then uncompresses during transmission, because reasons). In contrast, when responding to requests made by Metricbeat today we prepare the whole response in an uncompressed buffer first, and this would still be true even if Metricbeat indicated it would accept a compressed response. This means that serving a large response to Metricbeat will require more memory than pushing the equivalent data out using the HTTP monitoring exporter.

Edit to add: arguably for non-performance-critical APIs that serve large responses we could consider always materializing the response into a compressed buffer and then optionally decompressing it during transmission, if we felt that the memory savings were worth it. I'm thinking of things like GET _stats?level=shards, GET _cluster/state and GET _mappings here; I suspect we wouldn't want to do this for high-traffic APIs like GET _search and POST _bulk because of the potential CPU and latency hit.
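
To illustrate the idea, here is a minimal sketch (not Elasticsearch code) of serialising a response straight into a gzip stream so that only the compressed bytes are ever buffered; ResponseWriter and writeResponseTo are hypothetical stand-ins for whatever serialisation the handler performs:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

final class CompressedResponseSketch {

    /** Serialise directly into the deflater so only compressed bytes are ever held. */
    static byte[] serialiseCompressed(ResponseWriter writer) throws IOException {
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(compressed)) {
            writer.writeResponseTo(gzip); // no uncompressed copy of the full body exists
        }
        return compressed.toByteArray(); // what we buffered is already gzip
    }

    interface ResponseWriter {
        void writeResponseTo(OutputStream out) throws IOException;
    }
}
```

The trade-off is the CPU cost of compressing during serialisation and, for clients that didn't ask for gzip, the extra decompress-during-transmission step mentioned above.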

@DaveCTurner DaveCTurner added >enhancement :Distributed Coordination/Network Http and internode communication implementations labels Jun 3, 2021
@elasticmachine elasticmachine added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label Jun 3, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

@DaveCTurner
Contributor Author

We (the @elastic/es-distributed team) discussed this today and decided that this would be possible and desirable. We're not planning on doing this in the immediate future but hope to get to it eventually.

@original-brownbear
Member

original-brownbear commented Oct 20, 2021

This might be more desirable than previously thought #79560 ... at least it would give us a bit of an out to make these APIs scale further than they do today by using compression, without any other API changes.

@DaveCTurner
Contributor Author

I expect this doesn't apply to APIs using chunked encoding, right? With #89838 mostly done, I think this is good to close.

@original-brownbear
Member

original-brownbear commented Feb 3, 2023

I think this is good to close.

I think it applies a little differently now, and it would still be nice to handle this, especially now that it's rather easy to do since we only have the Netty transport to deal with.

It's true that with chunked responses we're no longer wasting as much memory on materialising a full response, so I think the memory aspect can be considered solved.
But we could now be sending a really inefficient response for content that compresses well. We size chunks by their uncompressed length, but then compress each chunk before it goes out. So with the 256k chunks we have now, those might compress to, say, 5k or so per chunk, and we'd send responses split into an endless stream of 5k chunks. This isn't that bad, but it really does add needless overhead, and I think it's rather easy to fix this issue for Netty only now.
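
For illustration, a rough sketch of one way a writer could avoid the tiny-chunk problem: keep feeding uncompressed chunks into a single deflater and only emit a wire chunk once a target amount of compressed output has accumulated. The class, its methods and the 64k target are assumptions for the sketch, not existing code, and gzip framing is omitted (a plain Deflater produces a zlib stream):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Deflater;

final class CompressedChunkAccumulator {

    private static final int TARGET_COMPRESSED_CHUNK = 64 * 1024; // illustrative target

    private final Deflater deflater = new Deflater();
    private final byte[] scratch = new byte[8 * 1024];
    private byte[] pending = new byte[0];

    /** Feed one uncompressed chunk; returns any wire chunks that reached the target size. */
    List<byte[]> add(byte[] uncompressedChunk) {
        deflater.setInput(uncompressedChunk);
        List<byte[]> ready = new ArrayList<>();
        while (!deflater.needsInput()) {
            // NO_FLUSH lets the deflater buffer output across input chunks instead of
            // flushing a small compressed frame per 256k of uncompressed input
            int n = deflater.deflate(scratch, 0, scratch.length, Deflater.NO_FLUSH);
            append(scratch, n);
            if (pending.length >= TARGET_COMPRESSED_CHUNK) {
                ready.add(pending);
                pending = new byte[0];
            }
        }
        return ready;
    }

    /** Finish the stream and return whatever compressed bytes remain as the last chunk. */
    byte[] finish() {
        deflater.finish();
        while (!deflater.finished()) {
            append(scratch, deflater.deflate(scratch));
        }
        deflater.end();
        return pending;
    }

    private void append(byte[] src, int len) {
        byte[] merged = new byte[pending.length + len];
        System.arraycopy(pending, 0, merged, 0, pending.length);
        System.arraycopy(src, 0, merged, pending.length, len);
        pending = merged;
    }
}
```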

@DaveCTurner
Contributor Author

It doesn't look that easy (to me) because Netty's HttpContentCompressor appears to always eagerly flush the deflater and I don't see a great way to override this behaviour. Surely this is a solved problem, but my Google-fu is failing me.

@original-brownbear
Member

I think you can just do this by manually compressing the content yourself and setting the Content-Encoding header.
Looking at the code of io.netty.handler.codec.http.HttpContentCompressor#beginEncode and the calling io.netty.handler.codec.http.HttpContentEncoder#encode(io.netty.channel.ChannelHandlerContext, io.netty.handler.codec.http.HttpObject, java.util.List<java.lang.Object>), I think a solution that just puts the deflated bytes into the response chunks and sets the header on the first chunk should be all that's needed.
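
Something like the following minimal sketch, using plain Netty APIs (the PreCompressedResponseWriter class and gzippedChunks parameter are hypothetical): send a response head that already carries Content-Encoding: gzip and write pre-compressed bytes as the content chunks. Judging from beginEncode, HttpContentCompressor should leave messages that already declare a content encoding alone, so it shouldn't re-compress them:

```java
import io.netty.buffer.Unpooled;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.http.DefaultHttpContent;
import io.netty.handler.codec.http.DefaultHttpResponse;
import io.netty.handler.codec.http.HttpHeaderNames;
import io.netty.handler.codec.http.HttpHeaderValues;
import io.netty.handler.codec.http.HttpResponseStatus;
import io.netty.handler.codec.http.HttpUtil;
import io.netty.handler.codec.http.HttpVersion;
import io.netty.handler.codec.http.LastHttpContent;

final class PreCompressedResponseWriter {

    /** Write a chunked response whose body bytes were already gzipped during serialisation. */
    static void write(ChannelHandlerContext ctx, Iterable<byte[]> gzippedChunks) {
        DefaultHttpResponse head = new DefaultHttpResponse(HttpVersion.HTTP_1_1, HttpResponseStatus.OK);
        // declare the encoding ourselves instead of letting HttpContentCompressor do it
        head.headers().set(HttpHeaderNames.CONTENT_ENCODING, HttpHeaderValues.GZIP);
        HttpUtil.setTransferEncodingChunked(head, true);
        ctx.write(head);
        for (byte[] chunk : gzippedChunks) {
            // each chunk already contains compressed bytes produced during serialisation
            ctx.write(new DefaultHttpContent(Unpooled.wrappedBuffer(chunk)));
        }
        ctx.writeAndFlush(LastHttpContent.EMPTY_LAST_CONTENT);
    }
}
```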

@DaveCTurner
Contributor Author

Yeah, that's pretty much the best idea I could see too, but I would have preferred not to have to duplicate the logic that mucks around with the headers.
