
Introduce DecodeBufferPolicy to reduce memory usage #1314

Closed
Shaphan opened this issue Jun 18, 2020 · 4 comments
Labels
type: task A general task

Comments

@Shaphan
Contributor

Shaphan commented Jun 18, 2020

Bug Report

Firstly, here is a link to a private repository with a sample app that helps reproduce the leak, or, more precisely, clearly shows that lettuce's direct memory usage behavior changed between 5.1.8 and 5.2.x: https://github.com/Shaphan/lettuce-direct-memory-leak-reproducer
Access to the repository has already been granted to @mp911de.
I can also share the repository with somebody else, if necessary, but would like to avoid making the repository public for the time being.

After upgrading to io.lettuce:lettuce-core:5.2.2.RELEASE we started seeing increased memory consumption. We tracked this down to direct memory and preliminarily identified commit 4f026e6 as the one introducing the issue in the 5.2.x branch (it addressed issue #906, PR #916).

Current Behavior

Faster direct memory growth in 5.2.2.RELEASE compared to 5.1.8.RELEASE.
A higher probability of running into OutOfDirectMemoryError / OutOfMemoryError under the same conditions.

Expected behavior/code

Direct memory usage at the same level as in versions prior to 5.2.x.

Environment

  • Lettuce version(s): 5.2.2.RELEASE, 5.3.0.RELEASE
  • Redis version: we were running into the issue on 5.0.6 and 5.0.7, but the version of Redis should not really matter here
  • Java 8+

Possible Solution

In our code we worked around this by using reflection to set bufferUsageRatio in io.lettuce.core.ClientOptions.Builder.
This effectively makes CommandHandler behave the pre-5.2 way.
An example:

            // Reflectively force bufferUsageRatio to 0 (getDeclaredField/setInt throw checked exceptions; handle as needed).
            ClientOptions.Builder optionsBuilder = ClientOptions.builder();
            Field field = ClientOptions.Builder.class.getDeclaredField("bufferUsageRatio");
            field.setAccessible(true);
            // A value of 0 restores the pre-5.2 discard behavior in CommandHandler.
            field.setInt(optionsBuilder, 0);
            ...

Additional context

Please find screenshots showing direct memory usage from runs of the aforementioned sample app on different versions of lettuce (all other parameters were the same):

[Three screenshots: direct memory usage charts, one per lettuce version]

@Shaphan Shaphan added the type: bug A general bug label Jun 18, 2020
@mp911de
Collaborator

mp911de commented Jun 24, 2020

Thanks for the elaborate report. Looking at the code, each worker creates its own connection, RedisClient, and (what's actually worse) ClientResources along with individual thread pools. No surprise that 2000 workers * Runtime.availableProcessors() threads cause memory exhaustion.

Another aspect to note is that each connection maintains an aggregation buffer to buffer partial responses. Having 2000 connections means instantiating 2000 aggregation buffers.

Starting the reproducer as-is immediately causes a too many open files failure. Running the reproducer with a single connection and client instance results in a constant 16 MB buffer for me.
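
For illustration, a minimal sketch of the shared-resources setup described above: one ClientResources, one RedisClient, and one connection reused by all workers. Host, port, and key names are placeholders, not part of the reproducer.

    import io.lettuce.core.RedisClient;
    import io.lettuce.core.api.StatefulRedisConnection;
    import io.lettuce.core.resource.ClientResources;
    import io.lettuce.core.resource.DefaultClientResources;

    public class SharedClientExample {

        public static void main(String[] args) {
            // One ClientResources instance (event loops, timers) for the whole JVM.
            ClientResources resources = DefaultClientResources.create();

            // One RedisClient and one connection shared by all workers;
            // Lettuce connections are thread-safe and can be used concurrently.
            RedisClient client = RedisClient.create(resources, "redis://localhost:6379");
            StatefulRedisConnection<String, String> connection = client.connect();

            // Workers reuse the shared connection instead of creating their own
            // client, connection, aggregation buffer, and thread pools.
            for (int i = 0; i < 2000; i++) {
                connection.sync().set("key-" + i, "value-" + i);
            }

            connection.close();
            client.shutdown();
            resources.shutdown();
        }
    }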

@mp911de mp911de changed the title Direct memory leak in lettuce 5.2.x Redis Connections since 5.2.x require more memory Jun 24, 2020
@Shaphan
Contributor Author

Shaphan commented Jun 26, 2020

@mp911de Thanks for taking a look at the ticket as well as at the sample program!

No surprise that 2000 workers * Runtime.availableProcessors() threads cause memory exhaustion.

The sample program is intended to run out of direct memory; that is its goal. Its purpose is to show that on lettuce 5.2.x and higher the same code under the same conditions runs out of direct memory ~4 times faster than on pre-5.2 versions.

Our production code (where we started to run into the issue) is, of course, different. It takes ~3-4 days for an instance on lettuce 5.2.x with a 4G heap (and an equal max direct memory limit) to run out of memory, despite the high throughput of Redis operations. It only creates a single RedisClusterClient per Redis cluster, and it has a lot of complementary code on top of lettuce. There are many other differences between the sample app and the prod app. So the point of the sample app was not to show how we're working with lettuce, but rather to outline the difference in memory usage between lettuce versions.

Starting the reproducer as-is immediately causes a too many open files failure.

That surely depends on the operating system settings. If increasing the file descriptor limit is not an option, please consider running the application with a lower number of workers.

Please let me know if you'd like me to provide any more info / further elaborate on the sample app.

One more point: what would you say about the workaround code?
What if I submitted a one-liner PR allowing bufferUsageRatio to be set to zero here: https://github.com/lettuce-io/lettuce-core/blob/b978238db8a33c8860819b73e528f016f940e992/src/main/java/io/lettuce/core/ClientOptions.java#L362?

@mp911de
Collaborator

mp911de commented Jun 26, 2020

I have the feeling that no matter what we do, we will always find someone who wants to use a different buffer cleanup strategy because of $memory, $cputime, $reasons.

I was considering a callback interface that gets called with the buffer after each partial and full read, and deprecating bufferUsageRatio altogether. We would provide a default implementation that does what bufferUsageRatio does now, and folks would be able to implement their own strategy (similar to what we have with Delay).
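
Purely as an illustration of that idea, a hypothetical sketch of such a callback interface. The type and method names are made up for this sketch and are not the Lettuce API; the actual outcome of this discussion is the DecodeBufferPolicy introduced below.

    import io.netty.buffer.ByteBuf;

    // Hypothetical sketch only; names are illustrative, not the actual Lettuce API.
    public interface BufferCleanupStrategy {

        // Called after a partial response has been decoded from the aggregation buffer.
        void afterPartialRead(ByteBuf buffer);

        // Called after a complete command response has been decoded.
        void afterFullRead(ByteBuf buffer);

        // Default implementation that roughly mimics today's bufferUsageRatio behavior:
        // discard already-read bytes once they dominate the buffer.
        static BufferCleanupStrategy ratioBased(int bufferUsageRatio) {
            return new BufferCleanupStrategy() {

                @Override
                public void afterPartialRead(ByteBuf buffer) {
                    // Keep partial reads cheap; no cleanup here.
                }

                @Override
                public void afterFullRead(ByteBuf buffer) {
                    // E.g. ratio 3 -> discard once ~75% of the buffer has been read.
                    if (buffer.readerIndex() * (bufferUsageRatio + 1) >= buffer.writerIndex() * bufferUsageRatio) {
                        buffer.discardReadBytes();
                    }
                }
            };
        }
    }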

@mp911de mp911de added type: task A general task and removed status: waiting-for-triage type: bug A general bug labels Jun 26, 2020
Shaphan added a commit to Shaphan/lettuce-core that referenced this issue Jul 17, 2020
…s#1314

(cherry picked from commit 0263309)

# Conflicts:
#	src/main/java/io/lettuce/core/ClientOptions.java
#	src/main/java/io/lettuce/core/cluster/ClusterClientOptions.java
#	src/main/java/io/lettuce/core/protocol/CommandHandler.java
Shaphan added a commit to Shaphan/lettuce-core that referenced this issue Jul 17, 2020
Shaphan added a commit to Shaphan/lettuce-core that referenced this issue Jul 17, 2020
Shaphan added a commit to Shaphan/lettuce-core that referenced this issue Jul 17, 2020
Shaphan added a commit to Shaphan/lettuce-core that referenced this issue Jul 17, 2020
(cherry picked from commit b078fed)
Shaphan added a commit to Shaphan/lettuce-core that referenced this issue Jul 17, 2020
(cherry picked from commit b078fed)
mp911de added a commit that referenced this issue Jul 29, 2020
Rename ReadBytesDiscardPolicy to DecodeBufferPolicy. Introduce DecodeBufferPolicies with factory methods to discard read bytes and to reclaim memory.

Add author and since tags.

Original pull request: #1346.
mp911de added a commit that referenced this issue Jul 29, 2020
Order client resources/ClientOptions alphabetically.
@mp911de mp911de added this to the 6.0 RC1 milestone Jul 29, 2020
@mp911de
Collaborator

mp911de commented Jul 29, 2020

Thanks for your pull request. That's merged for Lettuce 6 for now, as we typically do not introduce new API in a bugfix release. Can you check with Lettuce 6 whether the memory behavior can be customized as intended?
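
For anyone picking this up later, a minimal sketch of how the new option is intended to be wired up in Lettuce 6. It assumes the ClientOptions.Builder decodeBufferPolicy(...) setter and the DecodeBufferPolicies factory from the merged pull request; please verify the exact method names against the 6.0 API.

    import io.lettuce.core.ClientOptions;
    import io.lettuce.core.RedisClient;
    import io.lettuce.core.protocol.DecodeBufferPolicies;

    public class DecodeBufferPolicyExample {

        public static void main(String[] args) {
            // Assumed API from the merged pull request; verify names against the Lettuce 6.0 release.
            RedisClient client = RedisClient.create("redis://localhost:6379");
            client.setOptions(ClientOptions.builder()
                    // A lower ratio discards read bytes from the aggregation buffer sooner,
                    // trading some CPU for lower direct memory usage.
                    .decodeBufferPolicy(DecodeBufferPolicies.ratio(1))
                    .build());

            client.connect().sync().ping();
            client.shutdown();
        }
    }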

@mp911de mp911de closed this as completed Jul 29, 2020
@mp911de mp911de changed the title Redis Connections since 5.2.x require more memory Introduce DecodeBufferPolicy to reduce memory usage Jul 30, 2020