Allow usage of publishOn scheduler for reactive signal emission #905
@mp911de I'm seeing a similar issue with Reactor Netty + WebFlux + Reactor + Lettuce. I was stress-testing my application: if I send 100 requests to my web server, the four Netty threads are busy processing (essentially calling subscribe on) all the requests, and the single Lettuce thread responsible for fetching data from Redis does not really get its turn until the Netty threads have subscribed to almost all of the 100 requests. The Lettuce thread gets to run a few times during that window, but once the flood of requests has been subscribed, Lettuce picks up speed. That seems unfair in the sense that if requests keep coming in, there is a significant impact on request processing times; in a real production environment this will cause a huge delay in processing.
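For context, a minimal sketch of the kind of reactive endpoint under discussion (controller name, key, and wiring are illustrative, not the actual application):

import io.lettuce.core.RedisClient;
import io.lettuce.core.api.reactive.RedisStringReactiveCommands;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Mono;

@RestController
class RedisController {

    // One shared connection: all result emission happens on its single I/O thread.
    private final RedisStringReactiveCommands<String, String> redis =
            RedisClient.create("redis://localhost").connect().reactive();

    @GetMapping("/fromRedis")
    Mono<String> fromRedis() {
        // The Netty request thread only subscribes here; the GET result (and all
        // downstream operators) is emitted on the Lettuce I/O thread.
        return redis.get("payload");
    }
}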
First test results: Synchronous/Spring Boot/Tomcat
Reactive/Spring Boot/WebFlux
Application under test: https://gist.github.com/mp911de/ba9c46fe8e8bbe4558ca445e996f108a

Conclusion: I cannot reproduce a significant difference between reactive and synchronous processing times. Higher load (400 concurrent requests) results in a difference of 10 ms.
Please provide a reproducer, otherwise we cannot trace the issue.
As you asked, I have created a sample project to show the Lettuce performance issue, based on your demo application.
$ wrk -c 4 -d 10 -t 4 http://localhost:8080/fromRedis
$ wrk -c 4 -d 10 -t 4 http://localhost:8080
Here are my test results with the code repo you provided:
After removing the System.out calls:

These numbers seem pretty reasonable to me, 7000 operations per second on a single machine. I'm not sure why you see a slowdown to 190 operations per second. After I removed the System.out calls, throughput almost doubled. It's not healthy to use System.out for this, as console output is synchronized and blocks the calling thread.
@mp911de What machine are you running it on? The most I'm getting is:
Just a regular MacBook (2.8 GHz Intel Core i7), nothing crazy.
@mp911de I'm running on a 3.1 GHz i7 MacBook, dual core. Not sure what was causing the slowdown, but redoing the test after restarting my laptop gave me ~4900 requests/sec.
But to follow up on the actual issue: I updated the response size to ~300 KB and the req/sec dropped to ~100, which is what I'm seeing on our app servers. Is that expected? I have updated the sample code here: https://github.com/jmnght/lettucetest Results:
Thanks!
Thanks a lot for the additional details, @jmnght. Increasing the payload size really makes a difference and helps to reveal code that does not scale well. I spotted a few things here and tried several approaches:
I attached my wrk results below. Okay, so what's the conclusion? I think the combination of an expensive serialization mechanism paired with the single-threaded nature of a single connection has a measurable impact. I will do further experiments with multi-threaded scenarios to see how we can improve here.

Using Jackson (original code at jmnght/lettucetest@6f2c221):
Write HGETALL content as String:
Write HGETALL content as byte[]:
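For illustration, reading the hash as raw bytes (skipping String/Jackson decoding on the I/O thread) could look roughly like the following sketch; the key name and wiring are placeholders, not the benchmark code:

import io.lettuce.core.RedisClient;
import io.lettuce.core.api.reactive.RedisReactiveCommands;
import io.lettuce.core.codec.ByteArrayCodec;
import reactor.core.publisher.Mono;
import java.util.Map;

// ByteArrayCodec hands values back as byte[] without any decoding on the I/O thread.
RedisClient client = RedisClient.create("redis://localhost");
RedisReactiveCommands<byte[], byte[]> redis = client.connect(ByteArrayCodec.INSTANCE).reactive();

// HGETALL as raw bytes; the map values can be written to the HTTP response directly
// or deserialized later, outside the I/O thread.
Mono<Map<byte[], byte[]>> hash = redis.hgetall("mykey".getBytes());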
I deployed a snapshot build that emits data signals on a multi-threaded scheduler:

<dependency>
<groupId>io.lettuce</groupId>
<artifactId>lettuce-core</artifactId>
<version>5.1.4.gh905-SNAPSHOT</version>
</dependency>
<repositories>
<repository>
<id>sonatype-snapshots</id>
<name>Sonatype Snapshot Repository</name>
<url>https://oss.sonatype.org/content/repositories/snapshots/</url>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>

Emitting data signals directly:
Emitting data signals on a scheduler:
@mp911de Thank you for analyzing it. I will try the SNAPSHOT version and report back.
Regarding your questions:
@mp911de I tried the snapshot and the performance improvements I'm seeing are similar to what you are seeing. The number of requests/second handled is a little more than double what it is currently.
Thanks for reporting back. There's no sensible workaround beyond consuming raw bytes, handling deserialization in your own code, and applying publishOn(…) yourself to move that work off the I/O thread.
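For illustration, such a workaround could look roughly like this on Lettuce 5.1.x; the key, the scheduler choice, and the mapping step are placeholders:

import io.lettuce.core.RedisClient;
import io.lettuce.core.api.reactive.RedisReactiveCommands;
import io.lettuce.core.codec.ByteArrayCodec;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

RedisReactiveCommands<byte[], byte[]> redis =
        RedisClient.create("redis://localhost").connect(ByteArrayCodec.INSTANCE).reactive();

// Fetch raw bytes, then hop off the Lettuce I/O thread before doing the expensive work.
Mono<String> value = redis.get("mykey".getBytes())
        .publishOn(Schedulers.parallel())   // downstream operators now run on the parallel scheduler
        .map(bytes -> new String(bytes));   // stand-in for the real (e.g. Jackson) deserialization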
@mp911de Thank you for adding this to the 5.1.4 milestone. For now, I've changed usage to something along those lines. Throughput is close to twice as high (~400 req/sec) in version 5.1.2, but nowhere near the 3237 req/sec you were seeing with byte[].
@mp911de, I integrated this change in the actual application and the improvement was much lower than what I saw in the test application. Could you share the code snippet that gave you 3237 req/sec with byte[]? (or feel free to commit to the jmnght/lettucetest repo)
Emission of data and completion signals from the reactive API can now use Lettuce's EventExecutorGroup instead of the I/O thread. Using a scheduler allows multiplexing of threads and releases the I/O thread early. Using a single Redis connection in a reactive context can otherwise lead to effectively single-threaded behavior, as processing of downstream operators typically happens on the I/O thread; compute-intensive processing then consumes I/O thread capacity and decreases performance.
@jmnght I pushed the change. When using Spring Boot, you can customize Lettuce via:

@Configuration
class LettuceCustomizer implements LettuceClientConfigurationBuilderCustomizer {

    @Override
    public void customize(LettuceClientConfigurationBuilder clientConfigurationBuilder) {
        clientConfigurationBuilder.clientOptions(ClientOptions.builder().publishOnScheduler(true).build());
    }
}
@mp911de Any metrics on performance with the above changes?
I will not be able to share my code here, but I can confirm that my use case is running 4x faster with the above change. I'm heavily using Spring Data Redis in my project.
We've already seen metrics measured from an external-system perspective. Throughput scales with the number of utilized cores instead of being capped at single-thread (single-core) capacity. From a JMH perspective there is a notable penalty; however, users can choose whether to enable or disable publish-on-scheduler:
I'm closing this ticket as it is resolved now.
@mp911de

import static io.lettuce.core.ScriptOutputType.INTEGER;
import static java.lang.System.out;

import io.lettuce.core.ClientOptions;
import io.lettuce.core.RedisClient;
import io.lettuce.core.api.reactive.RedisReactiveCommands;

RedisClient client1 = RedisClient.create("redis://localhost");
RedisClient client2 = RedisClient.create("redis://localhost");
client1.setOptions(ClientOptions.builder().publishOnScheduler(false).build());
client2.setOptions(ClientOptions.builder().publishOnScheduler(true).build());

RedisReactiveCommands<String, String> redis1 = client1.connect().reactive();
RedisReactiveCommands<String, String> redis2 = client2.connect().reactive();

// Count eval calls that complete without emitting a value.
int counter1 = 0;
int counter2 = 0;
for (int i = 0; i < 1000; i++) {
    if (redis1.eval("return 1", INTEGER).next().block() == null) counter1++;
    if (redis2.eval("return 1", INTEGER).next().block() == null) counter2++;
}
out.println(counter1);
out.println(counter2);

Basically, the two counters differ: with publishOnScheduler(true), some eval results come back as null. Also, it would be great to be able to specify our own executor instead of Lettuce's.
@trueinsider Can you please file a ticket for the bug report? Regarding the Executor: you can do so by providing an EventExecutorGroup through ClientResources.
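For reference, wiring in an application-provided executor could look roughly like the sketch below (assuming DefaultClientResources.Builder#eventExecutorGroup(...) and a Netty DefaultEventExecutorGroup; the thread count is illustrative):

import io.lettuce.core.RedisClient;
import io.lettuce.core.resource.ClientResources;
import io.lettuce.core.resource.DefaultClientResources;
import io.netty.util.concurrent.DefaultEventExecutorGroup;
import io.netty.util.concurrent.EventExecutorGroup;

// Application-owned executor group; Lettuce uses it for computation/emission,
// but the application remains responsible for shutting it down.
EventExecutorGroup executors = new DefaultEventExecutorGroup(8);

ClientResources resources = DefaultClientResources.builder()
        .eventExecutorGroup(executors)
        .build();

RedisClient client = RedisClient.create(resources, "redis://localhost");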
Motivated by https://stackoverflow.com/questions/53001283/reactive-redis-lettuce-always-publishing-to-single-thread, we should investigate (benchmark) thread switching for response processing.
In a reactive/async scenario where responses are processed on an I/O thread, congestion may occur. We should create various scenarios to measure the impact and determine how to address it.