Document that RedisConnection is not Thread-safe #2653

Comments
Perhaps you can explain what this code is meant to accomplish:

```java
redisTemplate.executePipelined(connection -> {
    objectives.parallelStream().forEach(obj -> {
        connection.setCommands().sAdd(rawKey, rawValue);
    });
    return null; // the pipelined callback must return null
});
```

For my explanation, I assume (and I am reasonably certain) that:
First of all, the Spring Data Redis RedisConnection is not Thread-safe.

Thread-safety is certainly not the case for Jedis. Under-the-hood, a Spring Data Redis JedisConnection wraps a single Jedis instance, which is itself not Thread-safe. For Lettuce, while the underlying, native connection is Thread-safe, as documented in Spring Data Redis's reference documentation as well as the Lettuce driver docs (see RedisClient and StatefulRedisConnection), the Spring Data Redis connection wrapper makes no such guarantee.

When a Java stream is parallelized, such as by calling parallelStream(), its elements are processed concurrently on multiple Threads drawn from the common ForkJoinPool. Therefore, a single connection shared across that stream is accessed by multiple Threads at once.
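To make the parallel-stream point concrete, here is a small, self-contained sketch (plain Java, no Redis involved; the element count of 10,000 is arbitrary) showing that `parallelStream()`-style parallel execution fans work out across the common ForkJoinPool plus, typically, the calling thread. Any state the lambda shares, such as a single connection, is touched by all of these Threads concurrently:

```java
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.stream.IntStream;

public class ParallelStreamThreadsDemo {
    public static void main(String[] args) {
        // Record the name of every Thread that processes a stream element.
        ConcurrentSkipListSet<String> threadNames = new ConcurrentSkipListSet<>();

        IntStream.range(0, 10_000).parallel().forEach(i ->
                threadNames.add(Thread.currentThread().getName()));

        // Typically prints "main" plus several ForkJoinPool.commonPool-worker-N
        // entries, matching the thread list in the test log further below.
        threadNames.forEach(System.out::println);
    }
}
```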
As a temporary workaround, you could have simply, instead, sequentially processed the elements:

```java
redisTemplate.executePipelined(connection -> {
    objectives.forEach(obj -> {
        connection.setCommands().sAdd(rawKey, rawValue);
    });
    return null; // the pipelined callback must return null
});
```

You are already benefiting from using "Redis pipelining", as documented, anyway.

Alternatively, although I am not entirely certain in this case, you may be able to restructure your code by doing:

```java
objectives.parallelStream().forEach(obj -> {
    redisTemplate.executePipelined(operations -> {
        operations.opsForSet().add(rawKey, rawValue);
        return null;
    });
});
```

In this case, you are calling executePipelined once per element, so each invocation of the (Thread-safe) RedisTemplate operates on its own connection. Also, see the Javadoc for executePipelined.
In either arrangement, I would strongly encourage the use of pooled connections. Let me know what you think. In the meantime, I will speak to the team about our Thread-safety guarantees with regard to Spring Data Redis connections (i.e. instances of RedisConnection).
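For reference, a minimal sketch of enabling connection pooling with Lettuce in Spring Data Redis (the class and builder names are the actual Spring Data Redis and commons-pool2 APIs; the host, port, and pool size are placeholder values for illustration):

```java
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisStandaloneConfiguration;
import org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory;
import org.springframework.data.redis.connection.lettuce.LettucePoolingClientConfiguration;

@Configuration
class RedisPoolingConfig {

    @Bean
    LettuceConnectionFactory connectionFactory() {
        GenericObjectPoolConfig<?> poolConfig = new GenericObjectPoolConfig<>();
        poolConfig.setMaxTotal(16); // pool size chosen for illustration only

        // Pooling variant of the Lettuce client configuration.
        LettucePoolingClientConfiguration clientConfig =
                LettucePoolingClientConfiguration.builder()
                        .poolConfig(poolConfig)
                        .build();

        return new LettuceConnectionFactory(
                new RedisStandaloneConfiguration("localhost", 6379), clientConfig);
    }
}
```

This configuration fragment requires a running Redis server and the commons-pool2 dependency on the classpath.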
Also be aware that the Spring Data Redis version you are using is no longer maintained. You should minimally upgrade to a Spring Data Redis version that is still supported, and I would encourage you to upgrade to the latest available release.
NOTE: I edited my original, first comment above.
@zhaozhiguang - Following up to my previous comments, I have additional feedback for you. First, I wrote integration tests to reproduce and analyze the problem. A few things should stand out here:
The output should look similar to (conveniently formatted for this comment):

```
2023-08-01T14:42:20.145-07:00 DEBUG 57021 --- [ main] .s.d.r.c.t.AbstractRedisIntegrationTests :
PIPELINE-STREAM THREADS [[
  ForkJoinPool.commonPool-worker-1,
  ForkJoinPool.commonPool-worker-10,
  ForkJoinPool.commonPool-worker-11,
  ForkJoinPool.commonPool-worker-2,
  ForkJoinPool.commonPool-worker-3,
  ForkJoinPool.commonPool-worker-4,
  ForkJoinPool.commonPool-worker-5,
  ForkJoinPool.commonPool-worker-6,
  ForkJoinPool.commonPool-worker-7,
  ForkJoinPool.commonPool-worker-8,
  ForkJoinPool.commonPool-worker-9,
  main
]]
```

You may be surprised to learn that the "sequential" execution outperformed the "parallel" execution in my testing. Of course, this is not unexpected. Parallelization (or premature optimization) does not always lead to performance improvements. It depends on the environment, and measurements are always required to optimize correctly, if at all.

I'd say, in your case, with only a 300K-element data set, optimizing through the use of parallelization won't make much of a difference. It will also most definitely vary by environment. And most of the time is likely to be eaten up by network activity anyway, which can vary from moment to moment depending on the amount of network traffic.

In my test run, although local, the difference between sequential and parallel execution on 500,000 elements was not significant, with sequential execution running slightly faster even though it was run first (possibly without the benefit of JIT compilation, and optimizations in general):

Using Parallel Stream: 10,918 ms

Of course, (JVM) optimizations can vary between run invocations as well, thereby affecting measurements. But, on average, there was not a significant, noticeable advantage from one approach to the other.
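For anyone repeating this comparison, here is a bare-bones way to take such measurements (plain Java, with a trivial stand-in workload rather than Redis calls; serious measurements should use a proper harness such as JMH to account for JIT warm-up):

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.IntStream;

public class SequentialVsParallelTiming {

    // Time a task in milliseconds using a monotonic clock.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int n = 500_000; // same element count as the test run described above
        LongAdder counter = new LongAdder(); // Thread-safe accumulator, unlike ArrayList

        long sequential = timeMillis(() ->
                IntStream.range(0, n).forEach(i -> counter.increment()));
        long parallel = timeMillis(() ->
                IntStream.range(0, n).parallel().forEach(i -> counter.increment()));

        System.out.println("Sequential: " + sequential + " ms, Parallel: " + parallel + " ms");
        // With a workload this cheap, parallel is often NOT faster: the cost of
        // splitting and joining the work can dominate, mirroring the observation above.
    }
}
```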
Finally, I'd also add that you should NOT use the alternative approach I suggested, which I also captured in a test case. This performed very poorly, and when you trace down (into Spring Data Redis), you realize that you are not really using Redis pipelined connections effectively, since each invocation of executePipelined pipelines only a single command over its own connection.
Yes, I have also tested both concurrent and serial modes; serial performance is better. I read on Lettuce's official website that its connections are Thread-safe, so I assumed that Spring Data Redis was also Thread-safe.
Not all classes and components in Spring Data Redis are Thread-safe, as I have discussed above. If Spring Data Redis classes are not explicitly documented as Thread-safe, then it is not safe to assume they are, even if certain underlying classes (e.g. Lettuce driver/library classes) used by Spring Data Redis classes are Thread-safe. If a class or component is not explicitly documented as Thread-safe, then you must assume it is NOT Thread-safe, in fact.

For instance, clearly the connection classes make no such guarantee. While RedisTemplate is documented as Thread-safe, that holds only so long as the template is fully configured before it is shared across Threads. If the template configuration is changed after it is acquired from (or injected by) the Spring container, then it is no longer Thread-safe, since it carries non-volatile references to other objects, like its serializers.
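The "fully configure before sharing" rule can be illustrated with a small, self-contained sketch (the TemplateConfig class below is hypothetical, standing in for any template-like object with non-volatile configuration fields):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a template with mutable, non-volatile configuration.
class TemplateConfig {
    private String keySerializer; // not volatile, like many framework config fields
    void setKeySerializer(String serializer) { this.keySerializer = serializer; }
    String getKeySerializer() { return keySerializer; }
}

public class SafePublicationDemo {
    public static void main(String[] args) throws Exception {
        TemplateConfig config = new TemplateConfig();
        // Finish ALL configuration BEFORE handing the object to other Threads.
        config.setKeySerializer("StringRedisSerializer");

        // Submitting tasks establishes a happens-before edge, so the workers are
        // guaranteed to observe the fully configured state. Mutating `config`
        // after this point would reintroduce a data race.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<String>> results = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            results.add(pool.submit(config::getKeySerializer));
        }
        for (Future<String> result : results) {
            System.out.println(result.get()); // every worker sees "StringRedisSerializer"
        }
        pool.shutdown();
    }
}
```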
In summary, the "Thread-safety" guarantee of RedisTemplate is conditional on its configuration remaining unchanged once it is in use. The same rules apply to other classes and components inside the Spring Data Redis framework.
Two things, in general: there are many factors in determining Thread-safety guarantees and what constitutes a "Thread-safe" class. Having said this, I really believe this issue boils down to a documentation change for Spring Data Redis, more than anything else.
As of right now, we are currently undecided whether changing the Thread-safety behavior of the connection classes themselves (as opposed to documenting the current behavior) is the right course of action. CC'ing/Pinging @christophstrobl & @mp911de for further comments.
… Thread-safety guarantees. Closes spring-projects#2653
RedisConnection is not Thread-safe
Current Behavior
I use parallelStream for concurrent operations over a data set of about 300k elements. After execution, each result should be added to the pipeline's result list. However, this list is created as a plain ArrayList during initialization, so under concurrent access some results are not added to it. A NullPointerException is then thrown when the list is traversed in closePipeline().
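The lost results are consistent with a plain ArrayList being mutated from multiple threads. A self-contained sketch of that failure mode follows (the element count is arbitrary; the unsynchronized run may lose elements, leave null slots, or even throw, while the synchronized run is always complete):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.IntStream;

public class ArrayListRaceDemo {
    public static void main(String[] args) {
        int n = 100_000;

        // Unsafe: ArrayList.add() is not atomic; concurrent adds can lose elements
        // or leave null gaps, like the pipeline's internal result list here.
        List<Integer> unsafe = new ArrayList<>();
        try {
            IntStream.range(0, n).parallel().forEach(unsafe::add);
        } catch (RuntimeException e) {
            // An ArrayIndexOutOfBoundsException is also a possible outcome of the race.
        }

        // Safe alternative: serialize access to the list.
        List<Integer> safe = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, n).parallel().forEach(safe::add);

        System.out.println("unsafe size: " + unsafe.size() + " (expected " + n + ")");
        System.out.println("safe size:   " + safe.size());
    }
}
```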
Stack trace
Input Code
Environment