For some reason, the parallel Scrabble benchmark performs poorly when the parallelism level is 10+, for example, on my i7 8700 CPU (6 cores/12 threads):
However, my older i7 4770K processor (4 cores/8 threads) shows no such performance degradation. Neither does the reactive-streams-commons implementation (the parent of RxJava's parallel implementation) with parallelism=12.
Correction: The Rsc benchmark was pinned to 8 threads and actually shows a similar inefficiency with 10+.
I did a different implementation but the degradation isn't gone, just reduced:
With the new code organization, the performance is slightly worse at P=1 and P=6 and somewhat better at higher Ps. The others are likely within the noise limit.
I'm starting to think the underlying issue is that one thread simply can't drive that many rails that fast, so the round-robin dispatching results in a high volume of scheduling activity (also hinted at by Java Flight Recorder).
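To make the round-robin pattern concrete, here is a minimal sketch of how a single dispatcher would spread items over the rails. It is an illustration only, not RxJava's actual internals; the class and method names (`RoundRobinDispatchSketch`, `railFor`, `countRailSwitches`) are hypothetical. It shows that with more than one rail, every consecutive item lands on a different rail, so the dispatcher may have to signal a different worker for each item.

```java
// Illustrative model only: hypothetical names, not RxJava's implementation.
final class RoundRobinDispatchSketch {

    // Rail that receives the item at the given 0-based index.
    static int railFor(long itemIndex, int parallelism) {
        return (int) (itemIndex % parallelism);
    }

    // Counts how often the dispatcher moves on to a different rail
    // than the one it fed with the previous item.
    static long countRailSwitches(long itemCount, int parallelism) {
        long switches = 0;
        int previous = -1;
        for (long i = 0; i < itemCount; i++) {
            int rail = railFor(i, parallelism);
            if (previous != -1 && rail != previous) {
                switches++;
            }
            previous = rail;
        }
        return switches;
    }

    public static void main(String[] args) {
        // Assumed values, chosen to mirror the 12-thread case discussed above.
        int parallelism = 12;
        long items = 1_000;
        System.out.println("rail switches: " + countRailSwitches(items, parallelism));
    }
}
```

In this model the number of rail switches grows linearly with the item count, which is consistent with the scheduling activity described above, although the sketch does not measure any real scheduling cost.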
If I implement batch-dispatching, the scheduling overhead appears to be mostly eliminated:
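The thread does not show how the batching was implemented, so the following is only a sketch of the general idea under stated assumptions: the dispatcher hands a run of consecutive items to one rail before moving to the next. The names (`BatchDispatchSketch`, `railFor`, `countHandoffs`) and the batch size of 32 are hypothetical and chosen purely for illustration.

```java
// Illustrative model only: hypothetical names and batch size, not RxJava's implementation.
final class BatchDispatchSketch {

    // Rail that receives the item at the given 0-based index when
    // consecutive items are handed out in batches of batchSize.
    static int railFor(long itemIndex, int parallelism, int batchSize) {
        return (int) ((itemIndex / batchSize) % parallelism);
    }

    // One handoff per batch; the last batch may be partial.
    static long countHandoffs(long itemCount, int batchSize) {
        return (itemCount + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        long items = 1_000;
        int batchSize = 32; // assumed value, not taken from the benchmark
        System.out.println("handoffs with batching:    " + countHandoffs(items, batchSize));
        System.out.println("handoffs without batching: " + countHandoffs(items, 1));
    }
}
```

In this model the number of handoffs falls by roughly the batch size, which would fit the observation that batching removes most of the scheduling overhead. The trade-off is coarser load balancing, since a slow rail now holds up a whole batch rather than a single item.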
akarnokd changed the title from "3.x: parallel and/or p-reduce performs poorly with 10+ parallelism" to "3.x: parallel performs poorly with 10+ parallelism" on Mar 11, 2020
You have to consider a lot of aspects when making parallel calls.
If one request makes 10 parallel calls and your server supports only 12 threads, what happens to a second request? It will have to wait for the first request to release its threads.
You also have to check whether all 12 threads are actually allocated to your program.