Commit
stub: optimize ThreadlessExecutor used for blocking calls
The `ThreadlessExecutor` currently used for blocking calls uses `LinkedBlockingQueue` which is relatively heavy both in terms of allocations and synchronization overhead (e.g. when compared to `ConcurrentLinkedQueue`). It accounts for ~10% of allocations and ~5% of allocated bytes per-call in the `TransportBenchmark` when using in-process transport with [stats and tracing disabled](#5510).

Changing to use a `ConcurrentLinkedQueue` results in a ~5% speedup of that benchmark.

Before:
```
Benchmark                         (direct)  (transport)  Mode  Cnt      Score     Error  Units
TransportBenchmark.unaryCall1024      true    INPROCESS  avgt   60   1877.339 ±  46.309  ns/op
TransportBenchmark.unaryCall1024     false    INPROCESS  avgt   60  12680.525 ± 208.684  ns/op
```

After:
```
Benchmark                         (direct)  (transport)  Mode  Cnt      Score     Error  Units
TransportBenchmark.unaryCall1024      true    INPROCESS  avgt   60   1779.188 ±  36.769  ns/op
TransportBenchmark.unaryCall1024     false    INPROCESS  avgt   60  12532.470 ±  238.271  ns/op
```
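For readers unfamiliar with the pattern, here is a minimal sketch of how a threadless executor could be built on `ConcurrentLinkedQueue`. It is not the code from this commit: the class name `SketchThreadlessExecutor`, the `waiter` field, and the use of `LockSupport.park`/`unpark` for wakeups are illustrative assumptions. It only shows that a non-blocking queue plus an explicit park/unpark handshake can stand in for the blocking `take()` of `LinkedBlockingQueue`.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.locks.LockSupport;

/** Hypothetical sketch; names and structure are assumptions, not the grpc-java class. */
final class SketchThreadlessExecutor implements Executor {
    // Lock-free queue: no per-operation lock acquisition on add/poll.
    private final ConcurrentLinkedQueue<Runnable> queue = new ConcurrentLinkedQueue<>();
    // The thread currently blocked in waitAndDrain(), or null if none.
    private volatile Thread waiter;

    /** Blocks until at least one task is queued, then runs every queued task on the caller's thread. */
    public void waitAndDrain() throws InterruptedException {
        if (Thread.interrupted()) {
            throw new InterruptedException();
        }
        Runnable task = queue.poll();
        if (task == null) {
            // Publish ourselves before re-checking the queue so a concurrent
            // execute() either sees the waiter or we see its task.
            waiter = Thread.currentThread();
            try {
                while ((task = queue.poll()) == null) {
                    LockSupport.park(this); // may wake spuriously, hence the loop
                    if (Thread.interrupted()) {
                        throw new InterruptedException();
                    }
                }
            } finally {
                waiter = null;
            }
        }
        do {
            task.run();
        } while ((task = queue.poll()) != null);
    }

    @Override
    public void execute(Runnable task) {
        queue.add(task);
        // unpark(null) is a no-op, so this is safe when nobody is waiting.
        LockSupport.unpark(waiter);
    }
}
```

In this sketch the blocking caller would invoke `waitAndDrain()` repeatedly until its call completes, while callbacks from other threads are handed over through `execute()`. The ordering in `waitAndDrain()` (set `waiter`, then poll again) is what prevents a lost wakeup.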