-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kv: dist sender sends too many RPCs #34999
Comments
@ajwerner might also be interested in this, as it has to do with admission control. We shouldn't make a ton of concurrent RPCs to a remote node when we don't know how loaded that node is. The remote node should have much more control of what it does with a set of incoming requests. |
This seems like a valuable and straightforward change. The distsender already has all the information it might need to split up the batch like this. I'd be happy to take a stab at this. Do you have good workloads in mind that hit this limit? |
All workloads that use index or lookup joins on tables with reasonable numbers of ranges will hit thi slimit. |
After some reflection it's not clear that the concurrency has a huge overhead. The big issue is that hitting the concurrency limit is extremely expensive from a latency perspective (especially in a geo-distributed setting 😱. Sure we're trusting GRPC to be efficient at dealing with lots of requests and we're spawning lots of goroutines on the sender but it's not clear the cost of that in practice. Have you tried bumping |
I'm still going to try to type up the experiment but I worry it's going to have some unintended consequences and is going to further muddle some already pretty gnarly and complex code |
We have marked this issue as stale because it has been inactive for |
The dist sender sends a batch to many ranges at once by chunking the batch up into one per range, and making a potentially asynchronous RPC (or local call, if the range lives on the local node) to each range.
Consider a case where there are 10 ranges per node, and a batch contains requests for all 10 ranges for a remote node. Then, the dist sender will make 10 asynchronous network RPCs all to the same node.
This seems inefficient. If the dist sender instead chunked up a batch by node, instead of by range, and sent a single RPC per node, then we'd have way fewer in-flight RPCs. The receiving node could then be in charge of further concurrency if it chooses. At the moment, the dist sender is the arbiter of concurrency - even for remote nodes, whose load it knows nothing about!
I'm assigning @nvanbenschoten for initial thoughts.
The text was updated successfully, but these errors were encountered: