Consider an asynchronous system in which requests are sent from an external queue. The immediate response to a request is only an ack that it was received, not a sign that it has been handled. The server sends the actual reply after an arbitrary delay that depends on the load on the system. When response times are long, say on the order of minutes, keeping every outstanding request in memory to measure its latency defeats the purpose of an async system.
Is it possible to find the concurrency limit of such a system? I assume we know the request rate at which we send requests and the response rate at which the system finishes them (both can be measured cheaply, and neither needs any request/response association). In a steady state, the request rate and the response rate match. We can increase the request rate until the response rate stagnates.
Is there anything we can apply from TCP congestion control (or some other technique) to fine-tune the request rate, slowly increasing it toward maximum capacity and then continuing to adjust it based on the response rate?
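To make "measured cheaply" concrete, here is a minimal sketch of a per-window counter that needs no request/response association. The class name `WindowRate` and its `record`/`roll` methods are hypothetical names chosen for illustration, and it assumes a single-threaded caller; one instance would count sent requests and another would count received replies.

```python
import time


class WindowRate:
    """Counts events in the current window and reports events per second."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the window length can be controlled in tests.
        self._clock = clock
        self._count = 0
        self._window_start = clock()

    def record(self, n=1):
        """Register n events (requests sent or replies received)."""
        self._count += n

    def roll(self):
        """Close the current window, return its rate, and start a new window."""
        now = self._clock()
        elapsed = now - self._window_start
        rate = self._count / elapsed if elapsed > 0 else 0.0
        self._count = 0
        self._window_start = now
        return rate
```

Memory use is constant regardless of how long replies take, because only two integers and two timestamps are kept.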
The basic idea is, in a given window:

    if request_rate <= response_rate:
        // (request_rate can be lower, since responses are for requests from a previous window)
        request_rate = request_rate * (1 + x)
    else:
        request_rate = request_rate * (1 - y)
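The rule above could be written as a small runnable function. This is only a sketch under stated assumptions: the function name `next_request_rate`, the default values of `x` and `y`, and the `min_rate`/`max_rate` clamps are all illustrative additions, not tuned or recommended values.

```python
def next_request_rate(request_rate, response_rate, x=0.05, y=0.2,
                      min_rate=1.0, max_rate=float("inf")):
    """Return the request rate to use in the next window.

    x and y are the fractional increase and decrease steps. The defaults
    are placeholder assumptions, not measured values.
    """
    if request_rate <= response_rate:
        # The system kept up (or is draining older requests): probe upward.
        proposed = request_rate * (1 + x)
    else:
        # Replies are falling behind the send rate: back off.
        proposed = request_rate * (1 - y)
    # Clamp so that repeated back-offs cannot drive the rate to zero
    # (an assumption added here; the original rule has no bounds).
    return max(min_rate, min(max_rate, proposed))
```

For example, `next_request_rate(100, 80, x=0.1, y=0.5)` returns `50.0`: the reply rate fell below the send rate, so the rate is halved.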
Questions:
Does this make sense, or is there another way to model or think about limiting?
How are x and y typically measured?