-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Should we use HTTP 503 response for throttling? #26
Comments
Also, this is a slightly different reason why being throttled, but should we also mention |
I could never figure out what is the use case for 429 when 503 achieves exactly the same from client's perspective. 429 is also supposed to return exactly the same Retry-After header. The slight nuance seems to be that 503 means server is overloaded and unable to respond properly, while 429 says that it could respond but will not because the client is exceed some sort of quota. However, I can't think of anything that the Agent could be doing differently in response to 503 and 429, so I don't know what we gain by adding both to the spec. Perhaps the benefit is in helping the troubleshooting from the Agent side (the Agent can log the response code)? |
I experimented a bit with 503 responses in Go and JS. In Go Websocket client (WSGorilla) the response code is clearly accessible, so no problem here, it can be used to indicate throttling. In JS the 503 response results in an "error" event. This event is indistinguishable from a connection error when the server is down. This seems to be OK, since the protocol defines that if the connection cannot be established exponential backoff should be used, which essentially almost is what we also want to happen for 503 responses (minus the ability to specify exact retry interval). So, for JS it also seems to be OK to use this approach. We do not have another good way to convey the unavailability of the server, unless we mandate the server to accept the Websocket connection and then send a The only alternate I can think of is just get rid of 503 response altogether and say that when overloaded the server simply should reject connections. But this would reduce the troubleshootability for other language implementations where the response code is easily accessible. Given the above I think we should keep 503 response for throttling, since it does not cause problem for JS, is useful for troubleshooting compared to rejecting connection, and is less expensive than the Note that we still do need Any thoughts? |
Coming back to 429. This response is typically tied to a particular client or group of clients. It may not be possible to meaningfully calculate the rate and send a 429 response before the Agent's instance uid is known. That happens after the HTTP response headers are sent, and by then it is too late to try to send 429 response back. So, in practice it may be impossible to have per-Agent rate limiting and 429 responses tied to that. We may still be able to tie this to auth information which supposedly is available before the WebSocket is established. This auth is another open issue that we need to answer before we can be sure about this. |
Yeah, I believe that 503 is more universal and handling it would be enough. I am just wondering if the specification should be prescriptive whether 429 could be used or not. I imagine that in certain environments there might be some 429 throttling capabilities available already and having agent supporting that would be helpful, especially since the agent would behave the same way as for 503 |
So perhaps we can just say that either 429 or 503 SHOULD be used for throttling and from protocol perspective there is no difference (the Agent SHOULD be ready to handle both, the Server can choose to use one or the other or both as applicable). |
Yes, that's exactly what I had on mind. I can prepare a quick PR with that change For WebSockets, I asked @kkruk-sumo for consultation (we used to use that technology extensively in one of the previous projects), maybe he will have some suggestions here |
Please do. |
Contributes to open-telemetry#26
I think #37 resolves this. Closing, can reopen if needed. |
Unfortunately, JS needs a special treatment here and if it's needed to be supported, I would consider to:
|
@kkruk-sumo please clarify what you think will not work for JS with the spec as is. Returning 503 to JS results in a connection "error" event in the browser. The spec already defines what the client needs to do when to do when there is a connection error (exponential backoff). I think that should be sufficient for JS Agents. If you think otherwise please tell. |
@tigrannajaryan I believe that "The minimum recommended retry interval is 30 seconds." from the spec is enough for JS. |
The spec says that the server can respond with 503 to throttle the client.
However, reading the http response code may not be possible with some WebSocket clients, e.g. for browsers (it is disabled for security reasons).
Should we still support this throttling method or avoid it?
The text was updated successfully, but these errors were encountered: