-
Notifications
You must be signed in to change notification settings - Fork 499
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
services/horizon: map client disconnects as Status 499 instead of 503 Problem #4098
services/horizon: map client disconnects as Status 499 instead of 503 Problem #4098
Conversation
PR. Removed from "Horizon and SDKs" because there's a corresponding issue there (and this PR is linked). |
…basd on pr review feedback
…lient_cancel_request
…ed unneeded test harness changes
…o fix the reported RACE condition in test
Hello @bartekn , I updated PR per feedback, should be ready for re-review, thanks! |
@@ -55,7 +55,7 @@ func init() { | |||
problem.RegisterError(db2.ErrInvalidOrder, problem.BadRequest) | |||
problem.RegisterError(sse.ErrRateLimited, hProblem.RateLimitExceeded) | |||
problem.RegisterError(context.DeadlineExceeded, hProblem.Timeout) | |||
problem.RegisterError(context.Canceled, hProblem.ServiceUnavailable) | |||
problem.RegisterError(context.Canceled, hProblem.ClientDisconnected) | |||
problem.RegisterError(db.ErrCancelled, hProblem.ServiceUnavailable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think (we need to confirm this) that when context is cancelled while DB query is being executed it's returned as db.ErrCancelled
. Could you check it and, if that's correct, can you also update the returned problem for db.ErrCancelled
in line 59? It's also worth testing if the same error is returned in case of timeout. If that's the case then it will be too complicated to change so let's not add any changes to this PR then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, yes, it looks like db.ErrCancelled is overloaded, i.e. it could represent upstream cancel or a downstream timeout, I can make adjust in session.go to break it out in same fashion done here with check on ctx.Err() and will need a new test for cancel vs. timeout. I found this one article pretty helpful is understanding the postgres timeout vs. cancel context nuances:
https://www.alexedwards.net/blog/how-to-manage-database-timeouts-and-cancellations-in-go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @bartekn , here's included distinction of the db conn timeout vs cancel: 9a808356bff4cab47
…de db layer distinction of timeout vs cancel
…lient_cancel_request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
…lient_cancel_request
… for metrics reporting (stellar#4098)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Desculpe _me
Desculpe... |
PR Checklist
PR Structure
otherwise).
services/friendbot
, orall
ordoc
if the changes are broad or impact manypackages.
Thoroughness
.md
files, etc... affected by this change). Take a look in the
docs
folder for a given service,like this one.
Release planning
needed with deprecations, added features, breaking changes, and DB schema changes.
semver, or if it's mainly a patch change. The PR is targeted at the next
release branch if it's not a patch change.
What
Client Disconnects were being mapped by render/problem into ServiceUnavailable(503).
Why
The problem was causing
horizon_http_requests_duration_seconds_count
metric keys to be generated with status=503 label which wasn't accurate, there was no server timeout on the request, rather just the opposite, the client disconnected from the socket.Closes #3710
Known limitations
Client socket disconnect is only detectable if load balancer/proxy being used does immediate close of upstream client connections or no LB and the client is connected directly to server. Detection of client socket disconnect relies on the golang http server detecting the socket close and in turn closing the Done channel on the http request context.