Skipper produces 500 "Internal Server Error" HTTP status code on EOF #269

Closed
hjacobs opened this issue Feb 6, 2017 · 10 comments
Labels
bugfix Bug fixes and patches

Comments

@hjacobs
Contributor

hjacobs commented Feb 6, 2017

We observe sporadic failures when Skipper connects to a Kubernetes ClusterIP service. Relevant Skipper log lines:

500:

02:26:40.814 log='[APP]time="2017-02-05T01:26:39Z" level=error msg=EOF'
02:26:40.814 log='52.59.216.130 - - [05/Feb/2017:01:26:39 +0000] "GET / HTTP/1.1" 500 22 "" "zmon-worker/v130" 2 hjacobs-test.example.org'

500:

20:10:14.338 log='[APP]time="2017-02-05T19:10:10Z" level=error msg=EOF'
20:10:14.338 log='52.59.216.130 - - [05/Feb/2017:19:10:10 +0000] "GET / HTTP/1.1" 500 22 "" "zmon-worker/v130" 14 hjacobs-test.example.org'

Expected behavior: Skipper should never produce "Internal Server Error", but instead handle the problem and report an appropriate status code (maybe 503?).

hjacobs added the bugfix label (Bug fixes and patches) Feb 6, 2017
@aryszka
Contributor

aryszka commented Feb 6, 2017

The msg=EOF is weird. At first sight I'm not sure what its source is; it has been spotted in other deployments, too. Deploying this could help: https://github.com/zalando/skipper/pull/266/files . Can I get a +1 there if the PR is ok?

@aryszka
Contributor

aryszka commented Feb 6, 2017

@aryszka
Contributor

aryszka commented Feb 7, 2017

@aryszka
Contributor

aryszka commented Feb 7, 2017

Considering the behavior of the code linked above, the issue may be that the connection close on the server doesn't take effect immediately, even if the connection is reported as closed. In that case the 'write: broken pipe' error makes sense, and the only weird thing is when it reports EOF.

So reporting '503 Service Unavailable' seems legit. Or maybe '502 Bad Gateway'?

@szuecs
Member

szuecs commented Feb 7, 2017

I would suggest 502 or 504: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes#5xx_Server_Error

503 would mean that skipper itself is unavailable.

@aryszka
Contributor

aryszka commented Feb 7, 2017

For the reported EOF, the related issue in Go net/http is this: golang/go#13667 . It was reintroduced by this: golang/go#16465

Interesting can be also: https://golang.org/src/net/http/response.go#L150

@szuecs
Member

szuecs commented Feb 8, 2017

> Interesting can be also: https://golang.org/src/net/http/response.go#L150

This is indeed interesting, because it shows that net/http doesn't expect to get a close() without any write() before it.

@aryszka
Contributor

aryszka commented Feb 9, 2017

The two links below may be interesting, too. The changeset is "only" one year old.

golang/go#4677
golang/go@5dd372b

What if Skipper somehow prevents retries for idempotent requests? Nginx seems to retry by default: https://github.com/kubernetes/ingress/blob/master/controllers/nginx/configuration.md#custom-nginx-upstream-checks

@Raffo
Contributor

Raffo commented Aug 3, 2017

Can we close this issue?

@szuecs
Member

szuecs commented Mar 12, 2018

We changed a lot in the error handling, so I suspect that this is not valid any more.
We also set more explicit status codes for different failures.

4 participants