What to do about no-ops #5182
Prior work: #3445 I had tried to implement a debug log when requests go unhandled, but I found it non-trivial to implement correctly. |
Ah yes!! Thanks for linking that. I was looking for that but had enough trouble finding the other issue too. Updated original post to link to more issues. |
I love Caddy and appreciate your work, but I really don't understand this logic:
This feels like you're over-thinking it. Caddy received a request for a resource. Caddy was not configured to respond to that request. We were looking for the resource requested in the request. There was no configured response. 404 is the only logical response - no configured resource was found for the request that was made. This is very different from the situation where Caddy is configured to respond to the request (via a file or reverse proxy) and that results in an empty response. That would (pending other config) be an appropriate time to respond with a 200 and an empty response body. The current behavior falsely indicates that Caddy was explicitly configured to respond to the request with an empty response - that is, that it successfully found an empty resource, rather than not finding any resource and sending an empty resource by default. |
Looking at it from another perspective that you discuss in the wiki post:
This fundamentally conflates the server - Caddy - with the resource (which, in this case, Caddy doesn't have enough information to find). A web server's contract is to serve resources. "404 Not Found" means the server was unable to find (or supply) the requested resource. From the perspective of the server - Caddy - 404 is not an error; it is an indication that the server successfully understood and processed the request, but did not find anything for the resource requested. Not being able to understand (501) or process (422) the request would be a different state represented by a different status. The server not being configured to do anything with a request is the literal definition of a 404. |
Hey @amadsen Thanks for the comments!
Maybe; or maybe not. Not all requests are for resources. Most non-GET methods are not for resources, for example.
Yes, so this whole argument makes sense from an application layer perspective. However, as a plain HTTP server, Caddy can't make assumptions about application semantics.
Ok, I see the confusion here. Again, what you're saying makes sense from an application perspective. Actually, a web server's contract is to connect HTTP with an application (some sort of handler that does something with a request). The file server serves files, the reverse proxy gets a response from a backend, even a simple "static_response" handler writes a hard-coded response, etc. If there is no application configured to handle an HTTP request, the default response is 200 OK, meaning that "HTTP is working, yes" -- there's just no application value to the response, so it's empty. I know this is different from what you're used to.
404 means "that the origin server did not find a current representation for the target resource or is not willing to disclose that one exists." The linked section about target resources talks about application logic -- not something that the bare-bones HTTP server can do. 200 means, "I got an HTTP request and handled it according to my configuration correctly," and an empty 200 probably means, "I was not configured to do anything, so here's nothing. I have no application logic." Writing application logic into a vanilla HTTP server would be a mistake.
Remember, it wasn't even looking for a resource -- it just did the HTTP successfully. Think of it kind of like a 0 value. It's not null (because the server IS working), but it's not non-zero (it wasn't configured to do anything). It's just the default value. |
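The "0 value" described above can be illustrated outside of Caddy. The following is a minimal sketch using Go's standard `net/http` package, not Caddy's own code: a handler that never sets a status and never writes a body still yields a 200 response with an empty body. The names `noop` and `probe` are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// noop is a hypothetical handler that does nothing at all: it never
// calls WriteHeader and never writes a body.
var noop = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {})

// probe sends one in-memory request to h and reports the status code
// and body a client would see.
func probe(h http.Handler, method, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	res := rec.Result()
	defer res.Body.Close()
	body, _ := io.ReadAll(res.Body)
	return res.StatusCode, string(body)
}

func main() {
	code, body := probe(noop, http.MethodGet, "/anything")
	fmt.Printf("status=%d body=%q\n", code, body)
}
```

In this sketch the 200 is not chosen by the handler; it is simply what the HTTP layer reports when nothing above it expressed an opinion, which is the situation being debated in this thread.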
HTTP relies on URIs - Uniform Resource Identifiers - to indicate a requested resource, whether that resource is a file or an application behavior. Methods act on those URIs. A web server's job is to connect the request with the underlying resource. It is, by definition, a middleware. The underlying resource may not even speak HTTP - such as a file system, file server, or [Fast]CGI application - and should not be relied upon to provide HTTP semantics directly; that is the job of the web server.
It is impossible to have an HTTP request that doesn't refer to a resource (URI). HTTP - the protocol - may have been successfully "done", but the resource was not found (because it wasn't defined) so the request was not successful - it was not found.
The migration of web server (HTTP) semantics directly into applications ("services") is a relatively modern phenomenon. I don't have a problem with it, but it contributes to a lot of confusion that can occur because of equivocation in the term "application". According to the OSI network model, all of HTTP operates at the "application" layer. Most "web applications" have for many years been composed of multiple layers of executables - web servers, script engines, databases, "services", etc. - which may or may not be referred to as an "application" in a given context.
Because Caddy is speaking HTTP - and may be the only executable that does so for a given request (as would be the case when lacking configuration for a requested URI) - it is entirely appropriate for it to provide an HTTP response code indicating that it couldn't find anything for the requested resource. Conversely - for comparison's sake - HAProxy in TCP mode is not speaking HTTP and therefore it would be inappropriate for it to provide an HTTP response.
I think it is useful to keep in mind that these specifications were developed in tandem with early web servers (and web browsers), with decades of opportunity to adjust both the specifications and the servers. If your interpretation of the specifications is highly discordant with the behavior of those servers, that is a strong signal that you might not be reading them as intended.
I strongly dislike the default of responding with a 200 response code and an empty response body. In my opinion it violates the principle of least surprise, both by my reading of the specifications and by precedent. I think it should be changed to a 404.
However, I highly respect the thought and work that has been put in to create Caddy (and recognize it wasn't done by me). While I hope my perspective is persuasive and useful, I respect you and encourage you and the team to implement as you see fit. |
Another way of stating my argument is the configuration is (a layer of) application logic defining the HTTP resource and a lack of configuration specifically means no resource was found. |
Also, I have run into the situation where a load balancer or other intermediate HTTP server unexpectedly responds with a 404 (or 500, or 200) and therefore violates client expectations (perhaps resulting in blown error budgets and/or difficult-to-debug situations). From that perspective, I very much appreciate Caddy's effort to make as few assumptions as possible. |
HTTP without an application is a no-op. There's nothing to do, whether back then or today. So, what is the default HTTP response? (Nothing official defines one AFAIK.) An empty response to let clients know the HTTP server is working seems most reasonable to me. I'm sorry it's confusing, but I do think other answers here are just not as "correct". I think the best solution will be better troubleshooting tools... like tracing, more helpful logs, etc.
That assumes Caddy is one monolithic, single-purpose application, when it's actually a JSON API and CLI for HTTP (and other things too, such as TLS, which is irrelevant here).
I think they were more near-sighted, personally. (I don't blame them, I don't think they understood the future like it actually is today.) Who wants a general-purpose web server that doesn't do nothing by default? If we did something by default, you'd (or other people would) be just as confused and frustrated because the server is doing something it's not configured to do. I think the fact that there is no official, clearly-defined "default" HTTP response for a "working, but unconfigured" server is a pretty good sign that the spec writers did not have the foresight of modern systems. Again, I don't blame them -- but I do still think the "0-value" server behavior is most correct.
Ah, right -- so like what I was saying above in this reply before I saw this.
Pick one 🙃 (I've had too many bad experiences with the second.) |
I should mention, after discussing with Francis in Slack, that your arguments are compelling -- I think it just comes down to "we see the Web differently." 🤷‍♂️ But they are well-reasoned, well-cited arguments that do make sense from a certain point of view. I think there are just ambiguities between theory and practice (spec and implementation), especially over long periods of time (6+ months, heh), and that's what we're running into here. I appreciate the content and manner of your discussion 👍 |
I agree that this is a "we see the web differently" situation and will gladly accept that as reasonable. I very much appreciate you taking the time to consider my perspective - in addition to the amazing work you're doing in general on Caddy. When I consider some of the annoying debugging experiences that I've had with intermediate HTTP servers, I can better see where you are coming from. I still prefer 404 as a better zero-value / default (I think it is one of 404's many intended uses - where "many" is a potential problem), but agree that either situation can be confusing. Thanks again! |
There must be something I'm missing here, because after skimming this and related issues it seems to me that the discussion is happening at the wrong abstraction level.
I don't think there can even be a "default response" because at the spec level there is no such thing as an unhandled request (defined as "I, the server, won't even look at it, but otherwise I'm working just fine"). Once you agree to speak HTTP, you must reply in some fashion to every request that follows the spec. Which hopefully shows that the discussion is really about how to configure routes (ie in the Now, the user should either have a catch-all So: either strictly forbid unhandled routes in the config (as in,
(note how it'll still go through (my 2c: keeping And in closing I must give a huge thank you to y'all for this incredible piece of software. Been using Caddy since the v1 days and I'm still amazed at how it keeps getting better (there have been a couple of occasions where I said "dang, I need feature X", went to check the release notes, and lo and behold feature X was in the latest beta! Just incredible) |
@lowne Thank you for this thoughtful reply -- I think it makes a lot of sense. And thanks for your nice comments about the project 😊 I agree we can't really require the user to configure a handler for all possible routes. Sending unhandled requests through an error chain is interesting, but the status code is up for debate, as it's unclear whether the server is misconfigured or the request was misfired. Either way, we're back to the question of what is correct. I think at this point I do recommend that, if you want a specific way of handling no-ops, you simply inscribe that into your config: It's becoming clear that this isn't a decision Caddy should make for everyone, as we all see it differently, and it's best left up to the user to decide. |
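The idea of leaving the no-op decision to the user can be sketched in plain Go. This is an analogy under stated assumptions, not Caddy configuration and not the snippet from the original comment: `route`, `withFallback`, `statusFor`, `hello`, and `site` are hypothetical names, and prefix matching stands in for whatever matchers a real config would use. The point is only that the catch-all is an explicit, user-chosen handler (here a 404) rather than a built-in default.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// route pairs a path prefix with a handler (hypothetical, for illustration).
type route struct {
	prefix  string
	handler http.Handler
}

// withFallback tries each route in order and hands the request to the
// user-supplied fallback when no route matches.
func withFallback(routes []route, fallback http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for _, rt := range routes {
			if strings.HasPrefix(r.URL.Path, rt.prefix) {
				rt.handler.ServeHTTP(w, r)
				return
			}
		}
		fallback.ServeHTTP(w, r)
	})
}

// statusFor issues an in-memory GET for path and returns the status code.
func statusFor(h http.Handler, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

// hello is a trivial configured handler.
var hello = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "hello")
})

// site answers /hello and sends everything else to an explicit 404.
var site = withFallback([]route{{prefix: "/hello", handler: hello}}, http.NotFoundHandler())

func main() {
	fmt.Println(statusFor(site, "/hello"))
	fmt.Println(statusFor(site, "/missing"))
}
```

Swapping `http.NotFoundHandler()` for a different fallback changes the no-op behavior without touching the routes, which mirrors the "let the user decide" position taken above.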
After a year and a half, the discussion consensus seems to be... that there isn't one 😅 I appreciate everyone's kindness and professionalism in discussing the matter. There are compelling arguments both ways. For now I've decided not to change any behavior or semantics. But I did decide to slightly adjust the access log message when a request reached the emptyHandler at the end of a chain (i.e. was not handled explicitly). The message will now be "NOP" instead of "handled request" for those requests. (I took a slightly different approach than Francis did, but I learned from that closed PR so I recognize the contribution there.) |
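One way such a log distinction could work is sketched below. Only the name `emptyHandler` and the two messages, "NOP" and "handled request", come from the comment above; the mechanism (a flag carried in the request context) and the names `reachedKey`, `logMessage`, and `hello` are assumptions made for illustration, and this is not Caddy's actual implementation.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// reachedKey is a hypothetical context key for the "reached the end" flag.
type reachedKey struct{}

// emptyHandler sits at the end of a chain. It writes nothing and only
// records that the request got this far without being handled.
var emptyHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	if reached, ok := r.Context().Value(reachedKey{}).(*bool); ok {
		*reached = true
	}
})

// hello is a handler that actually responds.
var hello = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "hello")
})

// logMessage runs h for one in-memory request and returns the access
// log message that would be emitted for it.
func logMessage(h http.Handler, path string) string {
	reached := false
	req := httptest.NewRequest(http.MethodGet, path, nil)
	req = req.WithContext(context.WithValue(req.Context(), reachedKey{}, &reached))
	h.ServeHTTP(httptest.NewRecorder(), req)
	if reached {
		return "NOP"
	}
	return "handled request"
}

func main() {
	fmt.Println(logMessage(hello, "/"))
	fmt.Println(logMessage(emptyHandler, "/"))
}
```

The response itself is unchanged in both cases; only the log line tells an operator whether the request fell through to the end of the chain.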
(Oops, the linked commit is only half the solution for some reason. See 399186a for the second half.) |
After spending over a week debugging what the issue was, I'm just adding the following keywords in case someone comes searching like I did: |
I just wrote a new article for our wiki: https://caddy.community/t/why-caddy-emits-empty-200-ok-responses-by-default/17634
It explains why Caddy does what it does for "no-op" requests; that is, why Caddy emits 200 OK even when it wasn't configured to do anything.
This issue is here to discuss one more time whether that behavior could be improved upon. I recently heard a use case from a contact within a company exploring Caddy that the 200 behavior was surprising and made it difficult to troubleshoot whether the request was being handled partially or not at all (i.e. what routes was it taking that it ended up as a no-op?). Misconfigured routes -- maybe matchers that don't match what is expected -- or missing handlers can cause confusion.
I'd be open to discussing a non-standard 2xx status code to make it more obvious that the server is working, but lacks configuration to invoke an application or originate content. For example, 290 NOP? I dunno. I don't love this because clients won't know what to do with it. Some clients just look at the first digit to get the gist of what happened. Others expect a specific 20x. Who knows what this would break.
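The compatibility concern can be made concrete with a small sketch. The two functions below are hypothetical stand-ins for the two kinds of clients described above, not code from any real client library: one checks only the status class, the other expects exactly 200.

```go
package main

import (
	"fmt"
	"net/http"
)

// classOK mimics a client that only looks at the first digit of the status.
func classOK(code int) bool { return code/100 == 2 }

// exactOK mimics a client that expects exactly 200 OK.
func exactOK(code int) bool { return code == http.StatusOK }

func main() {
	const nop = 290 // the non-standard status floated in this issue
	fmt.Println(classOK(nop), exactOK(nop))
}
```

Under these assumptions the first kind of client would treat a 290 as success while the second would treat it as a failure, which is the kind of breakage that is hard to predict in advance.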
For reasons stated in the wiki article above, I'm not inclined to change this behavior.
Personally, I think a better solution than changing the status code is to provide better config debugging tools.
The latter might not be too hard to get something simple working, so I'll push a branch later with my tinkering.
Feedback welcome in the meantime.
Prior work/discussion in: