-
Notifications
You must be signed in to change notification settings - Fork 528
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 5449: Ignore SP and HTAB chars after chunk-size #1914
Conversation
Previously (before squid-cache#1498 change), squid was accepting TE-chunked replies with whitespaces after chunk-size and missing chunk-ext data. After It turned out that replies with such whitespace chars are pretty common and other webservers which can act as forward proxies (e.g. nginx, httpd...) are accepting them. This change will allow to proxy chunked responses from origin server, which had whitespaces inbetween chunk-size and CRLF.
Can one of the admins verify this patch? |
Bug related to this MR - https://bugs.squid-cache.org/show_bug.cgi?id=5449 |
The only thing I'm not sure about this change is, if it is safe to expect that there can be unlimited amount of whitespaces. nginx skips only one '\t' or ' ' and apache httpd allow only 10 occurrences. nginx: https://github.com/nginx/nginx/blob/stable-1.26/src/http/ngx_http_parse.c#L2242C1-L2245C44 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I adjusted PR title to refer to the corresponding bug report. I adjusted PR description (i.e. future official commit message body) primarily to fix formatting and to add an XXX that we should try to resolve:
PR description: Replies with such whitespace chars are pretty common (XXX: missing evidence).
@uhliarik, would it be possible to support the above assertion? The problem is clearly not very "common" because it took a year for it to be reported despite the problematic code being present in v5+ releases. The bug report cites what looks like some one-off CGI script, but I suspect you know of some more important/popular use cases that need Squid to be lenient here. Needless to say, Squid should not accept malformed message framing in attack-prone area just to preserve some obscure CGI script bugs or mimic questionable behavior by other proxies...
FWIW, if sufficient evidence of "pretty common" use is provided, I would be inclined to accept a fixed version (see specific change requests) of this PR.
@@ -125,6 +125,7 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok) | |||
// Code becomes much simpler when incremental parsing functions throw on | |||
// bad or insufficient input, like in the code below. TODO: Expand up. | |||
try { | |||
tok.skipAll(CharacterSet::WSP); // Some servers send SP/TAB after chunk-size | |||
parseChunkExtensions(tok); // a possibly empty chunk-ext list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposed change results in two sequential ParseBws() (or equivalent) calls because parseChunkExtensions() (correctly!) starts with ParseBws() as well. This problem alone is not a showstopper IMO, but we should keep this added complexity and waste of CPU cycles in mind when deciding whether to merge this PR.
It is probably possible to refector code to remove this waste, and it would be fair to require this PR to perform that refactoring, especially if it does not require heroic efforts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, definitelly it would be better to perform only one ParseBws(tok);
call. However as I mentioned above the ParseBws(tok);
call was not working and it caused triggering exception on tok.skipRequired("CRLF after [chunk-ext]", Http1::CrLf());
line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new ParseBws()
logic (unintentionally?) adds a new difference from other servers ... support for sequences of multiple WSP characters. It would be best not to have that logic separated away from the case documentation. see my other comment which should resolve all this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new
ParseBws()
logic ...
This PR does not change ParseBws() logic.
The new .... logic unintentionally?) adds a new difference from other servers ... support for sequences of multiple WSP characters.
That is not really new logic: Squid already allowed multiple WSP characters after chunk-size before the regression (that this PR is fixing) was introduced. Please see PR description for details.
FWIW, AFAICT, httpd stated reason for accepting them is invalid -- they cite RFC errata that does not allow these chars when there is no chunk-ext! That errata resulted in the corresponding changes to subsequent RFCs, but those changes still do not allow |
Yes, I'm aware of that :(. When I was reading that comment and RFC Errata (And I actually saw that the change was proposed by you), I realized that this is related to chunk-ext and not to removing extra white chars. Obviously the comment does not describe the logic of the code, anyway it does not change anything on the fact that httpd skips those whitespaces. But you are right the comment is misleading. |
It is safe to reject any amount of prohibited whitespace. Everything else is unsafe. This PR is trading safety for compatibility. If we allow any space at all, then I would err on the side of code simplicity/reuse -- reject whatever we can (correctly) reject using existing code. I posted a specific suggestion that was meant to do that. That suggestion rejects unlimited amount of spaces (not because it is safe but because we want to reuse code that correctly rejects unwanted spaces). |
I came up with some other changes after our discussion. I added one argument to I also tried to optimize number of ParseBWS calls. In the new commit, Feel free to refactor it or push any other changes. |
Please do not forget to push your changes. I will wait for them to show up before making mine. Thank you. |
Depending on new wsp_only argument in ParseBws() it will be decided which set of whitespaces characters will be parsed. If wsp_only is set to true, only SP and HTAB chars will be parsed. Also optimized number of ParseBws calls.
I pushed everything what I have. This is the latest version which I tested and it worked for me. Feel free to do any changes you want. Thanks a lot for a review Alex. |
Both bugs deal with "chunk-size SP+ CRLF" use cases. Bug 4492 had _two_ spaces after chunk-size, which answers one of the PR review questions: Should we skip just one space? No, we should not. The lines moved around in many commits, but I believe this regression was introduced in commit 951013d because that commit stopped consuming partially parsed chunk-ext sequences. That consumption was wrong, but it had a positive side effect -- fixing Bug 4492...
... to draw code reader attention when something unusual is going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@uhliarik, thanks, in part, to your rewrite, I have realized that we are not dealing with a new bug here. We are dealing with a regression of bug 4492! This fact addresses PR description XXX (now gone), and makes it a lot easier to accept these changes. I also polished PR code a little (commit f837f5f).
Please test my changes (as of commit f79936a). If you are happy with them, and there are no objections from others, then I plan to clear this regression-fixing PR for merging.
Jenkins, it is OK to test this PR. |
Thank you very much @uhliarik and @rousskov - I agree the comment in the httpd code is wrong here, I've updated it referencing this discussion. apache/httpd@914eef0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FTR; The removal of this tolerance was due to a specific CVE issue.
Since you have testing, if you can please check and confirm if this is still an issue with current IBM_HTTP_Server releases. Document it as only being relevant to "IBM_HTTP_Server version X or older" would be perfect to aid future removal.
@@ -125,6 +125,7 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok) | |||
// Code becomes much simpler when incremental parsing functions throw on | |||
// bad or insufficient input, like in the code below. TODO: Expand up. | |||
try { | |||
tok.skipAll(CharacterSet::WSP); // Some servers send SP/TAB after chunk-size | |||
parseChunkExtensions(tok); // a possibly empty chunk-ext list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new ParseBws()
logic (unintentionally?) adds a new difference from other servers ... support for sequences of multiple WSP characters. It would be best not to have that logic separated away from the case documentation. see my other comment which should resolve all this.
@@ -125,6 +125,10 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok) | |||
// Code becomes much simpler when incremental parsing functions throw on | |||
// bad or insufficient input, like in the code below. TODO: Expand up. | |||
try { | |||
// Bug 4492: IBM_HTTP_Server sends SP after chunk-size. | |||
// No ParseBws() here because it may consume CR required further below. | |||
ParseStrictBws(tok); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This being a very specific and CVE related case, I see much benefit in avoiding the new special ParseBws()
logic.
Please instead prefer paranoia level of clarity:
ParseStrictBws(tok); | |
// Also, this is not BWS. It is explicit "MUST reject" violation of RFC 9112 HTTP/1.1. | |
(void)tok.skipOne(CharacterSet::WSP); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello Amos. As Alex mentioned in a0d4fe1 , there have been seen two spaces after chunk-size. Suggested change therfor would not fix such case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to what Luboš has said, skipOne() and skipAll() do not work well for incremental parsing of bad/prohibited-by-protocol input, as was already pointed out in the initial review of this code. Please do not go back to the already rejected code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very well, skip twice then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very well, skip twice then.
Skipping twice would not address my concern. Moreover, correctly skipping at most twice requires heroic efforts including creation of new parsing state. Simply calling skipOnce() twice would not accomplish that.
Marking for backport, and also our CVE SQUID-2023_1.patch (both v5 and v6) will need these changed appended. So please be minimal. |
We (Red Hat) don't have access to that, @covener do you know if "spaces after chunk-size" this was something fixed in some past version of IHS? |
@rousskov I tested the patch including all your commits in this PR and it's working fine. |
I am the maintainer at IBM and have been working on it in some capacity since 2005. It is possibly an Apache 1.3-based release, which would have been past any extended support 5 years prior to the 2016 bug. I'd suggest a commit message like "Likely affected historical IBM HTTP Server releases like 1.3.x. Contact IBM Support if you are adversely affected by this change." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Amos: FTR; The removal of this tolerance was due to a specific CVE issue.
While 2023 commit 951013d changes were triggered by that CVE, this PR does not reintroduce the two specific bugs that CVE was based on AFAICT.
Luboš: I tested the patch including all your commits in this PR and it's working fine.
Thank you very much. Unfortunately, this PR is now blocked by others. It cannot be merged until there are no negative votes. I tried to address Amos' change requests in this review. The ball is on @yadij side.
@@ -125,6 +125,10 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok) | |||
// Code becomes much simpler when incremental parsing functions throw on | |||
// bad or insufficient input, like in the code below. TODO: Expand up. | |||
try { | |||
// Bug 4492: IBM_HTTP_Server sends SP after chunk-size. | |||
// No ParseBws() here because it may consume CR required further below. | |||
ParseStrictBws(tok); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to what Luboš has said, skipOne() and skipAll() do not work well for incremental parsing of bad/prohibited-by-protocol input, as was already pointed out in the initial review of this code. Please do not go back to the already rejected code.
@@ -125,6 +125,7 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok) | |||
// Code becomes much simpler when incremental parsing functions throw on | |||
// bad or insufficient input, like in the code below. TODO: Expand up. | |||
try { | |||
tok.skipAll(CharacterSet::WSP); // Some servers send SP/TAB after chunk-size | |||
parseChunkExtensions(tok); // a possibly empty chunk-ext list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new
ParseBws()
logic ...
This PR does not change ParseBws() logic.
The new .... logic unintentionally?) adds a new difference from other servers ... support for sequences of multiple WSP characters.
That is not really new logic: Squid already allowed multiple WSP characters after chunk-size before the regression (that this PR is fixing) was introduced. Please see PR description for details.
Thanks very much @covener. For completeness, in the traces we have from customers currently facing this issue there was no obvious presence of IBM HTTP Server, they all had |
Prior to 2023 commit 951013d, Squid accepted Transfer-Encoding chunks with chunk-size followed by spaces or tabs (before CRLF). This HTTP syntax violation was allowed to address Bug 4492 (fixed in 2017 commit 26f0a35). This change restores that fix functionality. FWIW, our research shows that nginx and httpd also accept similar input.
Prior to 2023 commit 951013d, Squid accepted Transfer-Encoding chunks
with chunk-size followed by spaces or tabs (before CRLF). This HTTP
syntax violation was allowed to address Bug 4492 (fixed in 2017 commit
26f0a35). This change restores that fix functionality. FWIW, our
research shows that nginx and httpd also accept similar input.