Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 5449: Ignore SP and HTAB chars after chunk-size #1914

Closed
wants to merge 6 commits into from

Conversation

uhliarik
Copy link
Contributor

@uhliarik uhliarik commented Oct 10, 2024

Prior to 2023 commit 951013d, Squid accepted Transfer-Encoding chunks
with chunk-size followed by spaces or tabs (before CRLF). This HTTP
syntax violation was allowed to address Bug 4492 (fixed in 2017 commit
26f0a35). This change restores that fix functionality. FWIW, our
research shows that nginx and httpd also accept similar input.

Previously (before squid-cache#1498 change), squid was accepting TE-chunked replies
with whitespaces after chunk-size and missing chunk-ext data. After

It turned out that replies with such whitespace chars are pretty
common and other webservers which can act as forward proxies (e.g.
nginx, httpd...) are accepting them.

This change will allow to proxy chunked responses from origin server,
which had whitespaces inbetween chunk-size and CRLF.
@squid-prbot
Copy link
Collaborator

Can one of the admins verify this patch?

@uhliarik
Copy link
Contributor Author

Bug related to this MR - https://bugs.squid-cache.org/show_bug.cgi?id=5449

@uhliarik
Copy link
Contributor Author

uhliarik commented Oct 10, 2024

The only thing I'm not sure about this change is, if it is safe to expect that there can be unlimited amount of whitespaces. nginx skips only one '\t' or ' ' and apache httpd allow only 10 occurrences.

nginx: https://github.com/nginx/nginx/blob/stable-1.26/src/http/ngx_http_parse.c#L2242C1-L2245C44
httpd: https://github.com/apache/httpd/blob/trunk/modules/http/http_filters.c#L191

@squid-anubis squid-anubis added the M-failed-description https://github.com/measurement-factory/anubis#pull-request-labels label Oct 10, 2024
@rousskov rousskov changed the title Ignore whitespace chars after chunk-size Bug 5449: Ignore whitespace chars after chunk-size Oct 10, 2024
@squid-anubis squid-anubis removed the M-failed-description https://github.com/measurement-factory/anubis#pull-request-labels label Oct 10, 2024
Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I adjusted PR title to refer to the corresponding bug report. I adjusted PR description (i.e. future official commit message body) primarily to fix formatting and to add an XXX that we should try to resolve:

PR description: Replies with such whitespace chars are pretty common (XXX: missing evidence).

@uhliarik, would it be possible to support the above assertion? The problem is clearly not very "common" because it took a year for it to be reported despite the problematic code being present in v5+ releases. The bug report cites what looks like some one-off CGI script, but I suspect you know of some more important/popular use cases that need Squid to be lenient here. Needless to say, Squid should not accept malformed message framing in attack-prone area just to preserve some obscure CGI script bugs or mimic questionable behavior by other proxies...

FWIW, if sufficient evidence of "pretty common" use is provided, I would be inclined to accept a fixed version (see specific change requests) of this PR.

src/http/one/TeChunkedParser.cc Outdated Show resolved Hide resolved
@@ -125,6 +125,7 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok)
// Code becomes much simpler when incremental parsing functions throw on
// bad or insufficient input, like in the code below. TODO: Expand up.
try {
tok.skipAll(CharacterSet::WSP); // Some servers send SP/TAB after chunk-size
parseChunkExtensions(tok); // a possibly empty chunk-ext list
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The proposed change results in two sequential ParseBws() (or equivalent) calls because parseChunkExtensions() (correctly!) starts with ParseBws() as well. This problem alone is not a showstopper IMO, but we should keep this added complexity and waste of CPU cycles in mind when deciding whether to merge this PR.

It is probably possible to refector code to remove this waste, and it would be fair to require this PR to perform that refactoring, especially if it does not require heroic efforts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, definitelly it would be better to perform only one ParseBws(tok); call. However as I mentioned above the ParseBws(tok); call was not working and it caused triggering exception on tok.skipRequired("CRLF after [chunk-ext]", Http1::CrLf()); line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new ParseBws() logic (unintentionally?) adds a new difference from other servers ... support for sequences of multiple WSP characters. It would be best not to have that logic separated away from the case documentation. see my other comment which should resolve all this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new ParseBws() logic ...

This PR does not change ParseBws() logic.

The new .... logic unintentionally?) adds a new difference from other servers ... support for sequences of multiple WSP characters.

That is not really new logic: Squid already allowed multiple WSP characters after chunk-size before the regression (that this PR is fixing) was introduced. Please see PR description for details.

@rousskov
Copy link
Contributor

PR description: Other proxies (e.g., nginx and httpd) are accepting them.

FWIW, AFAICT, httpd stated reason for accepting them is invalid -- they cite RFC errata that does not allow these chars when there is no chunk-ext! That errata resulted in the corresponding changes to subsequent RFCs, but those changes still do not allow chunk-size SP CRLF sequences.

@rousskov rousskov added S-waiting-for-author author action is expected (and usually required) S-waiting-for-more-reviewers needs a reviewer and/or a second opinion labels Oct 10, 2024
@uhliarik
Copy link
Contributor Author

PR description: Other proxies (e.g., nginx and httpd) are accepting them.

FWIW, AFAICT, httpd stated reason for accepting them is invalid -- they cite RFC errata that does not allow these chars when there is no chunk-ext! That errata resulted in the corresponding changes to subsequent RFCs, but those changes still do not allow chunk-size SP CRLF sequences.

Yes, I'm aware of that :(. When I was reading that comment and RFC Errata (And I actually saw that the change was proposed by you), I realized that this is related to chunk-ext and not to removing extra white chars. Obviously the comment does not describe the logic of the code, anyway it does not change anything on the fact that httpd skips those whitespaces. But you are right the comment is misleading.

@rousskov
Copy link
Contributor

rousskov commented Oct 10, 2024

The only thing I'm not sure about this change is, if it is safe to expect that there can be unlimited amount of whitespaces.

It is safe to reject any amount of prohibited whitespace. Everything else is unsafe. This PR is trading safety for compatibility.

If we allow any space at all, then I would err on the side of code simplicity/reuse -- reject whatever we can (correctly) reject using existing code. I posted a specific suggestion that was meant to do that. That suggestion rejects unlimited amount of spaces (not because it is safe but because we want to reuse code that correctly rejects unwanted spaces).

@uhliarik
Copy link
Contributor Author

I came up with some other changes after our discussion. I added one argument to ParseBWS() function which would either force skipping CharacterSet::WSP if set to true or by default (if the function argument is not set) it would skip Parser::WhitespaceCharacters() which is dependent on relaxed parsing option.

I also tried to optimize number of ParseBWS calls. In the new commit, ParseBWS() should be called only if there was no chunk-ext found, since if chunk-ext is found then the leading whitespaces would be eliminated by the parseChunkExtensions().

Feel free to refactor it or push any other changes.

@rousskov
Copy link
Contributor

Please do not forget to push your changes. I will wait for them to show up before making mine. Thank you.

Depending on new wsp_only argument in ParseBws() it will be decided
which set of whitespaces characters will be parsed. If wsp_only is set
to true, only SP and HTAB chars will be parsed.

Also optimized number of ParseBws calls.
@rousskov rousskov added S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box and removed S-waiting-for-more-reviewers needs a reviewer and/or a second opinion labels Oct 11, 2024
@uhliarik
Copy link
Contributor Author

I pushed everything what I have. This is the latest version which I tested and it worked for me. Feel free to do any changes you want. Thanks a lot for a review Alex.

Both bugs deal with "chunk-size SP+ CRLF" use cases. Bug 4492 had _two_
spaces after chunk-size, which answers one of the PR review questions:
Should we skip just one space? No, we should not.

The lines moved around in many commits, but I believe this regression
was introduced in commit 951013d because that commit stopped consuming
partially parsed chunk-ext sequences. That consumption was wrong, but it
had a positive side effect -- fixing Bug 4492...
... to draw code reader attention when something unusual is going on.
@rousskov rousskov changed the title Bug 5449: Ignore whitespace chars after chunk-size Bug 5449: Ignore SP and HTAB chars after chunk-size Oct 11, 2024
rousskov
rousskov previously approved these changes Oct 11, 2024
Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@uhliarik, thanks, in part, to your rewrite, I have realized that we are not dealing with a new bug here. We are dealing with a regression of bug 4492! This fact addresses PR description XXX (now gone), and makes it a lot easier to accept these changes. I also polished PR code a little (commit f837f5f).

Please test my changes (as of commit f79936a). If you are happy with them, and there are no objections from others, then I plan to clear this regression-fixing PR for merging.

@rousskov rousskov added S-waiting-for-more-reviewers needs a reviewer and/or a second opinion and removed S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box labels Oct 11, 2024
@rousskov
Copy link
Contributor

Jenkins, it is OK to test this PR.

@notroj
Copy link

notroj commented Oct 11, 2024

Thank you very much @uhliarik and @rousskov - I agree the comment in the httpd code is wrong here, I've updated it referencing this discussion. apache/httpd@914eef0

Copy link
Contributor

@yadij yadij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR; The removal of this tolerance was due to a specific CVE issue.

Since you have testing, if you can please check and confirm if this is still an issue with current IBM_HTTP_Server releases. Document it as only being relevant to "IBM_HTTP_Server version X or older" would be perfect to aid future removal.

@@ -125,6 +125,7 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok)
// Code becomes much simpler when incremental parsing functions throw on
// bad or insufficient input, like in the code below. TODO: Expand up.
try {
tok.skipAll(CharacterSet::WSP); // Some servers send SP/TAB after chunk-size
parseChunkExtensions(tok); // a possibly empty chunk-ext list
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new ParseBws() logic (unintentionally?) adds a new difference from other servers ... support for sequences of multiple WSP characters. It would be best not to have that logic separated away from the case documentation. see my other comment which should resolve all this.

@@ -125,6 +125,10 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok)
// Code becomes much simpler when incremental parsing functions throw on
// bad or insufficient input, like in the code below. TODO: Expand up.
try {
// Bug 4492: IBM_HTTP_Server sends SP after chunk-size.
// No ParseBws() here because it may consume CR required further below.
ParseStrictBws(tok);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This being a very specific and CVE related case, I see much benefit in avoiding the new special ParseBws() logic.

Please instead prefer paranoia level of clarity:

Suggested change
ParseStrictBws(tok);
// Also, this is not BWS. It is explicit "MUST reject" violation of RFC 9112 HTTP/1.1.
(void)tok.skipOne(CharacterSet::WSP);

Copy link
Contributor Author

@uhliarik uhliarik Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Amos. As Alex mentioned in a0d4fe1 , there have been seen two spaces after chunk-size. Suggested change therfor would not fix such case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to what Luboš has said, skipOne() and skipAll() do not work well for incremental parsing of bad/prohibited-by-protocol input, as was already pointed out in the initial review of this code. Please do not go back to the already rejected code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very well, skip twice then.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very well, skip twice then.

Skipping twice would not address my concern. Moreover, correctly skipping at most twice requires heroic efforts including creation of new parsing state. Simply calling skipOnce() twice would not accomplish that.

@yadij yadij added backport-to-v6 maintainer has approved these changes for v6 backporting and removed S-waiting-for-more-reviewers needs a reviewer and/or a second opinion labels Oct 11, 2024
@yadij
Copy link
Contributor

yadij commented Oct 11, 2024

Marking for backport, and also our CVE SQUID-2023_1.patch (both v5 and v6) will need these changed appended. So please be minimal.

@notroj
Copy link

notroj commented Oct 11, 2024

Since you have testing, if you can please check and confirm if this is still an issue with current IBM_HTTP_Server releases. Document it as only being relevant to "IBM_HTTP_Server version X or older" would be perfect to aid future removal.

We (Red Hat) don't have access to that, @covener do you know if "spaces after chunk-size" this was something fixed in some past version of IHS?

@uhliarik
Copy link
Contributor Author

uhliarik commented Oct 11, 2024

@rousskov I tested the patch including all your commits in this PR and it's working fine.

@covener
Copy link

covener commented Oct 11, 2024

Since you have testing, if you can please check and confirm if this is still an issue with current IBM_HTTP_Server releases. Document it as only being relevant to "IBM_HTTP_Server version X or older" would be perfect to aid future removal.

We (Red Hat) don't have access to that, @covener do you know if "spaces after chunk-size" this was something fixed in some past version of IHS?

I am the maintainer at IBM and have been working on it in some capacity since 2005.
It's nothing I recognize, and I don't see any such behavior with a basic CGI in our oldest in-service release. It is not an area where we would intentionally deviate from upstream Apache HTTP Server.

It is possibly an Apache 1.3-based release, which would have been past any extended support 5 years prior to the 2016 bug.

I'd suggest a commit message like "Likely affected historical IBM HTTP Server releases like 1.3.x. Contact IBM Support if you are adversely affected by this change."

Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amos: FTR; The removal of this tolerance was due to a specific CVE issue.

While 2023 commit 951013d changes were triggered by that CVE, this PR does not reintroduce the two specific bugs that CVE was based on AFAICT.

Luboš: I tested the patch including all your commits in this PR and it's working fine.

Thank you very much. Unfortunately, this PR is now blocked by others. It cannot be merged until there are no negative votes. I tried to address Amos' change requests in this review. The ball is on @yadij side.

@@ -125,6 +125,10 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok)
// Code becomes much simpler when incremental parsing functions throw on
// bad or insufficient input, like in the code below. TODO: Expand up.
try {
// Bug 4492: IBM_HTTP_Server sends SP after chunk-size.
// No ParseBws() here because it may consume CR required further below.
ParseStrictBws(tok);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to what Luboš has said, skipOne() and skipAll() do not work well for incremental parsing of bad/prohibited-by-protocol input, as was already pointed out in the initial review of this code. Please do not go back to the already rejected code.

@@ -125,6 +125,7 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok)
// Code becomes much simpler when incremental parsing functions throw on
// bad or insufficient input, like in the code below. TODO: Expand up.
try {
tok.skipAll(CharacterSet::WSP); // Some servers send SP/TAB after chunk-size
parseChunkExtensions(tok); // a possibly empty chunk-ext list
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new ParseBws() logic ...

This PR does not change ParseBws() logic.

The new .... logic unintentionally?) adds a new difference from other servers ... support for sequences of multiple WSP characters.

That is not really new logic: Squid already allowed multiple WSP characters after chunk-size before the regression (that this PR is fixing) was introduced. Please see PR description for details.

@rousskov rousskov added S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box and removed S-waiting-for-author author action is expected (and usually required) labels Oct 11, 2024
@rousskov rousskov requested a review from yadij October 11, 2024 13:39
@yadij yadij added M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels and removed S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box backport-to-v6 maintainer has approved these changes for v6 backporting labels Oct 11, 2024
@rousskov rousskov added the S-could-use-an-approval An approval may speed this PR merger (but is not required) label Oct 11, 2024
@notroj
Copy link

notroj commented Oct 11, 2024

Thanks very much @covener. For completeness, in the traces we have from customers currently facing this issue there was no obvious presence of IBM HTTP Server, they all had Server: Apache. The only way this was associated with IHS I'm aware of is via https://bugs.squid-cache.org/show_bug.cgi?id=4492

squid-anubis pushed a commit that referenced this pull request Oct 12, 2024
Prior to 2023 commit 951013d, Squid accepted Transfer-Encoding chunks
with chunk-size followed by spaces or tabs (before CRLF). This HTTP
syntax violation was allowed to address Bug 4492 (fixed in 2017 commit
26f0a35). This change restores that fix functionality. FWIW, our
research shows that nginx and httpd also accept similar input.
@squid-anubis squid-anubis added the M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels label Oct 12, 2024
@rousskov rousskov removed the S-could-use-an-approval An approval may speed this PR merger (but is not required) label Oct 12, 2024
@squid-anubis squid-anubis added M-merged https://github.com/measurement-factory/anubis#pull-request-labels and removed M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels labels Oct 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
M-merged https://github.com/measurement-factory/anubis#pull-request-labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants