Bug 5449: Ignore SP and HTAB chars after chunk-size #1914

uhliarik · 2024-10-10T17:38:27Z

Prior to 2023 commit 951013d, Squid accepted Transfer-Encoding chunks
with chunk-size followed by spaces or tabs (before CRLF). This HTTP
syntax violation was allowed to address Bug 4492 (fixed in 2017 commit
26f0a35). This change restores that fix functionality. FWIW, our
research shows that nginx and httpd also accept similar input.

Previously (before squid-cache#1498 change), squid was accepting TE-chunked replies with whitespaces after chunk-size and missing chunk-ext data. After It turned out that replies with such whitespace chars are pretty common and other webservers which can act as forward proxies (e.g. nginx, httpd...) are accepting them. This change will allow to proxy chunked responses from origin server, which had whitespaces inbetween chunk-size and CRLF.

squid-prbot · 2024-10-10T17:38:30Z

Can one of the admins verify this patch?

uhliarik · 2024-10-10T17:41:28Z

Bug related to this MR - https://bugs.squid-cache.org/show_bug.cgi?id=5449

uhliarik · 2024-10-10T17:43:35Z

The only thing I'm not sure about this change is, if it is safe to expect that there can be unlimited amount of whitespaces. nginx skips only one '\t' or ' ' and apache httpd allow only 10 occurrences.

nginx: https://github.com/nginx/nginx/blob/stable-1.26/src/http/ngx_http_parse.c#L2242C1-L2245C44
httpd: https://github.com/apache/httpd/blob/trunk/modules/http/http_filters.c#L191

rousskov

I adjusted PR title to refer to the corresponding bug report. I adjusted PR description (i.e. future official commit message body) primarily to fix formatting and to add an XXX that we should try to resolve:

PR description: Replies with such whitespace chars are pretty common (XXX: missing evidence).

@uhliarik, would it be possible to support the above assertion? The problem is clearly not very "common" because it took a year for it to be reported despite the problematic code being present in v5+ releases. The bug report cites what looks like some one-off CGI script, but I suspect you know of some more important/popular use cases that need Squid to be lenient here. Needless to say, Squid should not accept malformed message framing in attack-prone area just to preserve some obscure CGI script bugs or mimic questionable behavior by other proxies...

FWIW, if sufficient evidence of "pretty common" use is provided, I would be inclined to accept a fixed version (see specific change requests) of this PR.

src/http/one/TeChunkedParser.cc

rousskov · 2024-10-10T18:15:05Z

src/http/one/TeChunkedParser.cc

@@ -125,6 +125,7 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok)
    // Code becomes much simpler when incremental parsing functions throw on
    // bad or insufficient input, like in the code below. TODO: Expand up.
    try {
+        tok.skipAll(CharacterSet::WSP); // Some servers send SP/TAB after chunk-size
        parseChunkExtensions(tok); // a possibly empty chunk-ext list


The proposed change results in two sequential ParseBws() (or equivalent) calls because parseChunkExtensions() (correctly!) starts with ParseBws() as well. This problem alone is not a showstopper IMO, but we should keep this added complexity and waste of CPU cycles in mind when deciding whether to merge this PR.

It is probably possible to refector code to remove this waste, and it would be fair to require this PR to perform that refactoring, especially if it does not require heroic efforts.

Yes, definitelly it would be better to perform only one ParseBws(tok); call. However as I mentioned above the ParseBws(tok); call was not working and it caused triggering exception on tok.skipRequired("CRLF after [chunk-ext]", Http1::CrLf()); line.

The new ParseBws() logic (unintentionally?) adds a new difference from other servers ... support for sequences of multiple WSP characters. It would be best not to have that logic separated away from the case documentation. see my other comment which should resolve all this.

The new ParseBws() logic ...

This PR does not change ParseBws() logic.

The new .... logic unintentionally?) adds a new difference from other servers ... support for sequences of multiple WSP characters.

That is not really new logic: Squid already allowed multiple WSP characters after chunk-size before the regression (that this PR is fixing) was introduced. Please see PR description for details.

rousskov · 2024-10-10T18:53:46Z

PR description: Other proxies (e.g., nginx and httpd) are accepting them.

FWIW, AFAICT, httpd stated reason for accepting them is invalid -- they cite RFC errata that does not allow these chars when there is no chunk-ext! That errata resulted in the corresponding changes to subsequent RFCs, but those changes still do not allow chunk-size SP CRLF sequences.

uhliarik · 2024-10-10T20:10:03Z

PR description: Other proxies (e.g., nginx and httpd) are accepting them.

FWIW, AFAICT, httpd stated reason for accepting them is invalid -- they cite RFC errata that does not allow these chars when there is no chunk-ext! That errata resulted in the corresponding changes to subsequent RFCs, but those changes still do not allow chunk-size SP CRLF sequences.

Yes, I'm aware of that :(. When I was reading that comment and RFC Errata (And I actually saw that the change was proposed by you), I realized that this is related to chunk-ext and not to removing extra white chars. Obviously the comment does not describe the logic of the code, anyway it does not change anything on the fact that httpd skips those whitespaces. But you are right the comment is misleading.

rousskov · 2024-10-10T20:10:16Z

The only thing I'm not sure about this change is, if it is safe to expect that there can be unlimited amount of whitespaces.

It is safe to reject any amount of prohibited whitespace. Everything else is unsafe. This PR is trading safety for compatibility.

If we allow any space at all, then I would err on the side of code simplicity/reuse -- reject whatever we can (correctly) reject using existing code. I posted a specific suggestion that was meant to do that. That suggestion rejects unlimited amount of spaces (not because it is safe but because we want to reuse code that correctly rejects unwanted spaces).

uhliarik · 2024-10-11T00:05:56Z

I came up with some other changes after our discussion. I added one argument to ParseBWS() function which would either force skipping CharacterSet::WSP if set to true or by default (if the function argument is not set) it would skip Parser::WhitespaceCharacters() which is dependent on relaxed parsing option.

I also tried to optimize number of ParseBWS calls. In the new commit, ParseBWS() should be called only if there was no chunk-ext found, since if chunk-ext is found then the leading whitespaces would be eliminated by the parseChunkExtensions().

Feel free to refactor it or push any other changes.

rousskov · 2024-10-11T00:22:42Z

Please do not forget to push your changes. I will wait for them to show up before making mine. Thank you.

Depending on new wsp_only argument in ParseBws() it will be decided which set of whitespaces characters will be parsed. If wsp_only is set to true, only SP and HTAB chars will be parsed. Also optimized number of ParseBws calls.

uhliarik · 2024-10-11T00:58:00Z

I pushed everything what I have. This is the latest version which I tested and it worked for me. Feel free to do any changes you want. Thanks a lot for a review Alex.

Both bugs deal with "chunk-size SP+ CRLF" use cases. Bug 4492 had _two_ spaces after chunk-size, which answers one of the PR review questions: Should we skip just one space? No, we should not. The lines moved around in many commits, but I believe this regression was introduced in commit 951013d because that commit stopped consuming partially parsed chunk-ext sequences. That consumption was wrong, but it had a positive side effect -- fixing Bug 4492...

... to draw code reader attention when something unusual is going on.

rousskov

@uhliarik, thanks, in part, to your rewrite, I have realized that we are not dealing with a new bug here. We are dealing with a regression of bug 4492! This fact addresses PR description XXX (now gone), and makes it a lot easier to accept these changes. I also polished PR code a little (commit f837f5f).

Please test my changes (as of commit f79936a). If you are happy with them, and there are no objections from others, then I plan to clear this regression-fixing PR for merging.

rousskov · 2024-10-11T03:33:18Z

Jenkins, it is OK to test this PR.

notroj · 2024-10-11T07:33:46Z

Thank you very much @uhliarik and @rousskov - I agree the comment in the httpd code is wrong here, I've updated it referencing this discussion. apache/httpd@914eef0

yadij

FTR; The removal of this tolerance was due to a specific CVE issue.

Since you have testing, if you can please check and confirm if this is still an issue with current IBM_HTTP_Server releases. Document it as only being relevant to "IBM_HTTP_Server version X or older" would be perfect to aid future removal.

yadij · 2024-10-11T09:07:35Z

src/http/one/TeChunkedParser.cc

@@ -125,6 +125,7 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok)
    // Code becomes much simpler when incremental parsing functions throw on
    // bad or insufficient input, like in the code below. TODO: Expand up.
    try {
+        tok.skipAll(CharacterSet::WSP); // Some servers send SP/TAB after chunk-size
        parseChunkExtensions(tok); // a possibly empty chunk-ext list


The new ParseBws() logic (unintentionally?) adds a new difference from other servers ... support for sequences of multiple WSP characters. It would be best not to have that logic separated away from the case documentation. see my other comment which should resolve all this.

yadij · 2024-10-11T09:15:19Z

src/http/one/TeChunkedParser.cc

@@ -125,6 +125,10 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok)
    // Code becomes much simpler when incremental parsing functions throw on
    // bad or insufficient input, like in the code below. TODO: Expand up.
    try {
+        // Bug 4492: IBM_HTTP_Server sends SP after chunk-size.
+        // No ParseBws() here because it may consume CR required further below.
+        ParseStrictBws(tok);


This being a very specific and CVE related case, I see much benefit in avoiding the new special ParseBws() logic.

Please instead prefer paranoia level of clarity:

Suggested change

ParseStrictBws(tok);

// Also, this is not BWS. It is explicit "MUST reject" violation of RFC 9112 HTTP/1.1.

(void)tok.skipOne(CharacterSet::WSP);

Hello Amos. As Alex mentioned in a0d4fe1 , there have been seen two spaces after chunk-size. Suggested change therfor would not fix such case.

In addition to what Luboš has said, skipOne() and skipAll() do not work well for incremental parsing of bad/prohibited-by-protocol input, as was already pointed out in the initial review of this code. Please do not go back to the already rejected code.

Very well, skip twice then.

Very well, skip twice then.

Skipping twice would not address my concern. Moreover, correctly skipping at most twice requires heroic efforts including creation of new parsing state. Simply calling skipOnce() twice would not accomplish that.

yadij · 2024-10-11T09:22:49Z

Marking for backport, and also our CVE SQUID-2023_1.patch (both v5 and v6) will need these changed appended. So please be minimal.

notroj · 2024-10-11T10:47:35Z

Since you have testing, if you can please check and confirm if this is still an issue with current IBM_HTTP_Server releases. Document it as only being relevant to "IBM_HTTP_Server version X or older" would be perfect to aid future removal.

We (Red Hat) don't have access to that, @covener do you know if "spaces after chunk-size" this was something fixed in some past version of IHS?

uhliarik · 2024-10-11T11:51:07Z

@rousskov I tested the patch including all your commits in this PR and it's working fine.

covener · 2024-10-11T12:54:24Z

Since you have testing, if you can please check and confirm if this is still an issue with current IBM_HTTP_Server releases. Document it as only being relevant to "IBM_HTTP_Server version X or older" would be perfect to aid future removal.

We (Red Hat) don't have access to that, @covener do you know if "spaces after chunk-size" this was something fixed in some past version of IHS?

I am the maintainer at IBM and have been working on it in some capacity since 2005.
It's nothing I recognize, and I don't see any such behavior with a basic CGI in our oldest in-service release. It is not an area where we would intentionally deviate from upstream Apache HTTP Server.

It is possibly an Apache 1.3-based release, which would have been past any extended support 5 years prior to the 2016 bug.

I'd suggest a commit message like "Likely affected historical IBM HTTP Server releases like 1.3.x. Contact IBM Support if you are adversely affected by this change."

rousskov

Amos: FTR; The removal of this tolerance was due to a specific CVE issue.

While 2023 commit 951013d changes were triggered by that CVE, this PR does not reintroduce the two specific bugs that CVE was based on AFAICT.

Luboš: I tested the patch including all your commits in this PR and it's working fine.

Thank you very much. Unfortunately, this PR is now blocked by others. It cannot be merged until there are no negative votes. I tried to address Amos' change requests in this review. The ball is on @yadij side.

rousskov · 2024-10-11T13:20:18Z

src/http/one/TeChunkedParser.cc

@@ -125,6 +125,10 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok)
    // Code becomes much simpler when incremental parsing functions throw on
    // bad or insufficient input, like in the code below. TODO: Expand up.
    try {
+        // Bug 4492: IBM_HTTP_Server sends SP after chunk-size.
+        // No ParseBws() here because it may consume CR required further below.
+        ParseStrictBws(tok);


In addition to what Luboš has said, skipOne() and skipAll() do not work well for incremental parsing of bad/prohibited-by-protocol input, as was already pointed out in the initial review of this code. Please do not go back to the already rejected code.

rousskov · 2024-10-11T13:26:28Z

src/http/one/TeChunkedParser.cc

@@ -125,6 +125,7 @@ Http::One::TeChunkedParser::parseChunkMetadataSuffix(Tokenizer &tok)
    // Code becomes much simpler when incremental parsing functions throw on
    // bad or insufficient input, like in the code below. TODO: Expand up.
    try {
+        tok.skipAll(CharacterSet::WSP); // Some servers send SP/TAB after chunk-size
        parseChunkExtensions(tok); // a possibly empty chunk-ext list


The new ParseBws() logic ...

This PR does not change ParseBws() logic.

The new .... logic unintentionally?) adds a new difference from other servers ... support for sequences of multiple WSP characters.

That is not really new logic: Squid already allowed multiple WSP characters after chunk-size before the regression (that this PR is fixing) was introduced. Please see PR description for details.

notroj · 2024-10-11T16:15:18Z

Thanks very much @covener. For completeness, in the traces we have from customers currently facing this issue there was no obvious presence of IBM HTTP Server, they all had Server: Apache. The only way this was associated with IHS I'm aware of is via https://bugs.squid-cache.org/show_bug.cgi?id=4492

Prior to 2023 commit 951013d, Squid accepted Transfer-Encoding chunks with chunk-size followed by spaces or tabs (before CRLF). This HTTP syntax violation was allowed to address Bug 4492 (fixed in 2017 commit 26f0a35). This change restores that fix functionality. FWIW, our research shows that nginx and httpd also accept similar input.

squid-anubis added the M-failed-description https://github.com/measurement-factory/anubis#pull-request-labels label Oct 10, 2024

rousskov changed the title ~~Ignore whitespace chars after chunk-size~~ Bug 5449: Ignore whitespace chars after chunk-size Oct 10, 2024

squid-anubis removed the M-failed-description https://github.com/measurement-factory/anubis#pull-request-labels label Oct 10, 2024

rousskov requested changes Oct 10, 2024

View reviewed changes

rousskov added S-waiting-for-author author action is expected (and usually required) S-waiting-for-more-reviewers needs a reviewer and/or a second opinion labels Oct 10, 2024

Added new argument to Http::One::ParseBws()

9c8d35f

Depending on new wsp_only argument in ParseBws() it will be decided which set of whitespaces characters will be parsed. If wsp_only is set to true, only SP and HTAB chars will be parsed. Also optimized number of ParseBws calls.

uhliarik force-pushed the bug5449 branch from 0174f3b to 9c8d35f Compare October 11, 2024 00:35

Fix typo in Parser.h

81e67f9

rousskov added S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box and removed S-waiting-for-more-reviewers needs a reviewer and/or a second opinion labels Oct 11, 2024

rousskov added 2 commits October 10, 2024 22:39

fixup: Fewer conditionals/ifs and more explicit spelling

f837f5f

... to draw code reader attention when something unusual is going on.

rousskov changed the title ~~Bug 5449: Ignore whitespace chars after chunk-size~~ Bug 5449: Ignore SP and HTAB chars after chunk-size Oct 11, 2024

rousskov previously approved these changes Oct 11, 2024

View reviewed changes

fixup: Deadly typo

f79936a

rousskov dismissed their stale review via f79936a October 11, 2024 03:31

rousskov approved these changes Oct 11, 2024

View reviewed changes

rousskov added S-waiting-for-more-reviewers needs a reviewer and/or a second opinion and removed S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box labels Oct 11, 2024

yadij requested changes Oct 11, 2024

View reviewed changes

yadij added backport-to-v6 maintainer has approved these changes for v6 backporting and removed S-waiting-for-more-reviewers needs a reviewer and/or a second opinion labels Oct 11, 2024

rousskov reviewed Oct 11, 2024

View reviewed changes

rousskov added S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box and removed S-waiting-for-author author action is expected (and usually required) labels Oct 11, 2024

rousskov requested a review from yadij October 11, 2024 13:39

yadij approved these changes Oct 11, 2024

View reviewed changes

rousskov added the S-could-use-an-approval An approval may speed this PR merger (but is not required) label Oct 11, 2024

squid-anubis added the M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels label Oct 12, 2024

rousskov removed the S-could-use-an-approval An approval may speed this PR merger (but is not required) label Oct 12, 2024

squid-anubis closed this Oct 12, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Bug 5449: Ignore SP and HTAB chars after chunk-size #1914

Bug 5449: Ignore SP and HTAB chars after chunk-size #1914

uhliarik commented Oct 10, 2024 •

edited by rousskov

Loading

squid-prbot commented Oct 10, 2024

uhliarik commented Oct 10, 2024

uhliarik commented Oct 10, 2024 •

edited

Loading

rousskov left a comment

rousskov Oct 10, 2024

uhliarik Oct 10, 2024

yadij Oct 11, 2024

rousskov Oct 11, 2024

rousskov commented Oct 10, 2024

uhliarik commented Oct 10, 2024

rousskov commented Oct 10, 2024 •

edited

Loading

uhliarik commented Oct 11, 2024

rousskov commented Oct 11, 2024

uhliarik commented Oct 11, 2024

rousskov left a comment •

edited

Loading

rousskov commented Oct 11, 2024

notroj commented Oct 11, 2024

yadij left a comment •

edited

Loading

yadij Oct 11, 2024

yadij Oct 11, 2024

uhliarik Oct 11, 2024 •

edited

Loading

rousskov Oct 11, 2024

yadij Oct 11, 2024

rousskov Oct 11, 2024

yadij commented Oct 11, 2024

notroj commented Oct 11, 2024

uhliarik commented Oct 11, 2024 •

edited

Loading

covener commented Oct 11, 2024

rousskov left a comment

rousskov Oct 11, 2024

rousskov Oct 11, 2024

notroj commented Oct 11, 2024

	ParseStrictBws(tok);
	// Also, this is not BWS. It is explicit "MUST reject" violation of RFC 9112 HTTP/1.1.
	(void)tok.skipOne(CharacterSet::WSP);

Bug 5449: Ignore SP and HTAB chars after chunk-size #1914

Bug 5449: Ignore SP and HTAB chars after chunk-size #1914

Conversation

uhliarik commented Oct 10, 2024 • edited by rousskov Loading

squid-prbot commented Oct 10, 2024

uhliarik commented Oct 10, 2024

uhliarik commented Oct 10, 2024 • edited Loading

rousskov left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rousskov commented Oct 10, 2024

uhliarik commented Oct 10, 2024

rousskov commented Oct 10, 2024 • edited Loading

uhliarik commented Oct 11, 2024

rousskov commented Oct 11, 2024

uhliarik commented Oct 11, 2024

rousskov left a comment • edited Loading

Choose a reason for hiding this comment

rousskov commented Oct 11, 2024

notroj commented Oct 11, 2024

yadij left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

uhliarik Oct 11, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

yadij commented Oct 11, 2024

notroj commented Oct 11, 2024

uhliarik commented Oct 11, 2024 • edited Loading

covener commented Oct 11, 2024

rousskov left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

notroj commented Oct 11, 2024

uhliarik commented Oct 10, 2024 •

edited by rousskov

Loading

uhliarik commented Oct 10, 2024 •

edited

Loading

rousskov commented Oct 10, 2024 •

edited

Loading

rousskov left a comment •

edited

Loading

yadij left a comment •

edited

Loading

uhliarik Oct 11, 2024 •

edited

Loading

uhliarik commented Oct 11, 2024 •

edited

Loading