-
Notifications
You must be signed in to change notification settings - Fork 531
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix infinite recursion when parsing HTTP chunks #1553
Fix infinite recursion when parsing HTTP chunks #1553
Conversation
This change stops infinite HttpStateData recursion with at-max-capacity inBuf. Such inBuf prevents progress in the following call chain: * processReply() * processReplyBody() and decodeAndWriteReplyBody() * maybeReadVirginBody() * maybeMakeSpaceAvailable() -- tries but fails to quit processing * processReply() To stop recursion, decodeAndWriteReplyBodyI() now predicts that maybeMakeSpaceAvailable() will detect at-max-capacity inBuf and breaks the recursion by declaring a fatal chunked parsing error. XXX: This change addresses a few known infinite recursion cases, but it is conceptually wrong for several (partially overlapping) reasons: 1. It assumes that at-max-capacity inBuf is a permanent state (i.e. things cannot improve with time). This assumption is false. For example, as more space frees up in the adaptation pipeline, we will clear responseBodyBuffer, and Client::calcBufferSpaceToReserve() may allow us to grow inBuf and read more chunked metadata. 2. New decodeAndWriteReplyBody() code assumption that inability (or lack of necessity) to provide usable buffer space implies that chunk metadata is huge is false: calcReadBufferSpaceToReserve() may become zero even with tiny buffered metadata. 3. The resulting recursion termination is arguably accidental. The call loop remains intact, and minor code changes (or unknown use cases) may resurrect the same infinite recursion problem. ---- This change is based on small portions of the stuck official PR squid-cache#736 (i.e. old Factory branch SQUID-533-Optimize-memalloc-server-reads). It does not include many important fixes and improvements from that branch.
I am reverting previous commit b631d5a9c933fa849f6887c433020d0125bc08f4 to address the problems identified in that commit message.
This change stops infinite HttpStateData recursion with at-max-capacity inBuf. Such inBuf prevents progress in the following call chain: * processReply() * processReplyBody() and decodeAndWriteReplyBody() * maybeReadVirginBody() * maybeMakeSpaceAvailable() -- tries but fails to quit processing * processReply() HttpStateData::maybeMakeSpaceAvailable() no longer calls processReply(), preventing recursion. maybeReadVirginBody() now aborts transactions that would otherwise get stalled due to full read buffer at its maximum capacity. This change requires that all maybeReadVirginBody() callers do actually need more response data to make progress. AFAICT, that (natural) invariant holds. We moved transaction stalling check from maybeMakeSpaceAvailable() into its previous callers. Without that move, maybeMakeSpaceAvailable() would have to handle both abortTransaction() and delayRead() cases. Besides increased code complexity, that would trigger some premature delayRead() calls (at maybeReadVirginBody() time). Deciding whether to delay socket reads is complicated, the delay mechanism is expensive, and delaying may become unnecessary by the time the socket becomes readable, so it is best to continue to only delayRead() at readReply() time, when there is no other choice left. maybeReadVirginBody() mishandled cases where progress was possible, but not _immediately_ -- it did nothing in those cases, probably stalling transactions when maybeMakeSpaceAvailable() returned false but did not call processReply(). This is now fixed: maybeReadVirginBody() now starts waiting for the socket to be ready for reading in those cases, effectively passing control to readReply() that handles them. maybeReadVirginBody() prematurely grew buffer for future socket reads. As a (positive) side effect of the above refactoring, we now delay buffer growth until the actual read(2) time, which is best for performance. Most likely, this premature buffer growth was an accident: maybeReadVirginBody() correctly called maybeMakeSpaceAvailable() with doGrow set to false. However, maybeMakeSpaceAvailable() misinterpreted doGrow as a "do not actually do it" parameter. That bug is now gone. This recursion bug was discovered and detailed by Joshua Rogers at https://megamansec.github.io/Squid-Security-Audit/ where it was filed as "Chunked Encoding Stack Overflow".
These should probably wait until we resume official PR squid-cache#736 (i.e. old Factory branch SQUID-533-Optimize-memalloc-server-reads) work or start pushing tiny minimal PRs in preparation for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes fix the problem in my tests. I am posting this PR as a Draft because I would like to run more tests and double check that there are no places where we should add the removed processReply() call. However, I think it is fairly safe to start reviewing this rushed fix when you get a chance, at least to make sure you are OK with various TODOs and XXXs being left as they are now.
|
||
const auto readSizeMax = maybeMakeSpaceAvailable(moreDataPermission.value()); | ||
// TODO: Move this logic inside maybeMakeSpaceAvailable(): | ||
const auto readSizeWanted = readSizeMax ? entry->bytesWanted(Range<size_t>(0, readSizeMax)) : 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is essentially unchanged by this PR, but if others think it should be moved inside maybeMakeSpaceAvailable() in this PR, I would be happy to do that.
// we may need to grow the buffer | ||
inBuf.reserveSpace(read_size); | ||
debugs(11, 8, (!flags.do_next_read ? "will not" : "may") << |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have removed "will not" clause from this debugs() line because we only "make space available" when we are on a "true do_next_read" path. We may not actually read for various reasons, but not because flags.do_next_read is false.
Done, with the following conclusion: I do not think more processReply() calls are needed in this PR. We already call processReply() after we received more data from a peer and after we are done handling a 1xx control message. The only other condition/event I can think of that would require us to parse already buffered from-peer bytes is when the adaptation drains a previously overflowed BodyPipe buffer. In that Client::noteMoreBodySpaceAvailable() code, we call maybeReadVirginBody() instead of processReply() but that should work OK for now AFAICT, and, more importantly improving that fragile code (with some rather heavy implicit assumptions!) is outside this PR scope. I am marking this PR as ready for review. We still plan on doing more testing in the background, and I do not intend to merge this PR (even if it is approved) until that testing is completed. |
return; | ||
} | ||
|
||
const auto readSizeMax = maybeMakeSpaceAvailable(moreDataPermission.value()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, is there a similar problem in Server code, i.e., do we need some 'canBufferMoreRequestBytes()' check before calling Server::maybeMakeSpaceAvailable()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Server code probably suffers from various buffering problems (with a different set of constraints). There is a related check (with a misleading comment) there already, but I doubt it is enough:
// we can only read if there is more than 1 byte of space free
if (Config.maxRequestBufferSize - inBuf.length() < 2)
return;
Somebody will need to study Server issues and address them separately. Hopefully, there is no infinite recursion there (the subject of this PR).
This should also mean that the other maybeMakeSpaceAvailable(true) call in HttpStateData::readReply() worked incorrectly, i.e., it did not grow the buffer. |
This change stops infinite HttpStateData recursion with at-max-capacity inBuf. Such inBuf prevents progress in the following call chain: * processReply() * processReplyBody() and decodeAndWriteReplyBody() * maybeReadVirginBody() * maybeMakeSpaceAvailable() -- tries but fails to quit processing * processReply() HttpStateData::maybeMakeSpaceAvailable() no longer calls processReply(), preventing recursion. maybeReadVirginBody() now aborts transactions that would otherwise get stalled due to full read buffer at its maximum capacity. This change requires that all maybeReadVirginBody() callers do actually need more response data to make progress. AFAICT, that (natural) invariant holds. We moved transaction stalling check from maybeMakeSpaceAvailable() into its previous callers. Without that move, maybeMakeSpaceAvailable() would have to handle both abortTransaction() and delayRead() cases. Besides increased code complexity, that would trigger some premature delayRead() calls (at maybeReadVirginBody() time). Deciding whether to delay socket reads is complicated, the delay mechanism is expensive, and delaying may become unnecessary by the time the socket becomes readable, so it is best to continue to only delayRead() at readReply() time, when there is no other choice left. maybeReadVirginBody() mishandled cases where progress was possible, but not _immediately_ -- it did nothing in those cases, probably stalling transactions when maybeMakeSpaceAvailable() returned false but did not call processReply(). This is now fixed: maybeReadVirginBody() now starts waiting for the socket to be ready for reading in those cases, effectively passing control to readReply() that handles them. maybeReadVirginBody() prematurely grew buffer for future socket reads. As a (positive) side effect of the above refactoring, we now delay buffer growth until the actual read(2) time, which is best for performance. Most likely, this premature buffer growth was an accident: maybeReadVirginBody() correctly called maybeMakeSpaceAvailable() with doGrow set to false. However, maybeMakeSpaceAvailable() misinterpreted doGrow as a "do not actually do it" parameter. That bug is now gone. This recursion bug was discovered and detailed by Joshua Rogers at https://megamansec.github.io/Squid-Security-Audit/ where it was filed as "Chunked Encoding Stack Overflow".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alex: We still plan on doing more testing in the background, and I do not intend to merge this PR (even if it is approved) until that testing is completed.
Others have merged this PR. We are still running those tests. Eduard or I will follow up if those tests expose any significant problems associated with this PR changes.
Eduard: This should also mean that the other maybeMakeSpaceAvailable(true) call in HttpStateData::readReply() worked incorrectly, i.e., it did not grow the buffer.
Yes, here is how #736 described that bug:
Also fixed commit 1ad6851 to delay buffer allocations until we are ready
to read from the server. That old commit made buffer allocations in
maybeMakeSpaceAvailable() optional so that they can be delayed until we
actually read from the peer socket. Delaying buffer allocation until
read(2) is the right thing to do. The callers (correctly) passed a
boolean flag to trigger the allocation, but the method interpreted the
flag as if it meant "do not allocate". That is, commit tried to delay
allocations but actually allocated early.
There is more in that PR description, including the explanation why the code "worked" despite doing the opposite of what was intended/expected.
return; | ||
} | ||
|
||
const auto readSizeMax = maybeMakeSpaceAvailable(moreDataPermission.value()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Server code probably suffers from various buffering problems (with a different set of constraints). There is a related check (with a misleading comment) there already, but I doubt it is enough:
// we can only read if there is more than 1 byte of space free
if (Config.maxRequestBufferSize - inBuf.length() < 2)
return;
Somebody will need to study Server issues and address them separately. Hopefully, there is no infinite recursion there (the subject of this PR).
@rousskov preparing the advisory for announcement. Was this patch sufficient to fix the issue or are/were there additional changes necessary? |
AFAIK, commit 50c5af8 fixes infinite processReply-maybeMakeSpaceAvailable-processReply recursion when parsing HTTP response chunks. Additional changes are necessary to fix other known problems in response reading/buffering code, along the lines of #736, but none of those "other known problems" are infinite chunks parsing recursion IIRC. The Server (i.e. request handling) code is a separate can of worms that we have not studied from the chunk parsing recursion point of view, but the published vulnerability was specific to Client handling of responses. |
This change stops infinite HttpStateData recursion with at-max-capacity inBuf. Such inBuf prevents progress in the following call chain: * processReply() * processReplyBody() and decodeAndWriteReplyBody() * maybeReadVirginBody() * maybeMakeSpaceAvailable() -- tries but fails to quit processing * processReply() HttpStateData::maybeMakeSpaceAvailable() no longer calls processReply(), preventing recursion. maybeReadVirginBody() now aborts transactions that would otherwise get stalled due to full read buffer at its maximum capacity. This change requires that all maybeReadVirginBody() callers do actually need more response data to make progress. AFAICT, that (natural) invariant holds. We moved transaction stalling check from maybeMakeSpaceAvailable() into its previous callers. Without that move, maybeMakeSpaceAvailable() would have to handle both abortTransaction() and delayRead() cases. Besides increased code complexity, that would trigger some premature delayRead() calls (at maybeReadVirginBody() time). Deciding whether to delay socket reads is complicated, the delay mechanism is expensive, and delaying may become unnecessary by the time the socket becomes readable, so it is best to continue to only delayRead() at readReply() time, when there is no other choice left. maybeReadVirginBody() mishandled cases where progress was possible, but not _immediately_ -- it did nothing in those cases, probably stalling transactions when maybeMakeSpaceAvailable() returned false but did not call processReply(). This is now fixed: maybeReadVirginBody() now starts waiting for the socket to be ready for reading in those cases, effectively passing control to readReply() that handles them. maybeReadVirginBody() prematurely grew buffer for future socket reads. As a (positive) side effect of the above refactoring, we now delay buffer growth until the actual read(2) time, which is best for performance. Most likely, this premature buffer growth was an accident: maybeReadVirginBody() correctly called maybeMakeSpaceAvailable() with doGrow set to false. However, maybeMakeSpaceAvailable() misinterpreted doGrow as a "do not actually do it" parameter. That bug is now gone. This recursion bug was discovered and detailed by Joshua Rogers at https://megamansec.github.io/Squid-Security-Audit/ where it was filed as "Chunked Encoding Stack Overflow".
This change stops infinite HttpStateData recursion with at-max-capacity inBuf. Such inBuf prevents progress in the following call chain: * processReply() * processReplyBody() and decodeAndWriteReplyBody() * maybeReadVirginBody() * maybeMakeSpaceAvailable() -- tries but fails to quit processing * processReply() HttpStateData::maybeMakeSpaceAvailable() no longer calls processReply(), preventing recursion. maybeReadVirginBody() now aborts transactions that would otherwise get stalled due to full read buffer at its maximum capacity. This change requires that all maybeReadVirginBody() callers do actually need more response data to make progress. AFAICT, that (natural) invariant holds. We moved transaction stalling check from maybeMakeSpaceAvailable() into its previous callers. Without that move, maybeMakeSpaceAvailable() would have to handle both abortTransaction() and delayRead() cases. Besides increased code complexity, that would trigger some premature delayRead() calls (at maybeReadVirginBody() time). Deciding whether to delay socket reads is complicated, the delay mechanism is expensive, and delaying may become unnecessary by the time the socket becomes readable, so it is best to continue to only delayRead() at readReply() time, when there is no other choice left. maybeReadVirginBody() mishandled cases where progress was possible, but not _immediately_ -- it did nothing in those cases, probably stalling transactions when maybeMakeSpaceAvailable() returned false but did not call processReply(). This is now fixed: maybeReadVirginBody() now starts waiting for the socket to be ready for reading in those cases, effectively passing control to readReply() that handles them. maybeReadVirginBody() prematurely grew buffer for future socket reads. As a (positive) side effect of the above refactoring, we now delay buffer growth until the actual read(2) time, which is best for performance. Most likely, this premature buffer growth was an accident: maybeReadVirginBody() correctly called maybeMakeSpaceAvailable() with doGrow set to false. However, maybeMakeSpaceAvailable() misinterpreted doGrow as a "do not actually do it" parameter. That bug is now gone. This recursion bug was discovered and detailed by Joshua Rogers at https://megamansec.github.io/Squid-Security-Audit/ where it was filed as "Chunked Encoding Stack Overflow".
This change stops infinite HttpStateData recursion with at-max-capacity
inBuf. Such inBuf prevents progress in the following call chain:
HttpStateData::maybeMakeSpaceAvailable() no longer calls processReply(),
preventing recursion.
maybeReadVirginBody() now aborts transactions that would otherwise get
stalled due to full read buffer at its maximum capacity. This change
requires that all maybeReadVirginBody() callers do actually need more
response data to make progress. AFAICT, that (natural) invariant holds.
We moved transaction stalling check from maybeMakeSpaceAvailable() into
its previous callers. Without that move, maybeMakeSpaceAvailable() would
have to handle both abortTransaction() and delayRead() cases. Besides
increased code complexity, that would trigger some premature delayRead()
calls (at maybeReadVirginBody() time). Deciding whether to delay socket
reads is complicated, the delay mechanism is expensive, and delaying may
become unnecessary by the time the socket becomes readable, so it is
best to continue to only delayRead() at readReply() time, when there is
no other choice left.
maybeReadVirginBody() mishandled cases where progress was possible, but
not immediately -- it did nothing in those cases, probably stalling
transactions when maybeMakeSpaceAvailable() returned false but did not
call processReply(). This is now fixed: maybeReadVirginBody() now starts
waiting for the socket to be ready for reading in those cases,
effectively passing control to readReply() that handles them.
maybeReadVirginBody() prematurely grew buffer for future socket reads.
As a (positive) side effect of the above refactoring, we now delay
buffer growth until the actual read(2) time, which is best for
performance. Most likely, this premature buffer growth was an accident:
maybeReadVirginBody() correctly called maybeMakeSpaceAvailable() with
doGrow set to false. However, maybeMakeSpaceAvailable() misinterpreted
doGrow as a "do not actually do it" parameter. That bug is now gone.
This recursion bug was discovered and detailed by Joshua Rogers at
https://megamansec.github.io/Squid-Security-Audit/
where it was filed as "Chunked Encoding Stack Overflow".