-
-
Notifications
You must be signed in to change notification settings - Fork 145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support buffering requests with unknown length (Transfer-Encoding: chunked) in RequestBodyBufferMiddleware #235
Conversation
1a593ee
to
830f9f4
Compare
@WyriHaximus I need your help with rebasing this for reactphp/promise-stream#3. In the tests there's currently this for the
Now what goes in is NOT a Is that really desired, i.e. should it be possible to construct |
2551059
to
400d0da
Compare
@andig Have you seen the documentation for the The following (completely contrived) example should allow request up to 8 MiB:
|
|
||
if ($size > $this->sizeLimit) { | ||
return new Response(413, array('Content-Type' => 'text/plain'), 'Request body exceeds allowed limit'); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably keep this check: it makes sense to fail early in case the client is trying to upload a gigabyte file over a slow uplink :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Please forgive me for force-pushing. I didn't realize you were already reviewing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries, happy to see you're faster with coding than I am with reviewing
Small side note: I've also changed mentions of |
Reading this:
I understand the behaviour. It's still totally confusing though that- if you use the Shouldn't at least the
The buffer outputs Update maybe my explanation is too complicated. The |
I'm not sure I follow tbh. Our middleware implementation works around the concept of PSR-7 requests coming in and PSR-7 responses coming out (plus wrapped in promises). This means that there's no such thing as a "non-Psr7 body". What's special about this implementation is that bodies can be in a streaming state when a Does this help? |
Never mind. I've just noticed that I had to rewrite https://github.com/reactphp/http/pull/235/files#diff-5ff94d7c5e50a34ed39b11c33717c0b9R68 to use a different stream type. I'm probably making this too complicated ;) |
src/Server.php
Outdated
@@ -3,13 +3,12 @@ | |||
namespace React\Http; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you revert all changes in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will open separate PR for Psr7 but imho at least this change https://github.com/reactphp/http/pull/235/files#diff-c611f5024b5b13f69f779505d1c6735eR170 makes sense.
Do you mind the reordering of the use clauses by library?
src/Response.php
Outdated
@@ -21,7 +21,7 @@ public function __construct( | |||
$version = '1.1', | |||
$reason = null | |||
) { | |||
if ($body instanceof ReadableStreamInterface) { | |||
if ($body instanceof ReadableStreamInterface &! $body instanceof HttpBodyStream) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you revert all changes in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Artifact removed.
src/RequestHeaderParser.php
Outdated
@@ -92,7 +92,7 @@ private function parseRequest($headers) | |||
} | |||
|
|||
// parse request headers into obj implementing RequestInterface | |||
$request = g7\parse_request($headers); | |||
$request = Psr7\parse_request($headers); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason why you changed this? Or is this left over from earlier changes in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to have Psr7 consistently named in the codebase (not Psr7, g7, and PsrImplementation). Could move to separate PR?
src/HttpBodyStream.php
Outdated
@@ -25,7 +25,7 @@ class HttpBodyStream extends EventEmitter implements StreamInterface, ReadableSt | |||
* @param ReadableStreamInterface $input - Stream data from $stream as a body of a PSR-7 object4 | |||
* @param int|null $size - size of the data body | |||
*/ | |||
public function __construct(ReadableStreamInterface $input, $size) | |||
public function __construct(ReadableStreamInterface $input, $size = null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you revert all changes in this file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imho this change makes sense. It's aligned with the comment now, the constructor handles null and it prevents explicitly passing null for bodies of unknown length?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that this adds noise and adds no value and as part of this PR, maybe suggest this in a separate PR? 👍
Despite this, I don't see what value this would provide, as this is an internal class only and this change basically means you're allowed th instantiate the class without considering what its "size" would be. If you feel this is relevant, I encourage you to file a PR so we can discuss this further 👍
@andig Seems you're getting there, keep up the good work 👍 |
@andig updated this PR's title to something that reflects what the current code is doing |
I've added one more commit to test for 413 in case on known but excessive content length. Still need to have another look at the README. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🎉
@andig could you update to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for keeping this updated! I feel this could use another higher level test to ensure consistent behavior (functional test?) and a README update, otherwise LGTM 👍
README.md
Outdated
unknown size (i.e. with `Transfer-Encoding: chunked` header). To simplify request | ||
handling for downstream middlewares or applications buffered `chunked` requests will | ||
have the `Transfer-Encoding` header removed and a `Content-Length` header with the | ||
actual request body size received added instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
README updates are much appreciated 👍 However, I feel this is a bit misleading here, as this makes it sounds like this is what this middleware handler does, whereas this is actually handled by the Server
itself. Also, the word "middlewares" doesn't exist (think "softwares"), I usually use "middleware handlers" instead :-) Can you rephrase this? 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it has actually become a duplicate since the response section already describes the behaviour. Removed.
README.md
Outdated
actual request body size received added instead. | ||
|
||
Note that this only affects incoming requests, the chunked transfer encoding for | ||
outgoing responses is not affected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence made much more sense for the old behavior where these requests were rejected. Does it make sense to you to remove this sentence?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, removed.
}, function($error) { | ||
if ($error instanceof OverflowException) { | ||
return new Response(413, array('Content-Type' => 'text/plain'), 'Request body exceeds allowed limit'); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this error handles the same condition as above while it actually does not. I suppose this could use some comments and and explicit test. Also, it looks like this should also close the incoming request to stop wasting incoming bandwidth if the limit is exceeded. Can you also add a test for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could use some comments and and explicit test
I'll add two comments:
// request body of known size exceeding limit
// request body of unknown size exceeding limit during buffering
It is imho also tested: https://github.com/reactphp/http/pull/235/files/635a9659159e195c8dc3614439858a7535ade0c5#diff-5ff94d7c5e50a34ed39b11c33717c0b9R68 and https://github.com/reactphp/http/pull/235/files/635a9659159e195c8dc3614439858a7535ade0c5#diff-5ff94d7c5e50a34ed39b11c33717c0b9R68. Or I'm misunderstanding you.
Also, it looks like this should also close the incoming request to stop wasting incoming bandwidth if the limit is exceeded
This is as per original code which didn't close the body either. Is this responsibility of the middleware or the server? I.e. who should close the body if open after the middleware stack has finished or a middleware doesn't consume the body?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a relevant point, let's postpone this discussion for now and address this in a follow-up ticket 👍
See for example https://stackoverflow.com/questions/14250991/is-it-acceptable-for-a-server-to-send-a-http-response-before-the-entire-request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood. Looking forward to your insights once this is in.
return new Response(413, array('Content-Type' => 'text/plain'), 'Request body exceeds allowed limit'); | ||
} | ||
|
||
$body = $request->getBody(); | ||
if (!$body instanceof ReadableStreamInterface) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this check be moved up as this type of stream won't be buffered anyway? Why complain about size in this case (shouldn't happen when using the server at all, might as well just throw)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it makes sense to keep this like this in (the rare) case this middleware is used behind another buffering middleware handler, see also #235 (comment) 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@clue What I mean is that- if not a ReadableStreamInterface
, which means we won't buffer it, we're still returning 413 although the middleware is not responsible for handling this request as it is right now.
My suggestion would be to move the ReadableStreamInterface
to the Beginning of the method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping @clue could you check my question above?
What's missing to make make progress with this PR (and 0.8 in general)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, technically buffering and limiting sizes are in fact independent responsibilities. For practical reasons however, I think that it makes sense to bring them close together because buffering without applying a limit is actually harmful (potential for trivial DOS otherwise). If we accept the fact that checking size limits here, then I think it makes sense to also support buffered requests here. This means that I think it makes sense to also reject buffered requests that exceed the given size limits.
Accordingly, I think this code LGTM 👍 What do you think about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think about this?
@clue scratch head. I'm not getting my point across. We're only buffering and limiting PSR streams. We are however still limiting react-only streams.
This is imho wrong- as the buffering middleware is not responsible for these requests. I'm proposing- to be explicit- to change order of things like this:
public function __invoke(ServerRequestInterface $request, $stack)
{
$body = $request->getBody();
if (!$body instanceof ReadableStreamInterface) {
return $stack($request);
}
// request body of known size exceeding limit
if ($body->getSize() > $this->sizeLimit) {
return new Response(413, array('Content-Type' => 'text/plain'), 'Request body exceeds allowed limit');
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it looks like we're talking past one another 🤷♂️ :-)
The getBody()
method is guaranteed to always return an instance of Psr\Message\StreamInterface
. This very instance may also implement the React\Stream\ReadableStreamInterface
("streaming state"), but this is not guaranteed (already in "buffered state").
When this middleware handler is used as documented, then the body is always in a "streaming state". If one were to use this middleware handler twice in the same stack (conceived example), then the first will see a body in "streaming state" while the second will see it in "buffered state".
If this conceived(!) second middleware handler uses a smaller size limit (which I would expect to be more of a programming error than intentional), then IMO the second middleware handler should still reject the message in my opinion.
If I understand your above suggestion correctly, then it looks like this would actually allow the body to pass through without applying the second limit at all.
I think it makes sense to add a test for this behavior to ensure that this is properly covered. Can you add a test for this or would you rather want me to add this to this PR? 👍
return new Response(413, array('Content-Type' => 'text/plain'), 'Request body exceeds allowed limit'); | ||
} | ||
|
||
return $error; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably shouldn't return
an Exception
here :-) Can you fix this and add a test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@clue good catch! Undecided what to return though- right now it's an HTTP 400 with the Exception message as text. Do you have a better suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"it depends" is probably the best answer here, so I think that this would be best addressed in a separate PR 👍 As such, I would suggest restoring the original behavior of simply returning a rejected promise (throwing an exception).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Do you think we need to guard for error actually being an exception and not something else?
Needs review of the last 2 commits |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@andig Thank you so much for your patience and for keeping up with, the changes LGTM! 👍 Now let's get this in, cheers! 🍻 🎉
I've rebased your changes due to merge conflicts with master
and squashed this down to 2 commits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 🎉
Implement chunked response handling, following discussion in #232