Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support buffering requests with unknown length (Transfer-Encoding: chunked) in RequestBodyBufferMiddleware #235

Merged
merged 2 commits into from
Nov 20, 2017

Conversation

andig
Copy link
Contributor

@andig andig commented Oct 9, 2017

Implement chunked response handling, following discussion in #232

@andig
Copy link
Contributor Author

andig commented Oct 13, 2017

@WyriHaximus I need your help with rebasing this for reactphp/promise-stream#3. In the tests there's currently this for the RequestBodyBufferMiddlewareTest:

public function testExcessiveSizeReturnsError413()
{
    $stream = new BufferStream(2);
    $stream->write('aa');

    $serverRequest = new ServerRequest(
        'GET',
        'https://example.com/',
        array(),
        $stream
    );

    $buffer = new RequestBodyBufferMiddleware(1);
    $response = $buffer(
        $serverRequest,
        function () {}
    );

    $this->assertSame(413, $response->getStatusCode());
}

Now what goes in is NOT a ReadableStreamInterface and therefore returned early before any buffering in RequestBodyBufferMiddleware.php.

Is that really desired, i.e. should it be possible to construct ServerRequest with a body that doesn't implement Psr7 semantics + ReadableStreamInterface (and therefore actually is a HttpBodyStream? Hidden bug or feature that I fail to grasp?

@andig andig force-pushed the chunked branch 3 times, most recently from 2551059 to 400d0da Compare October 13, 2017 18:02
@clue
Copy link
Member

clue commented Oct 13, 2017

@andig Have you seen the documentation for the RequestBodyBufferMiddleware? I think that this middleware should support all valid requests. In a normal stack this will be the incoming streaming request, but another prior middleware handler may also expose a buffered request to this middleware handler.

The following (completely contrived) example should allow request up to 8 MiB:

$middleware = new MiddlewareRunner([
    new RequestBodyBufferMiddleware(16 * 1024 * 1024), // 16 MiB
    new RequestBodyBufferMiddleware(8 * 1024 * 1024), // 8 MiB
});


if ($size > $this->sizeLimit) {
return new Response(413, array('Content-Type' => 'text/plain'), 'Request body exceeds allowed limit');
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably keep this check: it makes sense to fail early in case the client is trying to upload a gigabyte file over a slow uplink :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Please forgive me for force-pushing. I didn't realize you were already reviewing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, happy to see you're faster with coding than I am with reviewing :shipit:

@andig andig changed the title [WIP] Draft receiving chunked requests Handle chunked requests Oct 13, 2017
@andig
Copy link
Contributor Author

andig commented Oct 13, 2017

Small side note: I've also changed mentions of g7 and Psr7Implemention to Psr7 for sake of coding consistency.

@andig
Copy link
Contributor Author

andig commented Oct 13, 2017

@clue

Have you seen the documentation for the RequestBodyBufferMiddleware? I think that this middleware should support all valid requests.

Reading this:

This can be useful if full PSR-7 compatibility is needed for the request handler and the default streaming request body handling is not needed.

I understand the behaviour. It's still totally confusing though that- if you use the RequestBodyBufferMiddleware and send in a react stream- it can be of any size (as long as it doesn't tell you which size) and will not be buffered or rejected simply because the RequestBodyBufferMiddleware will not do its job in this case.

Shouldn't at least the RequestBodyBufferMiddleware reject non-Psr7 bodies as they are used in the tests? Along the lines of "you want buffered request bodies, so you should send in valid Psr requests"?

The following (completely contrived) example should allow request up to 8 MiB ... In a normal stack this will be the incoming streaming request, but another prior middleware handler may also expose a buffered request to this middleware handler.

The buffer outputs BufferStream which is of StreamInterface and can be fed to the next buffer. Imho you should not have a middleware in between that returns ReadableStreamInterfaces or similar simply because after the buffer you should be able to expect 100% compilant Psr7 request bodies?

Update maybe my explanation is too complicated. The Server wraps every kind of body inside a HttpBodyStream which implements StreamInterface. No middleware on the stack should remove or accept other interfaces than StreamInterface - or we should have a big disclaimer in the docs that functionality may depend on type of input stream?

@clue
Copy link
Member

clue commented Oct 13, 2017

I'm not sure I follow tbh. Our middleware implementation works around the concept of PSR-7 requests coming in and PSR-7 responses coming out (plus wrapped in promises). This means that there's no such thing as a "non-Psr7 body".

What's special about this implementation is that bodies can be in a streaming state when a ReadableStreamInterface is attached. This allows us to transport bodies of arbitrary sizes without having to store huge bodies in memory.

Does this help?

@andig
Copy link
Contributor Author

andig commented Oct 14, 2017

I'm not sure I follow tbh.

Never mind. I've just noticed that I had to rewrite https://github.com/reactphp/http/pull/235/files#diff-5ff94d7c5e50a34ed39b11c33717c0b9R68 to use a different stream type. I'm probably making this too complicated ;)

src/Server.php Outdated
@@ -3,13 +3,12 @@
namespace React\Http;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you revert all changes in this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will open separate PR for Psr7 but imho at least this change https://github.com/reactphp/http/pull/235/files#diff-c611f5024b5b13f69f779505d1c6735eR170 makes sense.

Do you mind the reordering of the use clauses by library?

src/Response.php Outdated
@@ -21,7 +21,7 @@ public function __construct(
$version = '1.1',
$reason = null
) {
if ($body instanceof ReadableStreamInterface) {
if ($body instanceof ReadableStreamInterface &! $body instanceof HttpBodyStream) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you revert all changes in this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Artifact removed.

@@ -92,7 +92,7 @@ private function parseRequest($headers)
}

// parse request headers into obj implementing RequestInterface
$request = g7\parse_request($headers);
$request = Psr7\parse_request($headers);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why you changed this? Or is this left over from earlier changes in this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to have Psr7 consistently named in the codebase (not Psr7, g7, and PsrImplementation). Could move to separate PR?

@@ -25,7 +25,7 @@ class HttpBodyStream extends EventEmitter implements StreamInterface, ReadableSt
* @param ReadableStreamInterface $input - Stream data from $stream as a body of a PSR-7 object4
* @param int|null $size - size of the data body
*/
public function __construct(ReadableStreamInterface $input, $size)
public function __construct(ReadableStreamInterface $input, $size = null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you revert all changes in this file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imho this change makes sense. It's aligned with the comment now, the constructor handles null and it prevents explicitly passing null for bodies of unknown length?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that this adds noise and adds no value and as part of this PR, maybe suggest this in a separate PR? 👍

Despite this, I don't see what value this would provide, as this is an internal class only and this change basically means you're allowed th instantiate the class without considering what its "size" would be. If you feel this is relevant, I encourage you to file a PR so we can discuss this further 👍

@WyriHaximus
Copy link
Member

@andig Seems you're getting there, keep up the good work 👍

@WyriHaximus
Copy link
Member

@andig 🎉 https://github.com/reactphp/promise-stream/releases/tag/v0.1.2 🎉

@WyriHaximus WyriHaximus added this to the v0.8.0 milestone Oct 19, 2017
@WyriHaximus WyriHaximus changed the title Handle chunked requests Add chunked encoding support to RequestBodyBufferMiddleware Oct 19, 2017
@WyriHaximus
Copy link
Member

@andig updated this PR's title to something that reflects what the current code is doing

@WyriHaximus WyriHaximus changed the title Add chunked encoding support to RequestBodyBufferMiddleware Support requests without length in RequestBodyBufferMiddleware Oct 19, 2017
@andig
Copy link
Contributor Author

andig commented Oct 19, 2017

I've added one more commit to test for 413 in case on known but excessive content length. Still need to have another look at the README.

Copy link
Member

@WyriHaximus WyriHaximus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🎉

@WyriHaximus
Copy link
Member

@andig could you update to ^1.0 || ^0.1.2 since 1.0 has been release earlier today? 🎉

Copy link
Member

@clue clue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for keeping this updated! I feel this could use another higher level test to ensure consistent behavior (functional test?) and a README update, otherwise LGTM 👍

README.md Outdated
unknown size (i.e. with `Transfer-Encoding: chunked` header). To simplify request
handling for downstream middlewares or applications buffered `chunked` requests will
have the `Transfer-Encoding` header removed and a `Content-Length` header with the
actual request body size received added instead.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

README updates are much appreciated 👍 However, I feel this is a bit misleading here, as this makes it sounds like this is what this middleware handler does, whereas this is actually handled by the Server itself. Also, the word "middlewares" doesn't exist (think "softwares"), I usually use "middleware handlers" instead :-) Can you rephrase this? 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it has actually become a duplicate since the response section already describes the behaviour. Removed.

README.md Outdated
actual request body size received added instead.

Note that this only affects incoming requests, the chunked transfer encoding for
outgoing responses is not affected.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence made much more sense for the old behavior where these requests were rejected. Does it make sense to you to remove this sentence?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, removed.

}, function($error) {
if ($error instanceof OverflowException) {
return new Response(413, array('Content-Type' => 'text/plain'), 'Request body exceeds allowed limit');
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this error handles the same condition as above while it actually does not. I suppose this could use some comments and and explicit test. Also, it looks like this should also close the incoming request to stop wasting incoming bandwidth if the limit is exceeded. Can you also add a test for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could use some comments and and explicit test

I'll add two comments:

        // request body of known size exceeding limit
        // request body of unknown size exceeding limit during buffering

It is imho also tested: https://github.com/reactphp/http/pull/235/files/635a9659159e195c8dc3614439858a7535ade0c5#diff-5ff94d7c5e50a34ed39b11c33717c0b9R68 and https://github.com/reactphp/http/pull/235/files/635a9659159e195c8dc3614439858a7535ade0c5#diff-5ff94d7c5e50a34ed39b11c33717c0b9R68. Or I'm misunderstanding you.

Also, it looks like this should also close the incoming request to stop wasting incoming bandwidth if the limit is exceeded

This is as per original code which didn't close the body either. Is this responsibility of the middleware or the server? I.e. who should close the body if open after the middleware stack has finished or a middleware doesn't consume the body?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a relevant point, let's postpone this discussion for now and address this in a follow-up ticket :shipit: 👍

See for example https://stackoverflow.com/questions/14250991/is-it-acceptable-for-a-server-to-send-a-http-response-before-the-entire-request

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. Looking forward to your insights once this is in.

return new Response(413, array('Content-Type' => 'text/plain'), 'Request body exceeds allowed limit');
}

$body = $request->getBody();
if (!$body instanceof ReadableStreamInterface) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this check be moved up as this type of stream won't be buffered anyway? Why complain about size in this case (shouldn't happen when using the server at all, might as well just throw)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it makes sense to keep this like this in (the rare) case this middleware is used behind another buffering middleware handler, see also #235 (comment) 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@clue What I mean is that- if not a ReadableStreamInterface, which means we won't buffer it, we're still returning 413 although the middleware is not responsible for handling this request as it is right now.

My suggestion would be to move the ReadableStreamInterface to the Beginning of the method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @clue could you check my question above?
What's missing to make make progress with this PR (and 0.8 in general)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, technically buffering and limiting sizes are in fact independent responsibilities. For practical reasons however, I think that it makes sense to bring them close together because buffering without applying a limit is actually harmful (potential for trivial DOS otherwise). If we accept the fact that checking size limits here, then I think it makes sense to also support buffered requests here. This means that I think it makes sense to also reject buffered requests that exceed the given size limits.

Accordingly, I think this code LGTM 👍 What do you think about this?

Copy link
Contributor Author

@andig andig Nov 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about this?

@clue scratch head. I'm not getting my point across. We're only buffering and limiting PSR streams. We are however still limiting react-only streams.

This is imho wrong- as the buffering middleware is not responsible for these requests. I'm proposing- to be explicit- to change order of things like this:

public function __invoke(ServerRequestInterface $request, $stack)
{
    $body = $request->getBody();

    if (!$body instanceof ReadableStreamInterface) {
        return $stack($request);
    }

    // request body of known size exceeding limit
    if ($body->getSize() > $this->sizeLimit) {
        return new Response(413, array('Content-Type' => 'text/plain'), 'Request body exceeds allowed limit');
    }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it looks like we're talking past one another 🤷‍♂️ :-)

The getBody() method is guaranteed to always return an instance of Psr\Message\StreamInterface. This very instance may also implement the React\Stream\ReadableStreamInterface ("streaming state"), but this is not guaranteed (already in "buffered state").

When this middleware handler is used as documented, then the body is always in a "streaming state". If one were to use this middleware handler twice in the same stack (conceived example), then the first will see a body in "streaming state" while the second will see it in "buffered state".

If this conceived(!) second middleware handler uses a smaller size limit (which I would expect to be more of a programming error than intentional), then IMO the second middleware handler should still reject the message in my opinion.

If I understand your above suggestion correctly, then it looks like this would actually allow the body to pass through without applying the second limit at all.

I think it makes sense to add a test for this behavior to ensure that this is properly covered. Can you add a test for this or would you rather want me to add this to this PR? 👍

return new Response(413, array('Content-Type' => 'text/plain'), 'Request body exceeds allowed limit');
}

return $error;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably shouldn't return an Exception here :-) Can you fix this and add a test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@clue good catch! Undecided what to return though- right now it's an HTTP 400 with the Exception message as text. Do you have a better suggestion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"it depends" is probably the best answer here, so I think that this would be best addressed in a separate PR 👍 As such, I would suggest restoring the original behavior of simply returning a rejected promise (throwing an exception).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Do you think we need to guard for error actually being an exception and not something else?

@WyriHaximus
Copy link
Member

What is the status here @clue and @andig?

@andig
Copy link
Contributor Author

andig commented Nov 17, 2017

Needs review of the last 2 commits

Copy link
Member

@clue clue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andig Thank you so much for your patience and for keeping up with, the changes LGTM! 👍 Now let's get this in, cheers! 🍻 🎉

I've rebased your changes due to merge conflicts with master and squashed this down to 2 commits.

@clue clue changed the title Support requests without length in RequestBodyBufferMiddleware Support buffering requests with unknown length (Transfer-Encoding: chunked) in RequestBodyBufferMiddleware Nov 20, 2017
Copy link
Member

@WyriHaximus WyriHaximus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 🎉 :shipit:

@WyriHaximus WyriHaximus merged commit 1d5bd0b into reactphp:master Nov 20, 2017
@andig andig deleted the chunked branch November 21, 2017 09:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants