Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decode chunked transfer encoding for incoming requests #116

Merged
merged 3 commits into from
Feb 19, 2017

Conversation

legionth
Copy link
Contributor

This ensures that only decoded body data will be emitted via the request object.

Resolves / closes #96

@legionth legionth force-pushed the chunk-encoding branch 3 times, most recently from 20ab519 to c830685 Compare February 13, 2017 15:37
Copy link
Member

@clue clue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! This seems much more approachable than the previous PR, only added some minor remarks afaict 👍

if (strpos($header, ';') !== false) {
$array = explode(';', $header);
$hexValue = $array[0];
$start = strlen($header) + 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like $start may always be $positionClrf + 2, so this var may not be needed at all?

$start = strlen($header) + 2;
}

if (dechex(hexdec($hexValue)) !== $hexValue || hexdec($hexValue) > 2147483647) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to get negative numbers here? Is the additional bound check really needed?

return;
}

$this->chunkSize = hexdec($hexValue);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but can probably moved up as to avoid some function calls? 👍


if ($positionClrf === false) {
// Header shouldn't be bigger than 1024 bytes
if (strlen($this->buffer) > 1024) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but can probably also use isset($this->buffer[$max]) to avoid some function calls? 👍

$this->parser->on('end', $this->expectCallableNever());
$this->parser->on('error', $this->expectCallableNever());

$this->input->emit('data', array("7FFFFFFE\r\n"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a 32bit assumption here? (see also above)

if ($this->transferredSize > $this->chunkSize) {
$this->handleError(new \Exception('The chunk is bigger than expected'));
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand this check, should this not be limited by the above substr() already? (see also below CRLF check)

$this->buffer = (string)substr($this->buffer, strlen($chunk));
}

if (strpos($this->buffer, static::CRLF) !== false) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This checks for any CRLF? What if the buffer is empty/incomplete here? Shouldn't this have to buffer and/or reject if there's not a CRLF at the head position instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this resolved? This still checks for any position afaict? 4\r\ntestWOOT\r\n

{
$this->emit('error', array($e));
$this->input->removeListener('data', array($this, 'handleData'));
$this->close();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't closing already remove all listeners? (see also below)

@clue clue added this to the v0.4.6 milestone Feb 14, 2017
@legionth
Copy link
Contributor Author

I think, I have handled all the remarks @clue had. Have a look.

$this->input->removeListener('data', array($this, 'handleData'));
$this->input->removeListener('end', array($this, 'handleEnd'));
$this->input->removeListener('error', array($this, 'handleError'));
$this->input->removeListener('close', array($this, 'close'));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unneeded? Closing the input stream should remove all listeners.

if ($positionClrf === false) {
// Header shouldn't be bigger than 1024 bytes
if (isset($this->buffer[1024])) {
$this->handleError(new \Exception('Chunk size inclusive extension bigger than 1024 bytes'));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Message should refer to size of "chunk header"?


if ($positionClrf === false) {
// Header shouldn't be bigger than 1024 bytes
if (isset($this->buffer[1024])) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

magic constant?

}

$this->chunkSize = hexdec($hexValue);
if (dechex($this->chunkSize) !== $hexValue || $hexValue < 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not seeing a test for $hexValue < 0? Is this needed? Documentation suggests otherwise

$this->buffer = (string)substr($this->buffer, strlen($chunk));
}

if (strpos($this->buffer, static::CRLF) !== false) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this resolved? This still checks for any position afaict? 4\r\ntestWOOT\r\n

if ($this->headerCompleted) {
if (strlen($this->buffer) > 2 && $this->chunkSize === $this->transferredSize) {
// Send error event, the first 2 characters should be CLRF
$this->handleError(new \Exception('Chunk does not end with a CLRF'));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I'm not missing something, but where do we actually check "the first 2 characters should be CLRF"? (see also above)

Copy link
Contributor Author

@legionth legionth Feb 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't comment the previous comment, but: Yes 4\r\ntestWOOT\r\n is handled. See tests testNoCrlfInChunk and testNoCrlfInChunkSplitted in ChunkedDecoderTests.

To this comment. This has to be seen in relation to the previous if-statements. If the previous statement couldn't find a CLRF, this is still buffered. But only if the current string length of the buffer isn't >2. That would mean that the data in the buffer isn't correct chunked encoding.

// Send error event, the first 2 characters should be CLRF
$this->handleError(new \Exception('Chunk does not end with a CLRF'));
}
return;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? Doesn't the next iteration check this? Perhaps this is just a bit unclear and could use some comments?

$data .= "2\r\nhi\r\n";

$this->connection->emit('data', array($data));
$this->assertEquals('hello', $buffer);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can probably use expectCallableOnceWith('hello'); here? Also, can you add tests for the end, error and close events as well? (See also other test cases)

@legionth
Copy link
Contributor Author

Is this resolved? This still checks for any position afaict? 4\r\ntestWOOT\r\n
Show outdated

The test case for this were valid because the header check have failed for this. Another test like 4\r\ntest3\r\nbla, would led to a misbehaviour.

With the newest commits. This should be fixed.

Copy link
Member

@clue clue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! 🎉

Hate being that fussy in review, but I believe these results show it's well worth it, appreciate your endurance 👍

@clue clue modified the milestones: v0.5.1, v0.4.6 Feb 14, 2017
@clue
Copy link
Member

clue commented Feb 14, 2017

For the reference: The changes LGTM, but we've had to move this to another milestone and we're going to release the v0.5.0 first, so we won't be able to merge this immediately 👍

class ChunkedDecoder extends EventEmitter implements ReadableStreamInterface
{
const CRLF = "\r\n";
const MAX = 1024;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably be more descriptive, eg. MAX_CHUNK_HEADER_SIZE

@legionth
Copy link
Contributor Author

Thank you for your input @jsor. Changed the variable name 😄

@WyriHaximus
Copy link
Member

@legionth Just skimmed over the test, and something I've done on react/http-client is run a valid chunked encoded body per character through the decoder to ensure it is valid no matter how slow and in what kind of chunks it comes in: https://github.com/reactphp/http-client/blob/master/tests/DecodeChunkedStreamTest.php#L36 is it an idea to do that here as well?

@legionth
Copy link
Contributor Author

@WyriHaximus good idea 👍 . Added an tests, have a look.

public function handleEnd()
{
if (!$this->closed) {
$this->handleError(new \Exception('Unexpected `end` event'));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these ` intended here?

if ($positionClrf === false) {
// Header shouldn't be bigger than 1024 bytes
if (isset($this->buffer[static::MAX_CHUNK_HEADER_SIZE])) {
$this->handleError(new \Exception('Chunk header size inclusive extension bigger than 1024 bytes'));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use static::MAX_CHUNK_HEADER_SIZE in the error message, so in case we decide to change it we only have to do it once?

} else if (strlen($this->buffer) < 2) {
// No CLRF found, wait for additional data which could be a CLRF
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh not really a fan of else and else if. Personally I prefer a structure like this:

if () {
    return;
}
if () {
    return;
}
if () {
    return;
}

@legionth
Copy link
Contributor Author

Ping @WyriHaximus . Changed the code based on your remarks.

@@ -139,11 +141,15 @@ public function handleData($data)
$this->headerCompleted = false;
$this->transferredSize = 0;
$this->buffer = (string)substr($this->buffer, 2);
} else if ($this->chunkSize === $this->transferredSize && strlen($this->buffer) > 2) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind my comment was to put a return here as well. The reasoning behind this is a) we're done after this point of code and there is no reason to continue running code in this function. And b) is lowers cognitive load on developers reading and working on this code.

In case you're interested in these little code readability optimizations I would suggest https://www.youtube.com/watch?v=GtB5DAfOWMQ if you haven't seen it. If you have my apologies 😄

@WyriHaximus
Copy link
Member

@legionth Splendid work 👍 . As far as I'm concerned my last comment is not a reason to not merge this, unless @jsor or @clue have any more comments this can be merged in once 0.5.0 has been tagged :shipit:

@clue clue modified the milestones: v0.6.0, v0.5.1 Feb 16, 2017
@clue
Copy link
Member

clue commented Feb 16, 2017

Functionally LGTM and I'd love to get this in ASAP :shipit:

Can you squash this to a reasonable number of commits? 👍

@legionth
Copy link
Contributor Author

legionth commented Feb 17, 2017

Squashed the commits. I hope this is a reasonable number of commits. Tell me if not.

@WyriHaximus
Copy link
Member

3 commits looks perfectly reasonable to me 👍

@clue
Copy link
Member

clue commented Feb 17, 2017

I'd like to get these changes in, can you rebase this now that #123 is in? :shipit:

Fix Endless loop

Fix

Add chunk size check and chunk extension handling

Handle potential test cases

Add ChunkedDecoder Tests

Handle potential threat

Rename variable

Added test to add verify single characters can be emitted

Fixing remarks

Use Mockbuilder
@legionth legionth force-pushed the chunk-encoding branch 2 times, most recently from 5dea067 to 4fd5d8f Compare February 19, 2017 14:12
Add ServerTest

Fix

Order
@legionth
Copy link
Contributor Author

Rebased on the current master :shipit:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support chunked transfer encoding
4 participants