-
-
Notifications
You must be signed in to change notification settings - Fork 145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Decode chunked transfer encoding for incoming requests #116
Conversation
20ab519
to
c830685
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! This seems much more approachable than the previous PR, only added some minor remarks afaict 👍
src/ChunkedDecoder.php
Outdated
if (strpos($header, ';') !== false) { | ||
$array = explode(';', $header); | ||
$hexValue = $array[0]; | ||
$start = strlen($header) + 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like $start
may always be $positionClrf + 2
, so this var may not be needed at all?
src/ChunkedDecoder.php
Outdated
$start = strlen($header) + 2; | ||
} | ||
|
||
if (dechex(hexdec($hexValue)) !== $hexValue || hexdec($hexValue) > 2147483647) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to get negative numbers here? Is the additional bound check really needed?
src/ChunkedDecoder.php
Outdated
return; | ||
} | ||
|
||
$this->chunkSize = hexdec($hexValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but can probably moved up as to avoid some function calls? 👍
src/ChunkedDecoder.php
Outdated
|
||
if ($positionClrf === false) { | ||
// Header shouldn't be bigger than 1024 bytes | ||
if (strlen($this->buffer) > 1024) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but can probably also use isset($this->buffer[$max])
to avoid some function calls? 👍
tests/ChunkedDecoderTest.php
Outdated
$this->parser->on('end', $this->expectCallableNever()); | ||
$this->parser->on('error', $this->expectCallableNever()); | ||
|
||
$this->input->emit('data', array("7FFFFFFE\r\n")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a 32bit assumption here? (see also above)
src/ChunkedDecoder.php
Outdated
if ($this->transferredSize > $this->chunkSize) { | ||
$this->handleError(new \Exception('The chunk is bigger than expected')); | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand this check, should this not be limited by the above substr()
already? (see also below CRLF check)
src/ChunkedDecoder.php
Outdated
$this->buffer = (string)substr($this->buffer, strlen($chunk)); | ||
} | ||
|
||
if (strpos($this->buffer, static::CRLF) !== false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This checks for any CRLF? What if the buffer is empty/incomplete here? Shouldn't this have to buffer and/or reject if there's not a CRLF at the head position instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this resolved? This still checks for any position afaict? 4\r\ntestWOOT\r\n
src/ChunkedDecoder.php
Outdated
{ | ||
$this->emit('error', array($e)); | ||
$this->input->removeListener('data', array($this, 'handleData')); | ||
$this->close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't closing already remove all listeners? (see also below)
I think, I have handled all the remarks @clue had. Have a look. |
src/ChunkedDecoder.php
Outdated
$this->input->removeListener('data', array($this, 'handleData')); | ||
$this->input->removeListener('end', array($this, 'handleEnd')); | ||
$this->input->removeListener('error', array($this, 'handleError')); | ||
$this->input->removeListener('close', array($this, 'close')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unneeded? Closing the input stream should remove all listeners.
src/ChunkedDecoder.php
Outdated
if ($positionClrf === false) { | ||
// Header shouldn't be bigger than 1024 bytes | ||
if (isset($this->buffer[1024])) { | ||
$this->handleError(new \Exception('Chunk size inclusive extension bigger than 1024 bytes')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Message should refer to size of "chunk header"?
src/ChunkedDecoder.php
Outdated
|
||
if ($positionClrf === false) { | ||
// Header shouldn't be bigger than 1024 bytes | ||
if (isset($this->buffer[1024])) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
magic constant?
src/ChunkedDecoder.php
Outdated
} | ||
|
||
$this->chunkSize = hexdec($hexValue); | ||
if (dechex($this->chunkSize) !== $hexValue || $hexValue < 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not seeing a test for $hexValue < 0
? Is this needed? Documentation suggests otherwise
src/ChunkedDecoder.php
Outdated
$this->buffer = (string)substr($this->buffer, strlen($chunk)); | ||
} | ||
|
||
if (strpos($this->buffer, static::CRLF) !== false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this resolved? This still checks for any position afaict? 4\r\ntestWOOT\r\n
src/ChunkedDecoder.php
Outdated
if ($this->headerCompleted) { | ||
if (strlen($this->buffer) > 2 && $this->chunkSize === $this->transferredSize) { | ||
// Send error event, the first 2 characters should be CLRF | ||
$this->handleError(new \Exception('Chunk does not end with a CLRF')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I'm not missing something, but where do we actually check "the first 2 characters should be CLRF"? (see also above)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't comment the previous comment, but: Yes 4\r\ntestWOOT\r\n
is handled. See tests testNoCrlfInChunk
and testNoCrlfInChunkSplitted
in ChunkedDecoderTests
.
To this comment. This has to be seen in relation to the previous if-statements. If the previous statement couldn't find a CLRF, this is still buffered. But only if the current string length of the buffer isn't >2. That would mean that the data in the buffer isn't correct chunked encoding.
src/ChunkedDecoder.php
Outdated
// Send error event, the first 2 characters should be CLRF | ||
$this->handleError(new \Exception('Chunk does not end with a CLRF')); | ||
} | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed? Doesn't the next iteration check this? Perhaps this is just a bit unclear and could use some comments?
tests/ServerTest.php
Outdated
$data .= "2\r\nhi\r\n"; | ||
|
||
$this->connection->emit('data', array($data)); | ||
$this->assertEquals('hello', $buffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can probably use expectCallableOnceWith('hello');
here? Also, can you add tests for the end
, error
and close
events as well? (See also other test cases)
The test case for this were valid because the header check have failed for this. Another test like With the newest commits. This should be fixed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! 🎉
Hate being that fussy in review, but I believe these results show it's well worth it, appreciate your endurance 👍
For the reference: The changes LGTM, but we've had to move this to another milestone and we're going to release the v0.5.0 first, so we won't be able to merge this immediately 👍 |
src/ChunkedDecoder.php
Outdated
class ChunkedDecoder extends EventEmitter implements ReadableStreamInterface | ||
{ | ||
const CRLF = "\r\n"; | ||
const MAX = 1024; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably be more descriptive, eg. MAX_CHUNK_HEADER_SIZE
Thank you for your input @jsor. Changed the variable name 😄 |
@legionth Just skimmed over the test, and something I've done on react/http-client is run a valid chunked encoded body per character through the decoder to ensure it is valid no matter how slow and in what kind of chunks it comes in: https://github.com/reactphp/http-client/blob/master/tests/DecodeChunkedStreamTest.php#L36 is it an idea to do that here as well? |
@WyriHaximus good idea 👍 . Added an tests, have a look. |
src/ChunkedDecoder.php
Outdated
public function handleEnd() | ||
{ | ||
if (!$this->closed) { | ||
$this->handleError(new \Exception('Unexpected `end` event')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these
` intended here?
src/ChunkedDecoder.php
Outdated
if ($positionClrf === false) { | ||
// Header shouldn't be bigger than 1024 bytes | ||
if (isset($this->buffer[static::MAX_CHUNK_HEADER_SIZE])) { | ||
$this->handleError(new \Exception('Chunk header size inclusive extension bigger than 1024 bytes')); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use static::MAX_CHUNK_HEADER_SIZE
in the error message, so in case we decide to change it we only have to do it once?
src/ChunkedDecoder.php
Outdated
} else if (strlen($this->buffer) < 2) { | ||
// No CLRF found, wait for additional data which could be a CLRF | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tbh not really a fan of else
and else if
. Personally I prefer a structure like this:
if () {
return;
}
if () {
return;
}
if () {
return;
}
Ping @WyriHaximus . Changed the code based on your remarks. |
src/ChunkedDecoder.php
Outdated
@@ -139,11 +141,15 @@ public function handleData($data) | |||
$this->headerCompleted = false; | |||
$this->transferredSize = 0; | |||
$this->buffer = (string)substr($this->buffer, 2); | |||
} else if ($this->chunkSize === $this->transferredSize && strlen($this->buffer) > 2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea behind my comment was to put a return here as well. The reasoning behind this is a) we're done after this point of code and there is no reason to continue running code in this function. And b) is lowers cognitive load on developers reading and working on this code.
In case you're interested in these little code readability optimizations I would suggest https://www.youtube.com/watch?v=GtB5DAfOWMQ if you haven't seen it. If you have my apologies 😄
Functionally LGTM and I'd love to get this in ASAP Can you squash this to a reasonable number of commits? 👍 |
dc57a12
to
17d5df8
Compare
Squashed the commits. I hope this is a reasonable number of commits. Tell me if not. |
3 commits looks perfectly reasonable to me 👍 |
I'd like to get these changes in, can you rebase this now that #123 is in? |
Fix Endless loop Fix Add chunk size check and chunk extension handling Handle potential test cases Add ChunkedDecoder Tests Handle potential threat Rename variable Added test to add verify single characters can be emitted Fixing remarks Use Mockbuilder
5dea067
to
4fd5d8f
Compare
Add ServerTest Fix Order
4fd5d8f
to
61d7b69
Compare
Rebased on the current master |
This ensures that only decoded body data will be emitted via the request object.
Resolves / closes #96