-
-
Notifications
You must be signed in to change notification settings - Fork 724
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HttpParser] New component #6
Comments
I'm thinking of doing something similar for Guzzle, but not as error-tolerant as these appear to be. Basically I want to move all of the parses in Guzzle (cookies, URLs, requests, responses) into the Guzzle\Parser namespace and provide interfaces so that people can use different parsing implementations (e.g. pecl_http). You would likely get a speed boost by using a native C implementation. For the sake of speed, the parsers I implement for Guzzle in PHP will be similar to what's already in Guzzle, so it might not be exactly what you're looking for. So-- I'm still thinking about adding these interfaces and making the default Guzzle implementations, but if you implement parsers in react, then I can ship Guzzle with a built in adapter for react adapters (I could call it the Inception adapter :) ). I'm also evaluating the possibility of subtree splits for Guzzle so that it would be easier to use different components. I'd love any pointers you have on how you did that for react. |
The subtree split script is in the repo: https://github.com/react-php/react/blob/master/subtree-split.sh. All it requires is |
Sweet! Thanks! |
After quite a bit of refactoring, I added support for different parsing implementations to Guzzle. There are parsers for messages, cookies, and URLs (default is PHP's parse_url()). There's support for a pecl_http message parser, and it's actually ~4.5x faster than the pure PHP parser. https://github.com/guzzle/guzzle/tree/master/src/Guzzle/Http/Parser Parsers are registered with Guzzle\Http\Parser\ParserRegistry and lazy loaded as requested. The ParserRegistry uses the PHP based versions by default. If a user is looking for performance, they can globally switch out any of the parsing implementations using this class. What do you think? Could it be improved? assuming I could subtree split the Guzzle\Http\Parsers namespace, is this something you could use? |
Impressive work. I took it for a brief spin and I liked it. I'll see about incorporating it into the React HTTP lib in the next few days. It's unfortunate the PECL lib is ~4.5x faster, because it's inaccurate: Given a header string of |
@cboden it's unclear to me the reason why the PECL lib manipulates them, but aren't HTTP header field names case-insensitive anyway? I believe the culprit is the _http_pretty_key function at line 73 in http_api.c |
You're right, but as nrk said, it shouldn't matter. I had to refactor a bit of code and a lot of tests in Guzzle so that headers are completely case-insensitive. Here's another example of the differences between the two parsers:
With Guzzle:
With Pecl:
|
@nrk Excellent point. When I wrote the WebSocket framing methods RFC6455 always wrote the examples as
That being said, I have no idea why the PECL lib would deliberately change the casing in the way it did. I'll update my code to be case insensitive. Thanks for the heads up. |
@cboden on a second thought after seeing @mtdowling's reply, I believe they did it to group repeated field names with potentially different casing into one using a comma-separated list of values (which is indeed allowed by the specs). The fact that they used an ucfirst approach for each part of a field name is probably just for the sake of the look it, as suggested by the name of the _http_pretty_key function. |
I started implementing @mtdowling's new Parser changes...as before, I was impressed with the PHP work by Michael...and disappointed by the PECL extension. Dropping in Given
It returned:
I figure with some tweaking we could check if While I'm ranting about Anyway, I like the new parser work, I'm happy to help contribute to it to get it to a stable point. A subtree split would be awesome. @igorw can weigh in, in about a week (busy prepping for conference). Final score:
|
Glad you got to take it for a spin, and I'd be happy to have any feedback or contributions. As for the pecl_http issue: It might be that pecl_http is not properly handling "\n" line endings (vs the expected \r\n). I'm finishing some heavy refactoring in Guzzle this week. After this refactoring, I expect the API to be very stable, so at that point, it would make sense to look into creating subtree splits for the following:
|
The unit test was written like |
Wow, that sucks. pecl_http seems to work for Guzzle, but it looks like it isn't something you can use. Maybe we still need an adapter for some of the HTTP parsers listed by @igorw. |
FYI section 4.2 of rfc2616 is where the spec states header field names are case-insensitive, may be folded using a comma separator, or continued (ie multi-line). Note: the cookie spec has an exception. Folding does not apply to Set-Cookie. |
The problem is that the case is sometimes violated (e.g. opera with WebSockets), so having control over this is kind of a big deal, especially when producing http messages. Great work with the parser component! A subtree split would be great. Then we can also contribute a mongrel parser patch into it. |
Also consider checking out #233. |
See reactphp/http#14 and reactphp/http#19 |
Port the mongrel http parser to PHP. Or maybe the node one.
Also provide adapters for PHP extensions:
The text was updated successfully, but these errors were encountered: