-
Notifications
You must be signed in to change notification settings - Fork 29.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[v12.x] http: fix monkey-patching of http_parser #30222
Conversation
Fixes inability to monkey-patch the HTTP Parser on Node v12.x. This is not an issue on Node v13+ since the parser was renamed back to the old (already monkey-patchable) `http_parser` name, however the test in master could also use updating. Fixes: creationix/http-parser-js#65
cbc4882
to
32bf7f6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this okay, but understand that this kind of monkey-patching is generally not supported or encouraged…
Understood! However, since Node's HTTP parser historically throws process-killing exceptions, or at least fails to parse, when encountering slightly malformed HTTP, it's pretty much a requirement if you want to do any website scraping or indexing using Node (and still take advantage of the robust NPM ecosystem that obviously calls directly into |
@Jimbly Right – but in that case it would be much nicer to know what the parser fails to parse, and figure out a way to make that usable through the supported API. |
There's lots of stuff that the current parser fails to parse, but it fails to parse it because it is technically invalid HTTP. For security and stability reasons, I think it's probably important that Node's HTTP parser only accepts actually valid HTTP (at least when parsing requests - people almost entirely leverage |
@Jimbly Yeah, that lenient mode would probably be a good idea. Or to put it another way: Monkey-patching is simply not a viable long-term solution for this problem. This PR is a short-term workaround, and code will be broken when people hook into private, internal APIs. |
Of course! If someone wants to add a lenient mode to the parser in core node, I can happily gather up which of the tests fail on Node (basically anything added to that tree without a commit of "update for node v4", etc). But I certainly don't have the bandwidth to tackle adding it myself, nor would I want to go anywhere near trying to add complexity to code presumably in Node's critical performance path. |
I think that would be a huge help, if you’re willing to do that.
Yeah, that’s fine :) |
@addaleax Here ya go!: https://github.com/creationix/http-parser-js/blob/master/tests/README.md Seems the new Node HTTP parser gracefully handles a lot more things. At least, none of the tests are crashing the Node process anymore, and a few of the old tests now mostly work (bad headers turning into empty strings in the headers list, but I'm sure that's fine). Only two real issues (at least, that we have tests for - but usually we try to add a test if a user reports something) - dealing with ambiguous chunked encoding + content-length and users opting in to allowing headers larger than the old 80KB limit (now 8KB limit in Node v12+, it seems). |
Thank you! I’m not sure, but @indutny – would you be willing/able to help with some of the changes that would have to go into llhttp for a lenient mode? I’d probably be able to do that, but you already know your way around that codebase. (There’s at least one TODO commment in |
I dont mean to hijack this thread, but another case @Jimbly 's parser handles is http responses where the content-length header is duplicated, with the same value. I didn't see this case mentioned explicitly in the test notes above, so I thought I would mention it. Here is what the
Browsers seem to handle this fine. Node's http parser crashes with a code: I believe this case should be reasonable with the actual content length values are correct. A lenient mode would be greatly appreciated. I would love to help with this effort, if that is ok. I can imagine this mode is something I would enable on a case-by-case basis, not enabled the whole time. |
@bdurrani You're not hijacking at all, at least, no more than it already was, but that's mostly my fault =). I forgot about that case! I've added it to the |
@addaleax I'll be happy to help you. Feel free to send me questions over email, IRC, or on a github issue. |
Fixes inability to monkey-patch the HTTP Parser on Node v12.x. This is not an issue on Node v13+ since the parser was renamed back to the old (already monkey-patchable) `http_parser` name, however the test in master could also use updating. Fixes: creationix/http-parser-js#65 PR-URL: #30222 Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Michaël Zasso <[email protected]>
landed in 54635f5 |
Fixes inability to monkey-patch the HTTP Parser on Node v12.x. This is not an issue on Node v13+ since the parser was renamed back to the old (already monkey-patchable) `http_parser` name, however the test in master could also use updating. Fixes: creationix/http-parser-js#65 PR-URL: #30222 Reviewed-By: Anna Henningsen <[email protected]> Reviewed-By: Michaël Zasso <[email protected]>
Fixes inability to monkey-patch the HTTP Parser on Node v12.x. This is not an issue on Node v13+ since the parser was renamed back to the old (already monkey-patchable)
http_parser
name, so I made this PR to the v12.x branch, but I'm unclear as to whether or not that's the right move. However the test in master could also use updating so this doesn't break again (I'll send a separate PR for that).Fixes: creationix/http-parser-js#65
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes