-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
URL: test the host parser #4504
Conversation
Most of these fail in jsdom/whatwg-url too. Is this with a spec update that we haven't applied? It would be ideal if these could be added to urltestdata.json and setters_tests.json, as then other testing environments (whatwg-url, Node.js's parser) would easily be able to track them. The fact that Chrome and Firefox generally fail most of these tests is a bit worrying. |
This is the specification as-is. I also wanted to test various schemes, which would *n the number of tests. I guess I could attempt to output JSON, but seems weird without source script. |
Including a source script that outputs JSON would work great for us. It sounds like then we might have some serious bugs in the host parser in whatwg-url. I'll try to diagnose some time. |
Might have bugs, but Safari TP passes close to all. |
So here is an example. For Apparently our interpretation of
When applied to the string /cc @Sebmaster |
No idea, but http://www.unicode.org/reports/tr46/ has the answer I suppose. We need to use the mapping table. http://www.unicode.org/reports/tr46/#IDNA_Mapping_Table. Per http://www.unicode.org/Public/idna/latest/IdnaMappingTable.txt U+001F is disallowed_STD3_valid. So it's valid for our purposes since we pass UseSTD3ASCIIRules=false. Then we see in step 3 of http://www.unicode.org/reports/tr46/#ToASCII that applying Punycode processing is not appropriate here. mathiasbynens/punycode.js#12 might be part of the issue here? |
OK, tracked things down in jsdom/whatwg-url. After jsdom/tr46#7, the failures are:
(here "expected" = by this PR, "actual" = produced by jsdom/whatwg-url) Mind giving those three a closer look to see where the bug is, @annevk? |
The answer to the "-" question by the way is that apparently user agents (except Firefox) only enforce that requirement for ToASCII when the input contains non-ASCII code points. I'm not sure about the "." issue. As far as I can tell the IPv4 parser should return input in step 6.1. |
There's over 3000 IDNA tests, which times 9 makes for well over 30000 tests. They can load and execute in a browser (many will fail), but maybe there's a better way to do this? |
@jgraham I'm getting "The log length has exceeded the limit of 4 MB (this usually means that the test suite is raising the same exception over and over)." I guess I could split this up and instead run ~3000 tests per resource (instead of doing all APIs in one resource). Would that fit? |
61046fb
to
a41d3a4
Compare
So host-parser-reduced is somewhat useful in that it gives faster results, but there's so many IDNA tests (and browsers fail so many!) that it makes it hard to analyze things. |
One thing I did found out is that Safari still doesn't implement Nontransitional_Processing. |
This needs to be updated to use a newer IdnaTest.txt, it should use @zcorpan's work on splitting the test so we don't run too many tests at once, and the results need to be verified since it's unclear if IdnaTest.txt is a 100% accurate. Since I've created a bunch of simpler IDNA tests and user agents aren't even fixing those this simply hasn't become a priority. |
I am interested in picking these patches up and pushing them over the finish line. In particular, I noticed inconsistencies in how browsers parse domains, and I want these inconsistencies documented. For example: new URL('http://hello^/') // throws in Gecko and Blink; succeeds in EdgeHTML
new URL('http://hello</') // throws in Gecko; succeeds in EdgeHTML and Blink
|
@strager that's great, a new PR would be best I think. whatwg/url#459 is likely of interest given your examples. If you could start with writing tests for that PR that'd be terrific. And yeah, it doesn't matter if the tests fail, but when that happens we do ought to evaluate whether a change to https://url.spec.whatwg.org/ is warranted. |
I finally got around to creating a successor for |
See whatwg/url#159 for context.