Wrong answer on "\u001fx", apparently #5

Closed
domenic opened this issue Jan 6, 2017 · 2 comments

@domenic
Member

domenic commented Jan 6, 2017

Continued from web-platform-tests/wpt#4504 (comment). Repro:

"use strict";
const tr46 = require(".");
const util = require("util");

const input = "\u001fx";
const output = tr46.toASCII(input, false, tr46.PROCESSING_OPTIONS.TRANSITIONAL, false);
console.log(util.inspect(output));

This outputs 'xn--\u001fx-', whereas it should output just '\u001fx' according to @annevk's reading in web-platform-tests/wpt#4504 (comment). Updates to follow...

@domenic
Member Author

domenic commented Jan 6, 2017

The problem is this line in the spec:

Convert each label with non-ASCII characters into Punycode [RFC3492], and prefix by “xn--”. This may record an error.

which we translate as:

  labels = labels.map(function(l) {
    try {
      return punycode.toASCII(l);
    } catch(e) {
      result.error = true;
      return l;
    }
  });

It appears we are making an assumption that punycode.toASCII will be a no-op for ASCII characters (including \u001F). But it's not; the result of punycode.toASCII("\u001fx") is "xn--\u001fx-". It's not clear to me yet whether this is a punycode bug or if we should be doing the non-ASCII test ourselves. That is, it's not clear to me what the intended semantics of punycode.toASCII are.
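
One way to do the non-ASCII test ourselves is sketched below. This is only an illustration under stated assumptions, not the code tr46 ended up with: `labelToASCII` and `punycodeEncode` are hypothetical names, and `punycodeEncode` is a compact RFC 3492 encoder standing in for the library's lower-level encode export so that the snippet runs on its own. The overflow checks from the RFC are omitted.

```javascript
// Bootstring parameters for Punycode (RFC 3492, section 5).
const BASE = 36, T_MIN = 1, T_MAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128;

// Map a digit value to its character: 0..25 -> 'a'..'z', 26..35 -> '0'..'9'.
function digitToChar(d) {
  return String.fromCharCode(d < 26 ? 97 + d : 22 + d);
}

// Bias adaptation function (RFC 3492, section 6.1).
function adapt(delta, numPoints, firstTime) {
  delta = firstTime ? Math.floor(delta / DAMP) : delta >> 1;
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > ((BASE - T_MIN) * T_MAX) >> 1) {
    delta = Math.floor(delta / (BASE - T_MIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - T_MIN + 1) * delta) / (delta + SKEW));
}

// Hypothetical stand-in for a raw Punycode encoder (no "xn--" prefix,
// no ASCII check, no overflow detection).
function punycodeEncode(input) {
  const codePoints = Array.from(input, (ch) => ch.codePointAt(0));
  const basic = codePoints.filter((cp) => cp < 0x80);
  let output = String.fromCharCode(...basic);
  if (basic.length > 0) output += "-";
  let handled = basic.length;
  let n = INITIAL_N, delta = 0, bias = INITIAL_BIAS;
  while (handled < codePoints.length) {
    // Smallest code point not yet handled.
    const m = Math.min(...codePoints.filter((cp) => cp >= n));
    delta += (m - n) * (handled + 1);
    n = m;
    for (const cp of codePoints) {
      if (cp < n) delta++;
      if (cp === n) {
        // Emit delta as a generalized variable-length integer.
        let q = delta;
        for (let k = BASE; ; k += BASE) {
          const t = k <= bias ? T_MIN : k >= bias + T_MAX ? T_MAX : k - bias;
          if (q < t) break;
          output += digitToChar(t + ((q - t) % (BASE - t)));
          q = Math.floor((q - t) / (BASE - t));
        }
        output += digitToChar(q);
        bias = adapt(delta, handled + 1, handled === basic.length);
        delta = 0;
        handled++;
      }
    }
    delta++;
    n++;
  }
  return output;
}

// Following the spec line quoted above: only a label that contains a code
// point above U+007F is converted and prefixed; everything else, including
// control characters such as U+001F, passes through unchanged.
function labelToASCII(label) {
  const hasNonASCII = /[^\0-\x7F]/.test(label);
  return hasNonASCII ? "xn--" + punycodeEncode(label) : label;
}

console.log(JSON.stringify(labelToASCII("\u001fx"))); // "\u001fx"
console.log(labelToASCII("b\u00fccher")); // xn--bcher-kva
```

With the check done by the caller, the result no longer depends on how a higher-level helper decides what counts as ASCII.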

@domenic
Member Author

domenic commented Jan 6, 2017

Same bug in another tr46 library, at least according to source inspection: https://github.com/jcranmer/idna-uts46/blob/master/uts46.js#L92

It at least seems people are assuming that punycode.toASCII is a no-op on ASCII input.
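
To illustrate how that assumption can fail, here is a small sketch comparing two ways of deciding whether a string "has non-ASCII characters". Both helper names are hypothetical, and the printable-only range is an assumed stand-in for the kind of check that would produce the output seen in this issue, not a quotation of any library's source.

```javascript
// Assumed buggy variant: anything outside printable ASCII (U+0020..U+007E)
// counts as "non-ASCII", so a control character like U+001F triggers encoding.
function hasNonPrintableASCII(s) {
  return /[^\x20-\x7E]/.test(s);
}

// Variant matching the spec wording: only code points above U+007F count.
function hasNonASCII(s) {
  return /[^\0-\x7F]/.test(s);
}

for (const s of ["\u001fx", "abc", "b\u00fccher"]) {
  // Prints the string (escaped), then the result of each check.
  console.log(JSON.stringify(s), hasNonPrintableASCII(s), hasNonASCII(s));
}
```

The two checks agree on "abc" and "bücher" and disagree only on "\u001fx", which is why the difference is easy to miss with ordinary test inputs.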

domenic added a commit to domenic/tr46.js that referenced this issue Jan 6, 2017
Previously we used punycode's toASCII and toUnicode exports, which implement incomplete parts of TR46 themselves (with buggy "is ASCII" checks until recently; see jsdom#5 and mathiasbynens/punycode.js#59). Now we use the lower-level encode and decode exports, with appropriate tweaks to the surrounding code to more fully conform to the surrounding TR46 algorithm steps.

Fixes jsdom#5.
domenic added a commit to domenic/tr46.js that referenced this issue Jan 8, 2017
Previously we used punycode's toASCII and toUnicode exports, which implement incomplete parts of TR46 themselves (with buggy "is ASCII" checks until recently; see jsdom#5 and mathiasbynens/punycode.js#59). Now we use the lower-level encode and decode exports, with appropriate tweaks to the surrounding code to more fully conform to the surrounding TR46 algorithm steps.

This also upgrades to the latest punycode.js published on npm, instead of using the now-deprecated one bundled with Node.js.

Fixes jsdom#5.
domenic closed this as completed in #7 on Jan 9, 2017
domenic added a commit that referenced this issue Jan 9, 2017
Previously we used punycode's toASCII and toUnicode exports, which implement incomplete parts of TR46 themselves (with buggy "is ASCII" checks until recently; see #5 and mathiasbynens/punycode.js#59). Now we use the lower-level encode and decode exports, with appropriate tweaks to the surrounding code to more fully conform to the surrounding TR46 algorithm steps.

This also upgrades to the latest punycode.js published on npm, instead of using the now-deprecated one bundled with Node.js.

Fixes #5.