Emoji is incorrectly encoded in punycode #368

Open
xPaw opened this issue Apr 21, 2018 · 6 comments

xPaw commented Apr 21, 2018

new URI("https://🤦‍♂️.xpaw.me").normalize().hostname()
> "xn--1ug66vku9rd58h.xpaw.me"

Unicode inspector: https://apps.timwhitlock.info/unicode/inspect?s=%F0%9F%A4%A6%E2%80%8D%E2%99%82%EF%B8%8F
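
For reference, the code points in that label can be listed by decoding the percent-encoded string from the inspector URL above. This is only a small sketch; the variable names are illustrative and not part of URI.js.

```javascript
// Percent-encoded UTF-8 bytes copied from the inspector URL's `s` parameter.
const encoded = "%F0%9F%A4%A6%E2%80%8D%E2%99%82%EF%B8%8F";
const label = decodeURIComponent(encoded);

// Array.from iterates by code point, so the astral emoji stays in one piece.
const codePoints = Array.from(
  label,
  (ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
);

console.log(codePoints);
// [ 'U+1F926', 'U+200D', 'U+2642', 'U+FE0F' ]
// FACE PALM, ZERO WIDTH JOINER, MALE SIGN, VARIATION SELECTOR-16
```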

However, Chrome and https://www.punycoder.com/ encode it as https://xn--g5hz781o.xpaw.me/

What's happening here?

xPaw commented Apr 21, 2018

Chrome and Edge drop ZERO WIDTH JOINER and VARIATION SELECTOR-16 from the punycode which ends up as xn--g5hz781o.

Firefox only drops ZWJ which ends up xn--1ug66v4685b.

Looking at this: https://tools.ietf.org/html/rfc5894#section-7.2.2 dropping ZWJ is correct, however there's no word about variation selectors.
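
One way to check which code points each label actually carries is to encode the candidate sequences directly. The following is a minimal sketch of the RFC 3492 encoder, written for illustration; `punycodeEncode` and the constant names are not the punycode.js API. Under this sketch, the label reported for Firefox matches the sequence that keeps the ZWJ and omits only VS16, so the Firefox description above may have the two characters swapped.

```javascript
// Punycode parameters from RFC 3492, section 5.
const BASE = 36, T_MIN = 1, T_MAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128;

// Digit values 0..25 map to a..z, 26..35 map to 0..9.
function digitToChar(d) {
  return String.fromCharCode(d < 26 ? 97 + d : 22 + d);
}

// Bias adaptation from RFC 3492, section 6.1.
function adapt(delta, numPoints, firstTime) {
  delta = firstTime ? Math.floor(delta / DAMP) : Math.floor(delta / 2);
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > Math.floor(((BASE - T_MIN) * T_MAX) / 2)) {
    delta = Math.floor(delta / (BASE - T_MIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - T_MIN + 1) * delta) / (delta + SKEW));
}

// Encode an array of code points into a Punycode string (without "xn--").
function punycodeEncode(codePoints) {
  // Basic (ASCII) code points are copied to the output first.
  const basic = codePoints.filter((c) => c < INITIAL_N);
  let output = basic.map((c) => String.fromCharCode(c)).join("");
  const basicCount = basic.length;
  let handled = basicCount;
  if (basicCount > 0) output += "-";

  let n = INITIAL_N, delta = 0, bias = INITIAL_BIAS;
  while (handled < codePoints.length) {
    // Next smallest code point that has not been handled yet.
    const m = Math.min(...codePoints.filter((c) => c >= n));
    delta += (m - n) * (handled + 1);
    n = m;
    for (const c of codePoints) {
      if (c < n) delta++;
      if (c === n) {
        // Emit delta as a variable-length integer.
        let q = delta;
        for (let k = BASE; ; k += BASE) {
          const t = k <= bias ? T_MIN : k >= bias + T_MAX ? T_MAX : k - bias;
          if (q < t) break;
          output += digitToChar(t + ((q - t) % (BASE - t)));
          q = Math.floor((q - t) / (BASE - t));
        }
        output += digitToChar(q);
        bias = adapt(delta, handled + 1, handled === basicCount);
        delta = 0;
        handled++;
      }
    }
    delta++;
    n++;
  }
  return output;
}

const FACE_PALM = 0x1f926, ZWJ = 0x200d, MALE_SIGN = 0x2642, VS16 = 0xfe0f;

// All four code points: the label URI.js produces.
console.log("xn--" + punycodeEncode([FACE_PALM, ZWJ, MALE_SIGN, VS16])); // xn--1ug66vku9rd58h
// ZWJ kept, VS16 omitted: the label reported for Firefox.
console.log("xn--" + punycodeEncode([FACE_PALM, ZWJ, MALE_SIGN])); // xn--1ug66v4685b
// Both omitted: the label reported for Chrome and Edge.
console.log("xn--" + punycodeEncode([FACE_PALM, MALE_SIGN])); // xn--g5hz781o
```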

@rodneyrehm
Member

Unfortunately I have no idea how emojis in domains should behave.

We could try updating punycode from 1.4.0, which we currently use, to 1.4.1. Unfortunately, 2.0.0 seems to have dropped legacy browser support.

xPaw commented Apr 21, 2018

It seems that the IDNA mapping rules (UTS #46) should be applied before the domain is converted to Punycode: https://unicode.org/reports/tr46/

I have a test page at https://xn--g5hz781o.xpaw.me/ which I set up to test various browsers.

punycode.js doesn't seem to implement it sadly:

There is https://github.com/jcranmer/idna-uts46 which could probably solve the problem here, but that library is crazy big.

@rodneyrehm
Member

Maybe @mathiasbynens has thoughts on this?

n4ru commented Nov 15, 2021

> Chrome and Edge drop ZERO WIDTH JOINER and VARIATION SELECTOR-16 from the punycode which ends up as xn--g5hz781o.
>
> Firefox only drops ZWJ which ends up xn--1ug66v4685b.
>
> Looking at this: https://tools.ietf.org/html/rfc5894#section-7.2.2 dropping ZWJ is correct, however there's no word about variation selectors.

Was there a conclusion regarding whether or not variation selectors should be dropped?

jarthod commented Dec 6, 2022

For the record, the latest idnaMappingTable (Unicode v15) seems to say the variation selectors should be ignored/dropped:

FE00..FE0F    ; ignored                                # 3.2  VARIATION SELECTOR-1..VARIATION SELECTOR-16
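
As a rough illustration of what that table row implies, the variation selectors could be stripped from a label before it is handed to a Punycode encoder. The sketch below is an assumption-laden example, not a UTS #46 implementation: `mapEmojiLabel` and its `transitional` option are hypothetical names, and only two table entries are handled (FE00..FE0F as "ignored" and, as far as I understand the table, 200C..200D as "deviation" characters that are removed only under transitional processing).

```javascript
// Hypothetical helper covering only the two mapping entries discussed in
// this thread. A real implementation would consult the full IdnaMappingTable.
function mapEmojiLabel(label, { transitional = false } = {}) {
  // FE00..FE0F (VARIATION SELECTOR-1..16) are listed as "ignored": always removed.
  let mapped = label.replace(/[\uFE00-\uFE0F]/g, "");
  // Assumption: 200C..200D (ZWNJ, ZWJ) are "deviation" characters, removed
  // only when transitional processing is requested.
  if (transitional) {
    mapped = mapped.replace(/[\u200C\u200D]/g, "");
  }
  return mapped;
}

// Format a string as a list of hexadecimal code points.
const toHex = (s) =>
  Array.from(s, (ch) => ch.codePointAt(0).toString(16).toUpperCase());

// FACE PALM, ZWJ, MALE SIGN, VS16 written with escapes so nothing is lost in copy/paste.
const emoji = "\u{1F926}\u200D\u2642\uFE0F";

console.log(toHex(mapEmojiLabel(emoji)));
// [ '1F926', '200D', '2642' ]
console.log(toHex(mapEmojiLabel(emoji, { transitional: true })));
// [ '1F926', '2642' ]
```

The second result is the sequence behind the xn--g5hz781o label mentioned earlier in the thread.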
