Remove URL.domainToASCII and URL.domainToUnicode #63

Closed
annevk opened this issue Aug 16, 2015 · 14 comments

Comments

@annevk
Member

annevk commented Aug 16, 2015

They are still not implemented and it's no longer clear to me this is the best API. In particular:

  1. We might want an API around hosts in general. E.g., new URLHost(...) (the URL prefix for disambiguation).
  2. Converting an entire URL to Unicode might be something we want to cover too.

For both of these, overloading toString() to take an argument about how to serialize seems somewhat compelling, given the precedent in JavaScript. Though need to check with @domenic.

@domenic
Member

domenic commented Aug 16, 2015

I dunno, I like the current APIs better than a new object. It's very simple: string to string. If URLHost actually had a bunch of useful behaviors, I could be convinced, I guess.

Overloading toString seems reasonable as long as there's a strong default.

@annevk
Member Author

annevk commented Aug 16, 2015

The default would be ASCII, since that's what happens for URLs today. The other thing I was thinking of for URLHost would be public suffix stuff, but that could be done in a static too I suppose, would have to look into the API.

Perhaps the current situation is fine, and we just need to add URLUtils.prototype.toString({as:"unicode"}) or some such for serializing URLs.
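As a rough illustration of the serialization option floated above, here is a minimal sketch written as a standalone function rather than a `toString()` overload. It assumes Node.js, whose `url.domainToUnicode` is used for the host conversion. The name `toDisplayString` and the `as` option are hypothetical and mirror the `{as:"unicode"}` idea only; nothing like this was standardized.

```javascript
const { domainToUnicode } = require('node:url');

// Hypothetical helper: serialize a URL with either an ASCII or a Unicode host.
// The default is ASCII, matching how URLs serialize today.
function toDisplayString(input, { as = 'ascii' } = {}) {
  const url = new URL(input);
  if (as !== 'unicode' || url.hostname === '') return url.href;

  const host = domainToUnicode(url.hostname);
  // domainToUnicode returns '' when it cannot convert; keep the ASCII form then.
  if (host === '') return url.href;

  const credentials = url.username
    ? url.username + (url.password ? `:${url.password}` : '') + '@'
    : '';
  const port = url.port ? `:${url.port}` : '';
  return `${url.protocol}//${credentials}${host}${port}${url.pathname}${url.search}${url.hash}`;
}

const asciiForm = toDisplayString('https://español.com/path?q=1#frag');
const unicodeForm = toDisplayString('https://español.com/path?q=1#frag', { as: 'unicode' });
console.log(asciiForm);   // https://xn--espaol-zwa.com/path?q=1#frag
console.log(unicodeForm); // https://español.com/path?q=1#frag
```

The sketch converts only the host. Percent-encoded path, query, and fragment components are left as the parser produced them.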

@annevk
Member Author

annevk commented Aug 16, 2015

Filed #64 for that. Public suffix is https://www.w3.org/Bugs/Public/show_bug.cgi?id=25865 but whether that will ever become an API is still unclear. I guess I should work on the underlying infrastructure first anyway.

@annevk
Member Author

annevk commented Oct 19, 2016

Going to remove these for real now as they still haven't been implemented. There might be utility in "display URL" APIs once there's a bit more interest from implementers.

@annevk annevk reopened this Oct 19, 2016
@annevk annevk closed this as completed in 2bd0f59 Oct 19, 2016
@indolering

This is frustrating because implementing this in JavaScript requires large tables and complex rules for proper Bidi support.

@achristensen07
Collaborator

I wouldn't be opposed to implementing something in WebKit. Right now the URL object can be used to effectively access something like ICU's uidna_nameToASCII from JavaScript, but there's no way to access anything like ICU's uidna_nameToUnicode from JavaScript. It would be trivial to add and wouldn't increase browser binary size, but it would greatly reduce the content size of projects that really want to do this, like indolering seems to.
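To illustrate the asymmetry described above, here is a minimal sketch assuming a runtime with a WHATWG-style `URL` implementation (a browser or Node.js). The helper name `hostToASCII` is hypothetical; it simply leans on the URL parser's host processing.

```javascript
// Hypothetical helper: let the URL parser produce the ASCII (Punycode) host.
function hostToASCII(domain) {
  // Host parsing for an https: URL applies IDNA ToASCII processing.
  return new URL(`https://${domain}`).hostname;
}

const ascii = hostToASCII('español.com');
console.log(ascii); // xn--espaol-zwa.com

// There is no reverse direction: feeding the ASCII form back in
// yields the ASCII form again, never the Unicode one.
const roundTrip = hostToASCII(ascii);
console.log(roundTrip === ascii); // true
```

Note that this throws a `TypeError` for input the parser rejects, and it is not a validator for arbitrary strings: anything after a `/`, `?`, or `#` in the input would be parsed as part of the URL rather than the host.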

@annevk
Member Author

annevk commented Mar 14, 2017

I recommend filing a new issue. It'd be great to know what we need to expose around hosts. Do we need to expose IP-address parsing for instance? I've heard requests for exposing various comparison operations: is A a subdomain of B and such. One time we got a request to expose public suffix in some way since applications might also have to use that information in their UI.

Then there's the question of whether we should expose ToUnicode or whether we should expose the browser UI version of ToUnicode (which doesn't always display Unicode, depending on the browser).
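One of the comparison operations mentioned above, "is A a subdomain of B", could be approximated today along these lines. This is only a sketch under simplifying assumptions: both inputs are domain names (not IP addresses), and public suffixes and trailing dots are ignored. The names `canonicalHost` and `isSubdomainOf` are hypothetical.

```javascript
// Hypothetical helper: canonicalize a host via the URL parser
// (lowercasing plus IDNA ToASCII processing).
function canonicalHost(host) {
  return new URL(`https://${host}`).hostname;
}

// Hypothetical helper: true when `a` is a strict subdomain of `b`.
function isSubdomainOf(a, b) {
  const hostA = canonicalHost(a);
  const hostB = canonicalHost(b);
  // Require a label boundary so "badexample.com" does not match "example.com".
  return hostA !== hostB && hostA.endsWith(`.${hostB}`);
}

console.log(isSubdomainOf('www.example.com', 'example.com'));        // true
console.log(isSubdomainOf('badexample.com', 'example.com'));         // false
console.log(isSubdomainOf('WWW.Español.com', 'xn--espaol-zwa.com')); // true
```

Comparing the canonical ASCII forms means Unicode and Punycode spellings of the same name are treated as equal.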

@indolering

> Right now the URL object can be used to effectively access something like ICU's uidna_nameToASCII from JavaScript, but there's no way to access anything like ICU's uidna_nameToUnicode from JavaScript.

It's not just me: you HAVE to run DNS labels through uidna_nameToASCII to figure out whether they are valid, and every DNS server stores internationalized domain names in Punycode. It's just that most people have settled for the incomplete Punycode.js library. This is bad because Punycode.js will accept invalid input and fail to map characters correctly. There is a library that claims to support everything except Bidi, but it hasn't seen an update in two years.

> I recommend filing a new issue. It'd be great to know what we need to expose around hosts.

While I personally would like some of those features, wouldn't they be an implementation hazard? I think the current proposal is just fine...

To me, it feels like this is another neglected i18n issue, which is sad since it sounds like exposing this functionality directly would be easy and impose little maintenance burden. FWIW, Node implemented this proposal and they removed the JS library because the ICU implementation is ~10x faster.

At any rate: #274

@annevk
Member Author

annevk commented Mar 15, 2017

@indolering thanks. The main problem we had last time around was that nobody implemented it, but it seems we can at least count on WebKit and maybe that is enough to convince others. Having a decent test suite would probably help too. IDNA parsing is rather underdeveloped still.

@indolering

@annevk

It sounds like Node is willing to expose the ICU interface, so perhaps it would be easy to port to Blink?

@sleevi

sleevi commented Mar 15, 2017 via email

@TimothyGu
Member

@indolering Node.js also allows building a smaller version that does not have ICU enabled (--with-intl=none), though the default builds have it enabled.

@indolering

Correction, Node.js has implemented it. Looks like a ~50 line commit.
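For reference, a short sketch of how the Node.js versions are called. In Node.js they are exported from the `url` module rather than living on `URL` as static methods; the example domains below are illustrative.

```javascript
const { domainToASCII, domainToUnicode } = require('node:url');

// Unicode -> ASCII (Punycode) serialization of a domain.
const asciiHost = domainToASCII('español.com');
console.log(asciiHost); // xn--espaol-zwa.com

// ASCII (Punycode) -> Unicode serialization of a domain.
const unicodeHost = domainToUnicode('xn--espaol-zwa.com');
console.log(unicodeHost); // español.com

// Invalid domains yield an empty string instead of throwing.
const invalidHost = domainToASCII('xn--iñvalid.com');
console.log(invalidHost === ''); // true
```

Because failure is signalled by an empty string rather than an exception, callers need to check for `''` before using the result.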

@armanbilge

armanbilge commented Jan 2, 2022

I raised some questions about the Node.js implementation of these methods in nodejs/node#41343 (linked above). If anyone could take a look at it, I'd be much obliged—seeing as these methods weren't really implemented anywhere else there doesn't seem to be any other implementation to compare against. Thanks!
