Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Illegal characters in URL do not raise an exception #41976

Closed
GuillaumeBlanchet opened this issue Feb 14, 2022 · 1 comment
Closed

Illegal characters in URL do not raise an exception #41976

GuillaumeBlanchet opened this issue Feb 14, 2022 · 1 comment
Labels
whatwg-url Issues and PRs related to the WHATWG URL implementation. wrong repo Issues that should be opened in another repository.

Comments

@GuillaumeBlanchet
Copy link

GuillaumeBlanchet commented Feb 14, 2022

Version

v17.5.0

Platform

Linux deskt 5.13.0-28-generic #31~20.04.1-Ubuntu SMP Wed Jan 19 14:08:10 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Subsystem

No response

What steps will reproduce the bug?

Open a terminal and enter the following commands (first is optional if you have already node on your system):
$ docker run -i -t node:17.5 bash
:/# node -i
> new URL('http://sub_domain.domain.tld');

How often does it reproduce? Is there a required condition?

Happens all the time

What is the expected behavior?

I would expect to have an output like this:

Uncaught TypeError [ERR_INVALID_URL]: Invalid URL
at __node_internal_captureLargerStackTrace (node:internal/errors)
at new NodeError (node:internal/errors)
at onParseError (node:internal/url)
at new URL (node:internal/url) {
input: 'http://sub_domain.domain.tld',
code: 'ERR_INVALID_URL'
}

Since all ASCII characters except A-Z, a-z, 0-9 and U+002D ( - ) are illegals in domain names, see the RFC about LDH names (Letters, Digits, and Hyphens)

What do you see instead?

URL {
href: 'http://sub_domain.domain.tld/',
origin: 'http://sub_domain.domain.tld',
protocol: 'http:',
username: '',
password: '',
host: 'sub_domain.domain.tld',
hostname: 'sub_domain.domain.tld',
port: '',
pathname: '/',
search: '',
searchParams: URLSearchParams {},
hash: ''
}

Additional information

> new URL('http://sub+domain.domain.tld');
> new URL('http://sub&domain.domain.tld');
> new URL('http://sub?domain.domain.tld');

and so on, produce the same behavior, violating LDH rule for domain names. Please consider instantiate ICU always with the option: UIDNA_USE_STD3_RULES at line

if (mode == IDNA_STRICT) {
.

As a side note, the proper way to instantiate the ICU lib to support IDNA2008 is the following:

uint32_t options =                  // CheckHyphens = false; handled later
    UIDNA_CHECK_BIDI |                // CheckBidi = true
    UIDNA_CHECK_CONTEXTJ |            // CheckJoiners = true
    UIDNA_CHECK_CONTEXTO |  
    UIDNA_NONTRANSITIONAL_TO_ASCII |   // Nontransitional_Processing
    UIDNA_NONTRANSITIONAL_TO_UNICODE |
    UIDNA_USE_STD3_RULES);
  UIDNA* uidna = uidna_openUTS46(options, &status);
@aduh95
Copy link
Contributor

aduh95 commented Feb 14, 2022

Node.js follows the WHATWG URL spec, which does not forbids those characters it seems: https://url.spec.whatwg.org/#host-miscellaneous. This behavior is consistent with Firefox, Chromium, Safari, and Deno.

If you think this API should restrict the allowed characters, please open an issue on their repo: https://github.com/whatwg/url.

I'm going to close this now, but if you think I missed something don't hesitate to re-open or to ask more questions.

@aduh95 aduh95 closed this as completed Feb 14, 2022
@aduh95 aduh95 added whatwg-url Issues and PRs related to the WHATWG URL implementation. wrong repo Issues that should be opened in another repository. labels Feb 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
whatwg-url Issues and PRs related to the WHATWG URL implementation. wrong repo Issues that should be opened in another repository.
Projects
None yet
Development

No branches or pull requests

2 participants