"Forbidden host code point" should be defined without U+0025 #319

Closed
GPHemsley opened this issue Jun 3, 2017 · 3 comments
Labels
clarification: Standard could be clearer; good first issue: Ideal for someone new to a WHATWG standard or software project

Comments

@GPHemsley
Member

The term "forbidden host code point" is used exactly twice within this spec, and in one case it excludes U+0025 (%):

If asciiDomain contains a forbidden host code point, validation error, return failure.

If input contains a forbidden host code point excluding U+0025 (%), validation error, return failure.

I think it is easier for implementations to add to a definition than to take away from it, so I propose removing U+0025 from the definition and special-casing it where it is included:

If asciiDomain contains a forbidden host code point or U+0025 (%), validation error, return failure.

If input contains a forbidden host code point, validation error, return failure.

Or else creating a second definition that relates to the existing one.
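
As a rough illustration of the proposal, the sketch below compares the current wording (a set that includes U+0025, with one caller excluding it) against the proposed wording (a set without U+0025, with one caller adding it). The set membership shown is illustrative only and may not match the standard exactly, and the function names are hypothetical, not taken from any implementation.

```python
# Illustrative membership only (assumption); the standard's exact set may differ.
FORBIDDEN_WITH_PERCENT = frozenset("\x00\t\n\r #%/:?@[\\]")

# Proposed definition: the same set with U+0025 (%) removed.
FORBIDDEN_HOST = FORBIDDEN_WITH_PERCENT - {"%"}


def domain_fails_current(ascii_domain: str) -> bool:
    """Current wording: contains a forbidden host code point."""
    return any(c in FORBIDDEN_WITH_PERCENT for c in ascii_domain)


def opaque_fails_current(host_input: str) -> bool:
    """Current wording: forbidden host code point excluding U+0025 (%)."""
    return any(c in FORBIDDEN_WITH_PERCENT and c != "%" for c in host_input)


def domain_fails_proposed(ascii_domain: str) -> bool:
    """Proposed wording: forbidden host code point or U+0025 (%)."""
    return any(c in FORBIDDEN_HOST or c == "%" for c in ascii_domain)


def opaque_fails_proposed(host_input: str) -> bool:
    """Proposed wording: forbidden host code point, with no exclusion."""
    return any(c in FORBIDDEN_HOST for c in host_input)
```

Under these assumptions both formulations reject and accept the same inputs; only the place where U+0025 is special-cased moves. For example, `ex%41mple` fails the domain check and passes the opaque-host check in either wording.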

@annevk added the clarification and good first issue labels and removed the non-normative label on Apr 26, 2020
@annevk
Member

annevk commented Mar 3, 2021

It's also used in https://url.spec.whatwg.org/#valid-opaque-host-string now and I think that indeed argues for changing this as it's a bit ambiguous (to me).

@alwinb
Contributor

alwinb commented Nov 14, 2021

Something similar happens with URL-path-segment strings, which use the phrase:

zero or more URL units excluding U+002F (/) and U+003F (?), that together are not a single-dot path segment or a double-dot path segment.

What do you think about rephrasing things to use sets of allowed code points for each of username, password, opaque-host, path-segment, query, and fragment instead?
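
A minimal sketch of what an allowed-set formulation might look like for the path-segment case, assuming a simplified, ASCII-only version of the URL code points (the non-ASCII range is omitted here). The names `PATH_SEGMENT_ALLOWED` and `is_valid_path_segment` are hypothetical and do not come from the standard.

```python
import re
import string

# ASCII subset of the URL code points (assumption: non-ASCII range omitted).
URL_CODE_POINTS_ASCII = frozenset(
    string.ascii_letters + string.digits + "!$&'()*+,-./:;=?@_~"
)

# Hypothetical allowed set for a path segment: URL code points minus "/" and "?".
PATH_SEGMENT_ALLOWED = URL_CODE_POINTS_ASCII - {"/", "?"}

# A percent-encoded byte: "%" followed by two ASCII hex digits.
PERCENT_ENCODED = re.compile(r"%[0-9A-Fa-f]{2}")

# Single-dot and double-dot path segments, compared case-insensitively.
DOT_SEGMENTS = {".", "%2e", "..", ".%2e", "%2e.", "%2e%2e"}


def is_valid_path_segment(segment: str) -> bool:
    """Check a segment against the allowed set instead of an exclusion phrase."""
    if segment.lower() in DOT_SEGMENTS:
        return False
    i = 0
    while i < len(segment):
        if PERCENT_ENCODED.match(segment, i):
            i += 3  # consume one percent-encoded byte
        elif segment[i] in PATH_SEGMENT_ALLOWED:
            i += 1  # consume one allowed code point
        else:
            return False
    return True
```

In this shape each component would carry its own explicit set, so no call site needs an "excluding ..." clause; whether that reads more clearly in the standard itself is the open question in this thread.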

@annevk
Member

annevk commented Nov 15, 2021

That might be reasonable, though I'd rather approach it in small incremental steps as each of those has the potential for regressions.
