Allow specifying additional "special" schemes. #749

Open
tmccombs opened this issue Feb 5, 2023 · 3 comments
Labels
clarification Standard could be clearer

Comments

@tmccombs

tmccombs commented Feb 5, 2023

The parsing algorithm behaves differently for certain schemes that are considered "special". In addition, the scheme of a non-special URL cannot be changed to a special scheme. In some applications, especially non-web-browser applications, it is desirable for additional schemes to be treated the same way as the listed special schemes, and to be able to change the protocol/scheme to and from other special schemes.
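To make the difference concrete, here is a minimal sketch using the global `URL` class as shipped in browsers and Node.js. The `git` scheme and the `example.com` host are illustrative choices only. It shows two of the behaviors that differ for non-special schemes (the host is not lowercased and an empty path is not normalized to `/`), and that the `protocol` setter silently ignores a change from a non-special scheme to a special one.

```typescript
// Special scheme: the host is lowercased and an empty path becomes "/".
const special = new URL("https://EXAMPLE.com");
console.log(special.hostname, JSON.stringify(special.pathname));

// Non-special scheme: the host is kept as an opaque host (case preserved)
// and the path stays empty.
const nonSpecial = new URL("git://EXAMPLE.com");
console.log(nonSpecial.hostname, JSON.stringify(nonSpecial.pathname));

// Changing a non-special scheme to a special one is silently ignored.
const git = new URL("git://example.com/repo.git");
git.protocol = "https";
console.log(git.protocol); // still the original scheme

// Changing between two special schemes is allowed.
const web = new URL("http://example.com/");
web.protocol = "https";
console.log(web.protocol);
```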

I think there are a few ways this could be addressed:

  1. Change the API to allow passing a list of additional special schemes into the constructor for URL
  2. Change the API to allow specifying that a URL should be treated as a special URL during construction
  3. Add a new URLFactory (or URLBuilder) class that allows configuring the set of special schemes for any URLs created with it.
  4. Do not specify any additional required API, but say that an implementation is allowed to treat additional schemes as special, and potentially include an API for registering additional special schemes.
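As a rough illustration of what option 3 might feel like to a caller, the sketch below wraps the existing `URL` class. Everything here is hypothetical: `UrlFactory`, its constructor argument, and the `parse` return shape are not part of the URL Standard or any library. It is also not a real implementation of the proposal; it only approximates special-scheme parsing by swapping the custom scheme for `https` before parsing and reporting the original scheme alongside the result.

```typescript
// Hypothetical sketch: none of these names exist in the URL Standard.
class UrlFactory {
  private readonly extraSpecial: Set<string>;

  constructor(extraSpecialSchemes: string[]) {
    this.extraSpecial = new Set(extraSpecialSchemes.map((s) => s.toLowerCase()));
  }

  // Returns the scheme the caller wrote plus a URL object. For schemes
  // registered with the factory, the URL is parsed under "https" rules.
  parse(input: string): { scheme: string; url: URL } {
    const trimmed = input.trim();
    const match = /^([A-Za-z][A-Za-z0-9+.-]*):/.exec(trimmed);
    if (match === null || !this.extraSpecial.has(match[1].toLowerCase())) {
      // Not a registered scheme: defer entirely to the standard parser.
      const url = new URL(trimmed);
      return { scheme: url.protocol.slice(0, -1), url };
    }
    // Assumption of this sketch: borrow the "https" parsing rules by
    // replacing the scheme prefix before handing the string to URL.
    const url = new URL("https:" + trimmed.slice(match[0].length));
    return { scheme: match[1].toLowerCase(), url };
  }
}

const factory = new UrlFactory(["git"]);
const registered = factory.parse("git://EXAMPLE.com");
const unregistered = factory.parse("sftp://EXAMPLE.com");
console.log(registered.scheme, registered.url.hostname, registered.url.pathname);
console.log(unregistered.scheme, unregistered.url.hostname);
```

The swap has visible side effects. For example, an explicit port 443 is dropped because it is the default port for `https`, and the returned object serializes with the borrowed scheme rather than the original one. Those gaps are part of why a wrapper like this is only an approximation of the request.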

Some examples of schemes that applications may wish to treat as special:

  • git
  • sftp
  • gopher
  • http+unix and https+unix or similar (in fact, maybe it would be worth specifying that an existing special scheme followed by a "+" and a suffix is also a special scheme?)
  • custom scheme intended for opening an http resource in a specific application

I follow the rust-url repository, which aims at implementing this specification, and issues related to this come up pretty frequently. For example:

Related issues for this repository:

@annevk
Member

annevk commented Feb 6, 2023

I think the answer here is a fifth option: it's worth clarifying in the standard that this is a non-goal, as it indeed occasionally comes up.

Instead, what you'd do is define a processor that takes a URL and turns it into a data structure suitable for further usage. For example, this is what we do in https://fetch.spec.whatwg.org/#data-urls for data: URLs. Such a scheme-specific processor can take care of adding a path, further processing an opaque host, etc.
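A scheme-specific processor along these lines could be sketched as follows. This is an illustration under stated assumptions, not something defined by the URL Standard: the `GitTarget` shape and the `processGitUrl` name are hypothetical, and the choices to lowercase the host, default the path to `/`, and fall back to port 9418 (the conventional git protocol port) are decisions of this sketch.

```typescript
// Hypothetical result type for a "git:" processor.
interface GitTarget {
  host: string;
  port: number;
  path: string;
}

// Parses with the standard URL parser first, then applies git-specific
// post-processing to the parts the parser leaves untouched for
// non-special schemes.
function processGitUrl(input: string): GitTarget {
  const url = new URL(input);
  if (url.protocol !== "git:") {
    throw new TypeError("expected a git: URL");
  }
  if (url.hostname === "") {
    throw new TypeError("git: URL needs a host");
  }
  return {
    // The opaque host keeps its case, so normalize it here.
    host: url.hostname.toLowerCase(),
    // Non-special schemes have no default port; pick one in the processor.
    port: url.port === "" ? 9418 : Number(url.port),
    // Add the path that a special scheme would have received.
    path: url.pathname === "" ? "/" : url.pathname,
  };
}

const bare = processGitUrl("git://EXAMPLE.com");
const full = processGitUrl("git://example.com:1234/repo.git");
console.log(bare, full);
```

The parser's view of the string stays the same in every implementation, and only code that claims to support the scheme applies the extra normalization.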

The reason for that is that URL parsing ought to be stable over time and across implementations. Implementations should not have differing views as to what a URL string represents, how it serializes once parsed, etc. And if URLs are further processed, ideally that aligns across implementations as well, but that will only happen in implementations purporting to support the scheme, which will be a subset.

@annevk annevk added the clarification Standard could be clearer label Feb 6, 2023
@tmccombs
Author

tmccombs commented Feb 6, 2023

My point is that a subset of custom schemes are basically identical to http/https, but use a different scheme to convey some additional information. Such a separate processor would have to duplicate a lot of what the URL parser already implements.

@annevk
Member

annevk commented Feb 6, 2023

Yeah, understood.
