URL: Add a way to set a default schema #2311

Open
wants to merge 1 commit into
base: master

Geod24
Contributor

@Geod24 Geod24 commented May 27, 2019

One very common need is to parse a user-provided struct into a URL.
It is currently quite tedious to do, as one of the requirements is to have a schema.
This makes it slightly easier for the common and simple cases.

@s-ludwig
Member

s-ludwig commented Jun 4, 2019

This approach has a similar issue as #2308, where just a host name is accepted as a valid URL (which it isn't). In conjunction with the explicit default schema, it would make sense to accept "//host" as a protocol-relative URL, though, as that would allow keeping the current definition of URL as a non-relative URL.

Maybe it makes sense to add a parseFuzzyURL(str, default_schema, default_host, default_port) that allows interpreting URLs similar to a web browser. It could also allow relative paths with a base_path argument.
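To make the proposal concrete, here is a minimal sketch of what such a lenient parser might do. It is written in Python rather than against vibe.d's D API, and the name `parse_fuzzy_url`, its parameters, and the `FuzzyURL` result type are hypothetical stand-ins for the `parseFuzzyURL` idea above, not an existing function. It assumes three lenient input forms: protocol-relative (`//host`), path-only (`/path`, which needs a default host), and a bare host name.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class FuzzyURL(NamedTuple):
    schema: str
    host: str
    port: Optional[int]
    path: str


# An explicit schema followed by "//", e.g. "http://".
_HAS_SCHEMA = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def parse_fuzzy_url(text, default_schema, default_host=None, default_port=None):
    """Hypothetical helper: interpret `text` leniently, filling in defaults."""
    text = text.strip()
    if _HAS_SCHEMA.match(text):
        candidate = text                              # already a full URL
    elif text.startswith("//"):
        candidate = f"{default_schema}:{text}"        # protocol-relative URL
    elif text.startswith("/"):
        if default_host is None:
            raise ValueError("a path-only input needs a default host")
        candidate = f"{default_schema}://{default_host}{text}"
    else:
        candidate = f"{default_schema}://{text}"      # bare host name

    parts = urlsplit(candidate)
    host = parts.hostname or default_host
    if not host:
        raise ValueError(f"no host in {text!r} and no default host given")
    port = parts.port if parts.port is not None else default_port
    return FuzzyURL(parts.scheme, host, port, parts.path or "/")
```

With this sketch, `parse_fuzzy_url("example.com", "https")` and `parse_fuzzy_url("//example.com", "https")` both resolve to the `https` schema and host `example.com`, while the strict parser could keep rejecting a bare host name.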

@Geod24 Geod24 changed the title from "URL: Add a way to set a default scheme" to "URL: Add a way to set a default schema" Jun 26, 2019
@Geod24
Contributor Author

Geod24 commented Sep 15, 2021

@s-ludwig : I was looking at this again and got a bit puzzled.

You mentioned:

where just a host name is accepted as a valid URL (which it isn't).

What definition of URL are we using? I assume this one?

@s-ludwig
Member

In this case, I'd refer to RFC1738. For schemas that do not fall in that category, something like this would be valid, though: foo:bar.

@s-ludwig
Member

Yeah, but the one you mentioned would be what I'd associate with parseFuzzyURL (or whatever it would be called).

@Geod24
Contributor Author

Geod24 commented Sep 15, 2021

But RFC1738 is obsolete (although there's probably a lot of overlap)... The RFC tracker doesn't make this very obvious, but AFAIK the "proper" RFC for this would be RFC 3986. That's also the one mentioned in the above document.

@Geod24
Contributor Author

Geod24 commented Sep 15, 2021

Note the goals section.

@s-ludwig
Member

But RFC1738 is obsolete (although there's probably a lot of overlap)... The RFC tracker doesn't make this very obvious, but AFAIK the "proper" RFC for this would be RFC 3986. That's also the one mentioned in the above document.

I think this is slightly beside the point. RFC 3986 is a revision of the original generic URI definition (RFC 2396), which in turn generalizes RFC 1738. So it touches parts of RFC 1738, but the more specific parts that define the "Common Internet Scheme Syntax" are still relevant for the schemas that are modeled that way (e.g. http(s)), although there are changes to the gopher and telnet schemas mentioned (RFC 4266 and RFC 4248).

So in essence, the generic URL parser should only assume generic URL syntax, as defined by RFC 3986, but for known URL schemas, it should perform proper schema specific validation and extended parsing (host, port, user/password). So the basic parser would do what https://url.spec.whatwg.org/#urls describes as "valid", while parseFuzzyURL would be the whole thing ("output" in that table).
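The two-layer split described here can be sketched roughly as follows. This is an illustration in Python, not vibe.d code: the component regex is the one given in RFC 3986, Appendix B, while the function names, the result dictionary, and the `COMMON_INTERNET_SCHEMAS` set are assumptions made for the example. IPv6 host literals and percent-decoding are deliberately left out.

```python
import re

# Component-splitting regex from RFC 3986, Appendix B.
_URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

# Illustrative set of schemas assumed to follow the
# "Common Internet Scheme Syntax" (//user:password@host:port/path).
COMMON_INTERNET_SCHEMAS = {"http", "https", "ftp"}


def split_generic(uri):
    """Layer 1: generic syntax only; every component is optional."""
    m = _URI_RE.match(uri)
    return {
        "schema": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


def parse_url(uri):
    """Layer 2: require a schema, then validate known schemas more strictly."""
    parts = split_generic(uri)
    if parts["schema"] is None:
        raise ValueError("a non-relative URL needs a schema")
    if parts["schema"].lower() in COMMON_INTERNET_SCHEMAS:
        if not parts["authority"]:
            raise ValueError(f"{parts['schema']} URLs need a //host part")
        # Extended parsing of the authority (IPv6 literals not handled).
        userinfo, _, hostport = parts["authority"].rpartition("@")
        host, _, port = hostport.partition(":")
        if not host:
            raise ValueError("empty host")
        parts["userinfo"] = userinfo or None
        parts["host"] = host
        parts["port"] = int(port) if port else None
    return parts
```

Under these assumptions `foo:bar` passes, because only the generic syntax applies to the unknown schema `foo`, whereas a bare `example.com` and `http:example.com` are both rejected.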
