Parsing urls with missing protocol #187

Closed
lusy opened this issue Feb 10, 2015 · 2 comments

Comments

@lusy

lusy commented Feb 10, 2015

Hi, I've just used URI.js for the first time and I was excited:

var urlParser = require('URIjs');

urlParser("http://example.com:8080/pathname/lala").protocol() // --> http
urlParser("http://example.com:8080/pathname/lala").hostname() // --> example.com
urlParser("http://example.com:8080/pathname/lala").port()     // --> 8080
urlParser("http://example.com:8080/pathname/lala").pathname() // --> /pathname/lala

However, if I try it on a url where the protocol part is missing, everything gets messed up:

var urlParser = require('URIjs');

urlParser("example.com:8080/pathname/lala").protocol() // --> example.com
urlParser("example.com:8080/pathname/lala").hostname() // --> (empty)
urlParser("example.com:8080/pathname/lala").port()     // --> (empty)
urlParser("example.com:8080/pathname/lala").pathname() // --> 8080/pathname/lala

Could we add support for parsing URLs where the protocol part is missing? I think they come up pretty often. Should I open a pull request for that? Or is it intentionally left out? Thanks!

@ooxi

ooxi commented Feb 11, 2015

That is not really possible, since example.com:8080/pathname/lala is not a valid HTTP URL: without a scheme and "//", nothing marks example.com:8080 as the host and port. URI.js does support URIs without the scheme part, but in your example that would be the scheme-relative form //example.com:8080/pathname/lala.

There have been discussions before of adding heuristics for such behaviour but it's generally very messy to operate on invalid input :(

@rodneyrehm
Member

This behavior has caused confusion in the past and I fully expect this to raise more issues in the future. Welcome to the weird world of string semantics…

According to RFC 3986, the string example.com:8080/pathname/lala is to be considered a URI with the scheme example.com and the path 8080/pathname/lala (and no authority part), which URI.js is getting right. To treat this as an HTTP URL, either prepend the proper scheme, or use the scheme-relative notation mentioned by @ooxi.
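
To make the point above concrete, here is a minimal sketch that does not use URI.js at all. It applies the generic splitting regular expression from RFC 3986, Appendix B, to show why example.com ends up as the scheme, and then adds a naive fallback that prepends a default scheme. The names splitUri and withDefaultScheme are hypothetical helpers invented for this illustration, and defaulting to http is an assumption, not something URI.js does.

```javascript
// Generic URI-splitting regular expression from RFC 3986, Appendix B.
const RFC3986_SPLIT =
  /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

// Hypothetical helper: split a string into its RFC 3986 components.
// Components that are absent come back as undefined.
function splitUri(input) {
  const m = RFC3986_SPLIT.exec(input);
  return {
    scheme: m[2],
    authority: m[4],
    path: m[5],
    query: m[7],
    fragment: m[9],
  };
}

// Hypothetical helper: turn "host:port/path"-style input into an absolute URL.
// Assumption: any input without a "//authority" part is meant as a host,
// so real non-hierarchical URIs such as "mailto:..." would be mangled.
function withDefaultScheme(input, scheme = 'http') {
  const parts = splitUri(input);
  if (parts.authority !== undefined) {
    // Already absolute, or scheme-relative ("//host/...").
    return parts.scheme !== undefined ? input : scheme + ':' + input;
  }
  return scheme + '://' + input;
}

console.log(splitUri('example.com:8080/pathname/lala'));
// scheme: 'example.com', authority: undefined, path: '8080/pathname/lala'

console.log(withDefaultScheme('example.com:8080/pathname/lala'));
// http://example.com:8080/pathname/lala

console.log(withDefaultScheme('//example.com:8080/pathname/lala'));
// http://example.com:8080/pathname/lala
```

The normalized string can then be handed to URI.js (or any other parser) as usual. The caveat in the comments is exactly the messiness mentioned earlier in the thread: the fallback has to guess, and it guesses wrong for inputs whose first segment really is a scheme.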
