"D:\foo" should be parsed as "file:///D:/foo" #271

domenic · 2017-03-10T18:51:50Z

https://quuz.org/url/liveview.html#D:/foo

Edge and Chrome on Windows at least parse this as a file URL, which I think is much more friendly. Firefox does not, but has some special logic so that when you enter D:\foo in the URL bar, it translates it to file:///D:/foo.

They also parse https://quuz.org/url/liveview.html#D:b/foo as a file URL, so it's not about the path name starting with /... maybe they treat all single-character schemes this way?
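To make the "single-character scheme" reading concrete, here is a small sketch using Python's `urllib.parse.urlsplit`. That function is only a stand-in for a generic scheme-first parser (it is not an implementation of the URL Standard), but it shows how such a parser would plausibly see these inputs: the drive letter becomes the scheme and the rest becomes the path.

```python
from urllib.parse import urlsplit

# urlsplit is used here only as an example of a parser that looks for a
# scheme first; it does not follow the URL Standard's algorithm.
for text in ("D:/foo", "D:\\foo", "D:b/foo"):
    parts = urlsplit(text)
    # The drive letter ends up (lowercased) in .scheme, the rest in .path.
    print(repr(text), "-> scheme", repr(parts.scheme), "path", repr(parts.path))
```

In all three cases the scheme comes out as a single letter, which is why treating these inputs as file paths conflicts with allowing one-letter schemes.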

Discovered in nodejs/node-eps#51 (comment) by @jkrems

annevk · 2017-03-15T08:45:08Z

For the record, the address bar is out-of-scope.

I guess allowing this basically means giving up on single-code-point schemes, indeed. Not sure what the right trade-off is there.

On the upside no such schemes are registered at http://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml but nothing is currently prohibiting that either.

zcorpan · 2017-03-15T11:21:23Z

httparchive

SELECT * FROM (
SELECT page, url, REGEXP_EXTRACT(LOWER(body), r'(<[a-z][^>]+\s(?:src|href)\s*=\s*["\']?[a-z]:/[^>]+>)') AS match
FROM [httparchive:har.2017_01_15_chrome_requests_bodies]
WHERE page = url
) WHERE match != "null"

Row	page	url	match	 
1	http://www.xm-n-tax.gov.cn/	http://www.xm-n-tax.gov.cn/	<img src="d:/piaochuang/piaochuang.jpg" width="150px" height="90px;" onclick="javascript:window.open('/content/n4676.html');"/>

Page has changed.

2	http://www.newsforshoppers.com/	http://www.newsforshoppers.com/	<link href="s://plus.google.com/102103991664781080361" rel="publisher" />

rel="publisher" has no effect for browsers

3	http://www.aaai.org/	http://www.aaai.org/	<script src=s://seal.verisign.com/getseal?host_name=www.aaai.org&size=s&use_flash=no&use_transparent=no&lang=en>	 
4	http://www.mathematichka.ru/	http://www.mathematichka.ru/	<base href="d:/mathematichka/web/">

These are commented out.

Possibly there is content such as documentation on CDs that rely on this? Maybe a use counter could help?

annevk · 2017-03-15T16:36:11Z

Well, the URL parser should be generally applicable ideally, also beyond browsers. Part of the reason we're doing this is so that non-browsers can still browse the web.

zcorpan · 2017-03-16T09:40:43Z

Sure, I was just trying to find out if there were strong compat reasons for browsers to behave one way or the other for such URLs. I think there aren't, for publicly-accessible web content at least.

annevk · 2017-03-22T13:06:56Z

Actually, we could maybe support this by branching on the backslash, which is normally non-conforming and doesn't occur in the examples above.
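A minimal sketch of what "branching on the backslash" could look like as a pre-parse step. The helper name `windows_path_to_file_url` is hypothetical, and this is not proposed spec text: it only recognizes a single ASCII letter followed by `:\`, and it does no percent-encoding or path normalization.

```python
import re

# Hypothetical pre-parse check: one ASCII letter, a colon, then a backslash.
_DRIVE_BACKSLASH = re.compile(r'^[A-Za-z]:\\')

def windows_path_to_file_url(text):
    # Return a file: URL string for input like D:\foo, or None so the caller
    # can fall through to ordinary URL parsing.
    if not _DRIVE_BACKSLASH.match(text):
        return None
    # Assumption: every backslash is a path separator; nothing is encoded.
    return "file:///" + text.replace("\\", "/")

print(windows_path_to_file_url("D:\\foo"))
print(windows_path_to_file_url("d:/piaochuang/piaochuang.jpg"))
```

With this rule the forward-slash examples found in the query results above are left alone and keep being parsed as URLs with a one-letter scheme.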

zcorpan · 2017-03-22T13:41:24Z

Oops, the query only looked for forward slash. New query. Also removed the WHERE page = url which was limiting to top-level resources.

SELECT * FROM (
SELECT page, url, REGEXP_EXTRACT(LOWER(body), r'(<[a-z][^>]+\s(?:src|href)\s*=\s*["\']?[a-z]:[/\\][^>]+>)') AS match
FROM [httparchive:har.2017_01_15_chrome_requests_bodies]
) WHERE match != "null"

22 rows. https://gist.github.com/zcorpan/98a61be4877858d3de18c19d8939a3be
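For anyone who wants to check the difference between the two patterns locally, here is a rough sketch in Python. BigQuery uses RE2 rather than Python's `re`, but these particular patterns should behave the same in both. The backslash sample below is made-up markup, not a row from the results.

```python
import re

# Patterns copied from the two queries above.
FORWARD_ONLY = re.compile(r'(<[a-z][^>]+\s(?:src|href)\s*=\s*["\']?[a-z]:/[^>]+>)')
EITHER_SLASH = re.compile(r'(<[a-z][^>]+\s(?:src|href)\s*=\s*["\']?[a-z]:[/\\][^>]+>)')

def matches(pattern, body):
    # The queries lowercase the response body before matching.
    return pattern.search(body.lower()) is not None

forward = '<img src="d:/piaochuang/piaochuang.jpg" width="150px">'
backward = '<img src="D:\\site\\logo.png" width="150px">'  # hypothetical markup

for name, pattern in (("forward-only", FORWARD_ONLY), ("either-slash", EITHER_SLASH)):
    print(name, matches(pattern, forward), matches(pattern, backward))
```

Both patterns also match broken markup such as `src=s://...`, which is how rows like 2 and 3 in the first result set show up.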

annevk · 2017-03-22T13:54:20Z

These look mostly like errors (and cases that won't work anyway, since we don't want http -> file to do anything but produce a network error). However, I think all of those with a backslash expect the behavior the OP asks for.

annevk · 2020-05-10T16:05:43Z

I confirmed that this is a quirk IE6+/Chrome (on Windows only) have. They do it for both d:/foo and d:\foo. In fact, they do it for any a-z scheme. IE6 also does it for a 0-9 or -/+ scheme; I'll consider those to be bugs. (Firefox's address bar quirk is only with a backslash, not a forward slash.)

Thoughts on only adopting this when a backslash is used? Or should we add a platform-specific quirk here similar to https://w3c.github.io/FileAPI/#convert-line-endings-to-native and make single-letter-scheme URLs impossible forever on that platform?
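To compare the two options on the inputs mentioned in this thread, here is a rough sketch. Both patterns are simplified readings of the behavior described above (backslash only, versus any a-z scheme as in the IE6+/Chrome quirk), and the names are made up for illustration.

```python
import re

# Illustrative readings of the two options; neither is spec text.
BACKSLASH_ONLY = re.compile(r'^[A-Za-z]:\\')   # only inputs like D:\...
ANY_SINGLE_LETTER = re.compile(r'^[A-Za-z]:')  # any one-letter "scheme"

def treated_as_drive(text, pattern):
    # True if the input would be reinterpreted as a Windows drive path.
    return pattern.match(text) is not None

SAMPLES = ["D:\\foo", "d:/foo", "D:b/foo", "s://seal.verisign.com/getseal"]

for sample in SAMPLES:
    print(sample,
          treated_as_drive(sample, BACKSLASH_ONLY),
          treated_as_drive(sample, ANY_SINGLE_LETTER))
```

The broader rule also captures inputs such as `s://...`, which in the HTTP Archive results look like typos for `https://` rather than drive paths.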

cc @sleevi @valenting @achristensen07 @jasnell

domenic · 2020-05-10T16:35:08Z

I'm -1 on platform-specific behavior (seems especially bad in contexts like HTTP servers and proxies).

I'm neutral on treating backslash specially vs. just treating all single-letter schemes as drive letters.

I'm +1 on addressing this in general. It would be great if full Windows file paths can be parsed as URLs as simply as passing them to the URL constructor.

annevk · 2024-12-03T10:30:36Z

@hayatoito thoughts on this?

I would be okay with treating [a-zA-Z]:\ in a special way, in particular because \ is invalid when not percent-encoded, but I would strongly prefer not doing this for [a-zA-Z]:/, as it seems like too much of a breaking change from expected URL semantics.

annevk added the topic: parser label Mar 11, 2017

annevk mentioned this issue Mar 22, 2017

002 - web compat of file extension/index searching nodejs/node-eps#51

Closed

GPHemsley mentioned this issue Jul 4, 2017

Add tests to check for bad URL bases. web-platform-tests/wpt#6446

Merged

domenic mentioned this issue Sep 16, 2020

Review slashes after file:/// and file path normalization #405

Closed

annevk added the topic: file label Oct 20, 2021
