Are drive letters always invalid? #612
Comments
I don't think they should be invalid, except perhaps if Windows moved away from them.
I'm not sure if that is correct. Of the 4 kinds of paths Windows supports (yes, really), one of them is DOS device paths, for example:
The latter 2 URLs certainly do have drives. The version with 2 slashes was used by legacy software that basically just stuck
I don't think it's worth trying too hard to disambiguate drive letters from path components that look like drive letters. There are obviously some trade-offs that need to be accepted when trying to represent both Windows and POSIX paths in a single format.
Thank you, this is useful information. I think I made mistakes reading the standard, so there are mistakes in my question (I think...). My expectation is that three slashes followed by a
I am interested in this, mostly because it is exactly what I am trying to do... So I should try to understand your point.
You could work around it by percent-encoding the
When creating a file URL from a file path, you already need to encode characters which the URL parser would otherwise interpret (e.g.
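A minimal sketch of that workaround, assuming the character to encode is the colon of the drive-letter-like component (the comment above lost that detail in extraction). The helper name posix_path_to_file_url is hypothetical. It uses Python's urllib.parse.quote, which percent-encodes every colon, not only the one that looks like part of a drive letter, along with characters such as ? and # that a URL parser would otherwise interpret.

```python
from urllib.parse import quote


def posix_path_to_file_url(path: str) -> str:
    """Hypothetical helper: build a file URL from an absolute POSIX path.

    Every character outside the unreserved set is percent-encoded,
    except "/" which is kept as the segment separator. As a side
    effect, a component such as "c:" becomes "c%3A", so it no longer
    has the two-code-point shape of a drive letter.
    """
    if not path.startswith("/"):
        raise ValueError("expected an absolute POSIX path")
    return "file://" + quote(path, safe="/")


print(posix_path_to_file_url("/c:/etc/passwd"))
print(posix_path_to_file_url("/tmp/a b/x?y#z"))
```

Whether a given consumer decodes %3A back to a colon before touching the file system is outside what this sketch shows.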
Exactly, either that, or always prefixing the drive-letter-like component with
Step 2.4.1 in the path state suggests at least that both
On further investigation, it appears that, while DOS device paths can semantically contain drive letters, they are not considered the path root when they are in such paths. That's what Microsoft's documentation says, and indeed, using the Windows API function
Buuuuut, since we're talking about file paths, of course the situation is actually more complex than that. If the file URL has a hostname, and that hostname is not
So basically: for most hostnames, we should always consider the share name as the root, which is semantically not a drive letter (i.e. shouldn't be normalized). Today, for every hostname, we only consider some share names to be the root, because they coincidentally look like drive letters, and we normalize them when really we shouldn't. As far as I can tell, neither Edge/Chrome nor IE actually handles UNC path roots correctly (
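To make the "share name as the root" point concrete, here is a small sketch using Python's pathlib.PureWindowsPath. This is Python's own model of Windows path semantics, not the Windows API function referred to above, so treat it as an illustration only. The helper name drive_and_root is hypothetical.

```python
from pathlib import PureWindowsPath


def drive_and_root(path: str) -> tuple[str, str]:
    """Hypothetical helper: return what pathlib considers the drive and root."""
    p = PureWindowsPath(path)
    return p.drive, p.root


# A traditional DOS path: the drive is the drive letter.
print(drive_and_root("c:/Program Files/"))

# A UNC path: the "drive" is the server plus the share name,
# so the share, not a drive letter, anchors the path.
print(drive_and_root("//host/share/foo.txt"))
```

In this model the UNC share plays the role that the drive letter plays in a DOS path, which matches the argument that a share name should not be normalized as if it were a drive letter.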
Looking at this again, I think that the writing section for file URLs still reflects the situation from before #405 and specifically #302, where (normalised/resolved) file URLs that had a drive could not have a host. Which is great. Updating it to reflect the changes should simplify things, which is nice! A question is whether the drive letter should be included as a production in the grammar (the writing rules) for file URLs. I expected that to be the case, but it seems that drive letters are not distinguished from ordinary path components right now.
In the writing section, in the definition of relative-URL string, in the
Can be replaced with:
Thanks for pointing that out. This text was written long ago and hasn't seen a lot of review. I think that also means we should update https://url.spec.whatwg.org/#scheme-relative-file-url-string to say something like
right? And that in turn means "path-absolute-non-Windows-file-URL string" can be removed.
The Writing section suggests that drive letters are allowed in file URLs only if they do not have a host.
This is reflected in the parser (the file-slash state does not issue a validation warning on a drive letter, the other states do).
However, step 5.1 in the scheme state ensures that hostless file URLs are always invalid, so it is out of sync with the Writing section, and it suggests there cannot be a valid file URL with a drive letter.
Maybe the idea is that e.g. file:/c:/etc/ has a drive, whereas file:c:/etc/, file://c:/etc/ and file:///c:/etc/ do not. That makes sense as a way to disambiguate drive letters from path components that 'look like' drive letters. But the parser/resolver does treat the c: part as a drive letter in all of them.
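For reference, a rough sketch of the drive-letter checks as the URL Standard defines them: a Windows drive letter is two code points, an ASCII alpha followed by ":" or "|", and the normalized form uses ":". The function names below mirror those definitions. The leading_segment helper is hypothetical and is not the parser; it only strips the scheme and leading slashes to show that all four spellings above expose the same c: component textually.

```python
import string

ASCII_ALPHA = frozenset(string.ascii_letters)


def is_windows_drive_letter(s: str) -> bool:
    # Two code points: an ASCII alpha, then ":" or "|".
    return len(s) == 2 and s[0] in ASCII_ALPHA and s[1] in ":|"


def is_normalized_windows_drive_letter(s: str) -> bool:
    # Same as above, but the second code point must be ":".
    return is_windows_drive_letter(s) and s[1] == ":"


def starts_with_windows_drive_letter(s: str) -> bool:
    # The first two code points form a drive letter, and the string
    # either ends there or continues with "/", "\", "?" or "#".
    if len(s) < 2 or not is_windows_drive_letter(s[:2]):
        return False
    return len(s) == 2 or s[2] in "/\\?#"


def leading_segment(file_url: str) -> str:
    # Hypothetical helper, NOT the URL parser: drop "file:" and any
    # leading slashes, then return the first "/"-separated component.
    rest = file_url[len("file:"):].lstrip("/")
    return rest.split("/", 1)[0]


for url in ("file:/c:/etc/", "file:c:/etc/", "file://c:/etc/", "file:///c:/etc/"):
    print(url, is_windows_drive_letter(leading_segment(url)))
```

A check like this cannot tell a real drive from a path component that merely looks like one, which is the ambiguity this issue is about.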