Use verbatim paths by default #27916

Closed

tbu-
Contributor

@tbu- tbu- commented Aug 20, 2015

This enables users of the Rust standard library to ignore the Windows limit of
260 "characters" per path (MAX_PATH); the new limit is around 32K "characters".

See also "Naming Files, Paths, and Namespaces" on MSDN, section "Maximum Path
Name Limitations":
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx

In order to use verbatim paths, the following conversions are performed:

  • C:\ -> \\?\C:\.
  • \\server\share\ -> \\?\UNC\server\share\.
  • .. are evaluated by popping the last component.
  • . are dropped.
  • Relative paths are joined with os::current_dir (Special case: Path name is
    along the lines of C:foobar. This is a relative path if the current
    directory is on drive C:, otherwise it's an absolute path).
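
The conversions listed above can be sketched roughly as follows. This is not the code from this pull request: it is a minimal, string-based illustration, and the names `to_verbatim` and `has_drive` are hypothetical. It assumes `cwd` is an absolute drive or UNC path that uses backslashes, and it does not handle forward slashes or root-relative paths such as `\foo`.

```rust
/// Returns true if `p` starts with a drive letter and a colon, e.g. `C:`.
fn has_drive(p: &str) -> bool {
    let b = p.as_bytes();
    b.len() >= 2 && b[0].is_ascii_alphabetic() && b[1] == b':'
}

/// Hypothetical helper: converts a Windows path string to verbatim form.
/// Assumes `cwd` is absolute (e.g. `C:\work`) and uses backslashes.
fn to_verbatim(path: &str, cwd: &str) -> String {
    // Verbatim and device paths are passed through untouched.
    if path.starts_with(r"\\?\") || path.starts_with(r"\\.\") {
        return path.to_string();
    }

    // Step 1: make the path absolute by joining relative paths with `cwd`.
    let absolute = if path.starts_with(r"\\") {
        path.to_string() // UNC path, already absolute
    } else if has_drive(path) {
        let (drive, rest) = path.split_at(2);
        if rest.starts_with('\\') {
            path.to_string() // `C:\foo` is already absolute
        } else if has_drive(cwd) && cwd[..2].eq_ignore_ascii_case(drive) {
            format!(r"{}\{}", cwd, rest) // `C:foo`, current directory on `C:`
        } else {
            format!(r"{}\{}", drive, rest) // `C:foo`, current directory elsewhere
        }
    } else {
        format!(r"{}\{}", cwd, path) // plain relative path
    };

    // Step 2: replace the prefix with its verbatim counterpart.
    let (prefix, tail) = match absolute.strip_prefix(r"\\") {
        Some(unc) => {
            // `\\server\share\rest` -> `\\?\UNC\server\share` + `rest`
            let mut parts = unc.splitn(3, '\\');
            let server = parts.next().unwrap_or("");
            let share = parts.next().unwrap_or("");
            let rest = parts.next().unwrap_or("");
            (format!(r"\\?\UNC\{}\{}", server, share), rest.to_string())
        }
        // `C:\rest` -> `\\?\C:` + `\rest`
        None => (format!(r"\\?\{}", &absolute[..2]), absolute[2..].to_string()),
    };

    // Step 3: evaluate `.` and `..` purely lexically.
    let mut stack: Vec<&str> = Vec::new();
    for comp in tail.split('\\') {
        match comp {
            "" | "." => {} // empty and `.` components are dropped
            ".." => {
                stack.pop(); // `..` pops the last component, if any
            }
            other => stack.push(other),
        }
    }
    format!(r"{}\{}", prefix, stack.join(r"\"))
}
```

With this sketch, `to_verbatim(r"C:foobar", r"C:\work")` yields `\\?\C:\work\foobar`, while the same input with a current directory on `D:` yields `\\?\C:\foobar`.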

@rust-highfive
Collaborator

r? @alexcrichton

(rust_highfive has picked a reviewer for you, use r? to override)

@tbu-
Contributor Author

tbu- commented Aug 20, 2015

Example of a program that fails before this change, but works afterwards:

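use std::fs;
use std::iter;
use std::path::PathBuf;

fn main() {
    let mut path = PathBuf::new();
    // Each component is 128 characters long, so the nested path quickly
    // exceeds MAX_PATH (260).
    let long_string: String = iter::repeat("a").take(128).collect();
    // Create ten nested directories, each one level deeper than the last.
    for i in 0..10 {
        path.push(&long_string);
        println!("creating {}", i);
        fs::create_dir(&path).unwrap();
    }
    // Remove them again, innermost first.
    for i in (0..10).rev() {
        println!("removing {}", i);
        fs::remove_dir(&path).unwrap();
        path.pop();
    }
}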

@tbu- tbu- force-pushed the pr_windows_paths branch from 8522103 to 7de520d Compare August 20, 2015 13:36
@alexcrichton
Member

Out of curiosity, is there precedent for this in other I/O libraries? In general I think it's quite valuable that we perform 0 interpretation of the path coming into these APIs and allow the underlying system APIs to interpret them however they'd like. In the past any sort of layering we add on top inevitably breaks some obscure case in some API, so just passing everything through seems like the best option.

@tbu-
Contributor Author

tbu- commented Aug 20, 2015

> Out of curiosity, is there precedent for this in other I/O libraries?

I don't know; I haven't looked into it in depth, and a quick Google search didn't turn anything up.

> In general I think it's quite valuable that we perform 0 interpretation of the path coming into these APIs and allow the underlying system APIs to interpret them however they'd like. In the past any sort of layering we add on top inevitably breaks some obscure case in some API, so just passing everything through seems like the best option.

It's hard to respond to that, as the concern is fairly hand-wavy. The pull request does not apply the conversion to all functions; it can be reviewed on a case-by-case basis for each function by looking at its documentation (EDIT: I guess one should look more closely at the link functions for this). I think we already do some kind of normalization, namely stripping empty path components (?).

EDIT2: It would also be valuable not to have to care about Windows' path-length quirks in Rust programs.

@tbu-
Contributor Author

tbu- commented Aug 24, 2015

Documentation of the functions, in order of appearance in the source code:
CreateFile
CreateDirectory
FindFirstFile
DeleteFile
MoveFileEx
RemoveDirectory
CreateSymbolicLink
CreateHardLink
GetFileAttributesEx
SetFileAttributes
CopyFileEx

All these functions (except for the link functions) are documented with:

> In the ANSI version of this function, the name is limited to MAX_PATH characters. To extend this limit to 32,767 wide characters, call the Unicode version of the function and prepend "\\?\" to the path. For more information, see Naming Files, Paths, and Namespaces.

This means the official documentation recommends using the "\\?\" prefix in order to get around the MAX_PATH limitation.
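
As a rough illustration of what "call the Unicode version and prepend `\\?\`" involves, the following sketch builds the NUL-terminated UTF-16 buffer that a `W` function would receive. The helper name `to_wide_verbatim` is hypothetical, the input is assumed to be an already absolute, normalized drive path, and counting the terminating NUL against the 32,767 limit is an assumption, not something the quoted documentation states.

```rust
/// Limit for verbatim paths in wide characters, per the documentation quoted above.
const MAX_WIDE_PATH: usize = 32_767;

/// Hypothetical helper: prepends `\\?\` to an absolute, normalized drive path
/// and encodes the result as NUL-terminated UTF-16.
/// Returns `None` if the buffer would exceed the documented limit
/// (assumption: the terminating NUL counts towards it).
fn to_wide_verbatim(abs_path: &str) -> Option<Vec<u16>> {
    let mut wide: Vec<u16> = r"\\?\"
        .encode_utf16()
        .chain(abs_path.encode_utf16())
        .collect();
    wide.push(0); // the Windows `W` functions expect a NUL terminator
    if wide.len() > MAX_WIDE_PATH {
        None
    } else {
        Some(wide)
    }
}
```

On Windows, a pointer to the resulting buffer could then be handed to a function such as `CreateDirectoryW`; the FFI call itself is left out here.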

The CreateSymbolicLink function will need to be special-cased to allow a strictly relative path to pass through, as it's documented that these behave differently. Paths that are not strictly relative, like C:foobar, are still expanded using the current working directory though (as specified in the documentation).

@retep998
Member

After doing some testing locally, it appears that .. in a path on Windows completely disregards symbolic links and just pops the last component. Windows is also able to handle verbatim paths that contain symbolic links just fine. In addition, I am unable to create files named . or .., even using verbatim paths. Therefore I am under the impression that this technique of simple text manipulation to strip . and .. should work perfectly fine. However, I do think this should be factored out into two public unstable methods, one to make a path absolute, and another to normalize a path.
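
The two methods suggested above might look something like the sketch below. The names `make_absolute` and `normalize` and their signatures are hypothetical, not an existing or proposed standard library API. Both functions are purely lexical and never touch the file system, and `make_absolute` relies on `Path::join`, so it does not implement the `C:foobar` special case from the pull request description.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical `make_absolute`: joins a relative path onto `cwd`.
/// `Path::join` returns `path` unchanged if it is already absolute.
fn make_absolute(path: &Path, cwd: &Path) -> PathBuf {
    cwd.join(path)
}

/// Hypothetical `normalize`: drops `.` and evaluates `..` by popping the
/// last component, without consulting the file system.
/// A leading `..` on a relative path is silently dropped, so callers
/// would normally make the path absolute first.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for comp in path.components() {
        match comp {
            Component::CurDir => {} // `.` is dropped
            Component::ParentDir => {
                out.pop(); // `..` pops the last component; a no-op at the root
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}
```

Keeping the two steps separate lets callers normalize a path that is already absolute without needing to know the current directory.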

@tbu-
Contributor Author

tbu- commented Aug 24, 2015

@retep998 Yea, I also read that Windows just normalizes the paths before passing them to the file system.

I guess we can create public APIs once this has landed.

@alexcrichton
Member

I agree that it's kinda annoying to deal with path length restrictions on Windows, but I'm pretty uneasy about forging ahead here without any prior art. These sorts of conversions have historically always been wrong in one way or another for us, hence our strategy in std::io of binding as close to the system as possible everywhere.

I think it may be best to develop this sort of functionality externally on crates.io first and see how it plays out before moving it into the standard library to happen by default.

@tbu-
Contributor Author

tbu- commented Sep 5, 2015

Closing for now I guess (as a reminder, if this is to be adopted, fix the CreateSymbolicLink function).
