-
-
Notifications
You must be signed in to change notification settings - Fork 30.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-99726: Adds os.statx function and associated constants #99755
Conversation
zooba
commented
Nov 24, 2022
•
edited by bedevere-bot
Loading
edited by bedevere-bot
- Issue: Add fast path to os.[l]stat() that returns incomplete information #99726
My TODO list:
|
Python/fileutils.c
Outdated
case FILE_DEVICE_DISK: | ||
case FILE_DEVICE_DISK_FILE_SYSTEM: | ||
case FILE_DEVICE_VIRTUAL_DISK: | ||
case FILE_DEVICE_DFS: | ||
case FILE_DEVICE_CD_ROM: | ||
case FILE_DEVICE_CD_ROM_FILE_SYSTEM: | ||
case FILE_DEVICE_CONTROLLER: | ||
case FILE_DEVICE_DATALINK: | ||
break; | ||
case FILE_DEVICE_CONSOLE: | ||
case FILE_DEVICE_NULL: | ||
case FILE_DEVICE_KEYBOARD: | ||
case FILE_DEVICE_MODEM: | ||
case FILE_DEVICE_MOUSE: | ||
case FILE_DEVICE_PARALLEL_PORT: | ||
case FILE_DEVICE_PRINTER: | ||
case FILE_DEVICE_SCREEN: | ||
case FILE_DEVICE_SERIAL_PORT: | ||
case FILE_DEVICE_SOUND: | ||
/* \\.\nul */ | ||
result->st_mode = (result->st_mode & ~S_IFMT) | _S_IFCHR; | ||
break; | ||
case FILE_DEVICE_NAMED_PIPE: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my earlier comment, we shouldn't arrive in _Py_stat_basic_info_to_stat()
for device types that don't implement a filesystem or aren't mounted by a filesystem. That's because, in my experience, FileBasicInformation
(file attributes and timestamps) is only implemented for files and directories in a filesystem.
A filesystem type that mounts a volume will return FILE_DEVICE_DISK
(e.g. sample FAT32 source code) or FILE_DEVICE_CD_ROM
(e.g. sample CDFS source code). FILE_DEVICE_DFS
is the distributed filesystem that combines multiple shares. A RAM disk device can use FILE_DEVICE_VIRTUAL_DISK
, like the old RAM disk sample driver did, and the driver could implement a custom filesystem, but nowadays generally FILE_DEVICE_DISK
or FILE_DEVICE_CD_ROM
is used for a RAM disk (the ImDisk driver implements both).
FILE_DEVICE_DISK_FILE_SYSTEM
, FILE_DEVICE_CD_ROM_FILE_SYSTEM
, and FILE_DEVICE_NETWORK_FILE_SYSTEM
are the device types for the base filesystem device, e.g. "\\?\GlobalRoot\FAT", "\\?\GlobalRoot\UdfsCdRom", or "\\?\UNC". os.stat()
reports the first two as block devices (S_IFBLK
), but FILE_DEVICE_NETWORK_FILE_SYSTEM
isn't handled at all because GetFileType()
returns FILE_TYPE_UNKNOWN
. I don't think these low-level devices fit into the POSIX regime of S_IFCHR
vs S_IFBLK
, but they're more like S_IFCHR
since reads and writes don't use a fixed block size. It's nearly a moot point since there's no useful reason to care. There isn't much to be done with such devices in user-mode applications.
In principle, the stat info classes could be implemented for the named-pipe filesystem on the FILE_DEVICE_NAMED_PIPE
device type, but in my tests NPFS doesn't implement FileStatInformation
. I asked whether it's planned to implement stat info classes in NPFS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my earlier comment, we shouldn't arrive in _Py_stat_basic_info_to_stat() for device types that don't implement a filesystem or aren't mounted by a filesystem.
Ah, that's a very good point (which I missed). There were some tests that stopped failing with my change here though, so clearly something ends up in this path.
My switch basically mirrors GetFileType
, so it should be the same as our existing code.
Other than leading people the wrong way when they read our code, is there anything wrong with checking all the values? There's already a ton of stuff that doesn't map well into POSIX (all the error codes, for example), and I'd prefer our approximations at least be consistent with themselves, even if they aren't meaningful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than leading people the wrong way when they read our code, is there anything wrong with checking all the values?
For the stat info classes, they could decide to specially handle some devices in the I/O manager, or implement a fallback in GetFileInformationByName()
. I wouldn't want a device to be reported as a regular file instead of either S_IFCHR
or S_IFBLK
. It shouldn't be a problem as long as they only support the stat info classes for filesystem files and directories -- not for volume devices, filesystem devices, or controllers. (I don't think FILE_DEVICE_DATALINK
is used by anything.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll update the cases that should handle devices, though it does seem that the current builds don't return stat information by name for \\?\GlobalRoot\NTFS
at least, and it goes through the old function. That could always change though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they do end up supporting all devices for the stat info classes, it should be okay as long as they require that FileAttributes
is set to FILE_ATTRIBUTE_NORMAL
for a file that has no other attributes set. That's already required for FileBasicInformation
, according to [MS-FSA]. Devices would be distinguished by having FileAttributes
set to 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, though I haven't analyzed it in detail. If there's any particular critique you're seeking from me, please let me know.
I'm guessing it picked up everyone whose code had a So no specific review requests unless someone has concerns about changing the I'll do a buildbot run, but if anyone has access to a Linux machine with |
FYI, I've not forgotten about this. I'm waiting to hear back from the Windows team on whether they can fix their new API to provide all the information That said, if anyone sees enough value in |
✅ Deploy Preview for python-cpython-preview ready!
To edit notification comments on pull requests, go to your Netlify site settings. |
I'm leaving this PR, it looks like it's in good hands. |
Superseded and now massively conflicted, so I'm just going to close this. Adding statx still seems like a good idea if someone wants to do it. |