Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-99726: Adds os.statx function and associated constants #99755

Closed
wants to merge 30 commits into from

Conversation

zooba
Copy link
Member

@zooba zooba commented Nov 24, 2022

Doc/library/os.rst Outdated Show resolved Hide resolved
@zooba
Copy link
Member Author

zooba commented Nov 25, 2022

My TODO list:

  • Verify whether DeviceType values are devices or types
  • Finish converting stdlib uses of stat()
  • Reread documentation and make sure it's up to date
  • Maybe test whether a totally separate struct would be simpler enough to justify non-interchangeable results
  • More tests

Comment on lines 1149 to 1171
case FILE_DEVICE_DISK:
case FILE_DEVICE_DISK_FILE_SYSTEM:
case FILE_DEVICE_VIRTUAL_DISK:
case FILE_DEVICE_DFS:
case FILE_DEVICE_CD_ROM:
case FILE_DEVICE_CD_ROM_FILE_SYSTEM:
case FILE_DEVICE_CONTROLLER:
case FILE_DEVICE_DATALINK:
break;
case FILE_DEVICE_CONSOLE:
case FILE_DEVICE_NULL:
case FILE_DEVICE_KEYBOARD:
case FILE_DEVICE_MODEM:
case FILE_DEVICE_MOUSE:
case FILE_DEVICE_PARALLEL_PORT:
case FILE_DEVICE_PRINTER:
case FILE_DEVICE_SCREEN:
case FILE_DEVICE_SERIAL_PORT:
case FILE_DEVICE_SOUND:
/* \\.\nul */
result->st_mode = (result->st_mode & ~S_IFMT) | _S_IFCHR;
break;
case FILE_DEVICE_NAMED_PIPE:
Copy link
Contributor

@eryksun eryksun Nov 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my earlier comment, we shouldn't arrive in _Py_stat_basic_info_to_stat() for device types that don't implement a filesystem or aren't mounted by a filesystem. That's because, in my experience, FileBasicInformation (file attributes and timestamps) is only implemented for files and directories in a filesystem.

A filesystem type that mounts a volume will return FILE_DEVICE_DISK (e.g. sample FAT32 source code) or FILE_DEVICE_CD_ROM (e.g. sample CDFS source code). FILE_DEVICE_DFS is the distributed filesystem that combines multiple shares. A RAM disk device can use FILE_DEVICE_VIRTUAL_DISK, like the old RAM disk sample driver did, and the driver could implement a custom filesystem, but nowadays generally FILE_DEVICE_DISK or FILE_DEVICE_CD_ROM is used for a RAM disk (the ImDisk driver implements both).

FILE_DEVICE_DISK_FILE_SYSTEM, FILE_DEVICE_CD_ROM_FILE_SYSTEM, and FILE_DEVICE_NETWORK_FILE_SYSTEM are the device types for the base filesystem device, e.g. "\\?\GlobalRoot\FAT", "\\?\GlobalRoot\UdfsCdRom", or "\\?\UNC". os.stat() reports the first two as block devices (S_IFBLK), but FILE_DEVICE_NETWORK_FILE_SYSTEM isn't handled at all because GetFileType() returns FILE_TYPE_UNKNOWN. I don't think these low-level devices fit into the POSIX regime of S_IFCHR vs S_IFBLK, but they're more like S_IFCHR since reads and writes don't use a fixed block size. It's nearly a moot point since there's no useful reason to care. There isn't much to be done with such devices in user-mode applications.

In principle, the stat info classes could be implemented for the named-pipe filesystem on the FILE_DEVICE_NAMED_PIPE device type, but in my tests NPFS doesn't implement FileStatInformation. I asked whether it's planned to implement stat info classes in NPFS.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my earlier comment, we shouldn't arrive in _Py_stat_basic_info_to_stat() for device types that don't implement a filesystem or aren't mounted by a filesystem.

Ah, that's a very good point (which I missed). There were some tests that stopped failing with my change here though, so clearly something ends up in this path.

My switch basically mirrors GetFileType, so it should be the same as our existing code.

Other than leading people the wrong way when they read our code, is there anything wrong with checking all the values? There's already a ton of stuff that doesn't map well into POSIX (all the error codes, for example), and I'd prefer our approximations at least be consistent with themselves, even if they aren't meaningful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than leading people the wrong way when they read our code, is there anything wrong with checking all the values?

For the stat info classes, they could decide to specially handle some devices in the I/O manager, or implement a fallback in GetFileInformationByName(). I wouldn't want a device to be reported as a regular file instead of either S_IFCHR or S_IFBLK. It shouldn't be a problem as long as they only support the stat info classes for filesystem files and directories -- not for volume devices, filesystem devices, or controllers. (I don't think FILE_DEVICE_DATALINK is used by anything.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update the cases that should handle devices, though it does seem that the current builds don't return stat information by name for \\?\GlobalRoot\NTFS at least, and it goes through the old function. That could always change though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If they do end up supporting all devices for the stat info classes, it should be okay as long as they require that FileAttributes is set to FILE_ATTRIBUTE_NORMAL for a file that has no other attributes set. That's already required for FileBasicInformation, according to [MS-FSA]. Devices would be distinguished by having FileAttributes set to 0.

@zooba zooba marked this pull request as ready for review November 29, 2022 00:05
@zooba zooba requested review from brettcannon, a team, jaraco, warsaw and vsajip as code owners November 29, 2022 00:05
Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/library/os.rst Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Doc/library/os.rst Outdated Show resolved Hide resolved
Copy link
Member

@jaraco jaraco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, though I haven't analyzed it in detail. If there's any particular critique you're seeking from me, please let me know.

@zooba
Copy link
Member Author

zooba commented Nov 30, 2022

I'm guessing it picked up everyone whose code had a stat in it that I decided to change (a few made more sense to leave alone).

So no specific review requests unless someone has concerns about changing the stat_result struct (I'm basically convinced it's better to modify the existing struct than to create a new one - samestat(statx(), stat()) ought to work).

I'll do a buildbot run, but if anyone has access to a Linux machine with statx and wants to try it out, I'd also appreciate that. My Ubuntu 22 images have it, but don't seem to behave any different.

@zooba zooba added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 30, 2022
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @zooba for commit bbcc6ee 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 30, 2022
@brettcannon brettcannon removed their request for review December 9, 2022 23:38
@zooba
Copy link
Member Author

zooba commented Dec 12, 2022

FYI, I've not forgotten about this. I'm waiting to hear back from the Windows team on whether they can fix their new API to provide all the information stat() wants. If they do, I can simply switch to that API directly rather than adding a new one.

That said, if anyone sees enough value in statx generally, I'm happy to have this merged. We might have to tweak the Windows side of things if they change it, but I'm sure they'll find a non-breaking way to make the change.

@netlify
Copy link

netlify bot commented Dec 15, 2022

Deploy Preview for python-cpython-preview ready!

Name Link
🔨 Latest commit cadeecb
🔍 Latest deploy log https://app.netlify.com/sites/python-cpython-preview/deploys/639b3ffff08a180008b9bd22
😎 Deploy Preview https://deploy-preview-99755--python-cpython-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site settings.

@gvanrossum
Copy link
Member

I'm leaving this PR, it looks like it's in good hands.

@zooba
Copy link
Member Author

zooba commented Mar 16, 2023

Superseded and now massively conflicted, so I'm just going to close this. Adding statx still seems like a good idea if someone wants to do it.

@zooba zooba closed this Mar 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants