The Curious Case Of HANDLE and SOCKET #1643

Closed
Gankra opened this issue Mar 29, 2022 · 21 comments
Labels
question Further information is requested

Comments

@Gankra

Gankra commented Mar 29, 2022

Here is a list of type definitions from many different sources:

winnt.h:

typedef void *HANDLE;
...
typedef PVOID HANDLE;

WinSock.h / WinSock2.h

typedef UINT_PTR        SOCKET;

windows_sys::Win32::Foundation::HANDLE
windows_sys::Win32::Networking::WinSock::SOCKET

pub type HANDLE = isize;
pub type SOCKET = usize;

rust\library\std\src\sys\windows\c.rs:
rust\library\std\src\os\windows\c.rs:

pub type HANDLE = LPVOID;
pub type SOCKET = u64;

std::os::windows::raw::HANDLE:
std::os::windows::raw::SOCKET:

pub type HANDLE = *mut c_void;
pub type SOCKET = u64;

Windows Socket Handles Docs

(I'm assuming this is talking about SOCKET, Microsoft's docs really hate linking these types.)

A socket handle can optionally be a file handle in Windows Sockets 2. A socket handle from a Winsock provider can be used with other non-Winsock functions such as ReadFile, WriteFile, ReadFileEx, and WriteFileEx.


So as part of strict_provenance I have been trying to make code more consistent / honest about values which are "pointers" vs "integers". These APIs are the bane of my existence.

  • Everyone but windows_sys agrees a HANDLE is a pointer
  • Everyone agrees SOCKET is an integer
  • Microsoft's APIs then declare / require that it's ok to "pun" a SOCKET as a HANDLE

This made me very confused, and every time I checked a definition it was Slightly Different.

As far as I can tell both of these types are actually always integers (the kernel can't just hand userspace a pointer and then expect it to be a useful pointer if it's ever passed back to it, so it "must" be an integer and if you ask anyone with knowledge of the kernel they will say as much). Presumably HANDLE is just so ancient that Microsoft hadn't figured out the Magic of UINT_PTR yet.

So windows_sys is, in some sense, the one being Honest here. If these types are in fact always integers this mess is actually fine because under strict provenance it's "always ok for an integer to pretend to be a pointer For Fun". It's just, very hard to tell that this is what's Happening based on the surface definitions and types.
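To make the pun concrete, here is a minimal Rust sketch. The aliases are declared locally as stand-ins for the definitions quoted above (nothing is imported from windows_sys or std), and the function names are hypothetical.

```rust
use std::ffi::c_void;

// Local stand-ins for the quoted definitions; illustrative names only.
type PointerHandle = *mut c_void; // header / std::os::windows::raw style
type IntegerHandle = isize; // windows_sys style, as quoted in this issue
type Socket = usize; // UINT_PTR

/// Puns a socket as a pointer-typed handle. The result carries an address
/// but is never meant to be dereferenced by user code.
fn socket_as_pointer_handle(s: Socket) -> PointerHandle {
    s as PointerHandle
}

/// Puns a socket as an integer-typed handle: a plain integer conversion
/// that reinterprets the same bits as a signed value.
fn socket_as_integer_handle(s: Socket) -> IntegerHandle {
    s as IntegerHandle
}
```

Both conversions are safe code and keep the same bits. For example, the all-ones socket value (the bit pattern of INVALID_SOCKET) becomes -1 in the integer representation, which matches the bit pattern of INVALID_HANDLE_VALUE.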

@kennykerr kennykerr added the question Further information is requested label Mar 29, 2022
@riverar
Collaborator

riverar commented Mar 29, 2022

Hi @Gankra, apologies in advance. Are you reporting an issue or have a question? I think I missed it. Happy to help if you do.

@Gankra
Author

Gankra commented Mar 29, 2022

Oh sorry, I was told to file this by @yoshuawuyts, I assumed he would be the one to "get" the issue!

@riverar
Collaborator

riverar commented Mar 29, 2022

Ok! Will defer to @yoshuawuyts, but without an issue or question, it's a bit difficult to justify keeping this open.

In the meantime, I read over this thread https://twitter.com/Gankra_/status/1508586681762566147. And I hear ya. We are working on HANDLE types, tweaking how they are represented in the upstream metadata, and working on the addition of metadata to make downstream do smarter things (e.g. know how to automatically generate the implementation of handle.is_invalid()).

Historically, a HANDLE was a tiny (word-sized) unsigned integer (in Windows 1.0) that various Windows APIs would use to represent, well, things, without giving access to backing tables or infrastructure. A handle could be an index into some internal table or array, be an identifier for an output device, or hold a value that just so happens to also be a valid pointer (unbeknownst to you!) to some data structure stashed away in a local or global heap. (Shhh, don't tell anyone!)

As Windows evolved over the years, so did developer needs to represent more things. And bigger things. So as you can imagine, some of the fundamental data types changed. A HANDLE today now represents a nebulous pointer to something, with its size (on Windows) aligned to the count of bytes needed to represent an address to something. This was a great 16/32/64-bit Windows compatibility win, of course, but it also ensured handles can continue to hold values that, err, point to things. (Shhh, don't tell anyone!)

So with that understanding of handles, socket handles are just a specialization of a base handle, with socket programming-related semantics/constraints.

Does that help you any?

@nico-abram

From a brief skim of types in https://docs.microsoft.com/en-us/windows/win32/winprog/windows-data-types that typedef to HANDLE, I found 2 that seem to actually be publicly documented to be pointers:

  1. HINSTANCE/HMODULE "A handle to an instance. This is the base address of the module in memory."
  2. HGLOBAL , return value of GlobalAlloc https://docs.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-globalalloc

@rylev
Contributor

rylev commented Mar 29, 2022

I think the conclusion here is that the metadata project (and thus windows-rs by extension) are likely as right as one can be and isize is the appropriate representation for HANDLE. I think this means this issue can be closed since there's likely nothing the windows-rs project can or should do in this case. But if @Gankra or anyone else disagrees we can always reopen this.

@rylev rylev closed this as completed Mar 29, 2022
@Gankra
Author

Gankra commented Mar 29, 2022

@nico-abram's comment suggests that void* is in fact the right choice, no? Basically it's fine to pretend integers are pointers, but it's semantically problematic to pretend that actual pointers, which user code actually wants to dereference, are integers, because then provenance falls off and the compiler can start making incorrect conclusions about aliasing.

This is not a "problem" if the pointer is something from the kernel that is opaque to us, but it is a problem if user code gets to understand it's a pointer and actually deref it.
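A portable sketch of that second case, assuming a hypothetical handle type whose value really is an address the caller may dereference (loosely modelled on the HGLOBAL example mentioned earlier). `MemHandle`, `alloc_handle`, `read_handle` and `free_handle` are invented for illustration and are not Windows APIs; a `Box` stands in for the allocator.

```rust
use std::ffi::c_void;

// Hypothetical handle type: pointer-typed, so the value keeps its
// provenance from allocation through to the dereference.
type MemHandle = *mut c_void;

/// Allocates a u32 and returns it as an opaque-looking handle.
fn alloc_handle(value: u32) -> MemHandle {
    Box::into_raw(Box::new(value)) as MemHandle
}

/// Reads through the handle.
///
/// # Safety
/// `h` must come from `alloc_handle` and must not have been freed.
unsafe fn read_handle(h: MemHandle) -> u32 {
    // The pointer was never turned into an integer, so this dereference
    // uses the original pointer and its provenance.
    *(h as *const u32)
}

/// Releases the allocation behind the handle.
///
/// # Safety
/// `h` must come from `alloc_handle` and must not be used afterwards.
unsafe fn free_handle(h: MemHandle) {
    drop(Box::from_raw(h as *mut u32));
}
```

Had `MemHandle` been declared as `isize`, the same code would need an integer-to-pointer cast before the read, which is the step where provenance is lost.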

@Diggsey

Diggsey commented Mar 29, 2022

Yes, there is definitely code out there which uses an HINSTANCE as a pointer.

@riverar
Collaborator

riverar commented Mar 29, 2022

@nico-abram's comment suggests that void* is in fact the right choice, no? [...]

Non-specialized handles are currently void* in the headers; IntPtr in metadata; isize in bindings, which is mostly consistent. There is no one correct type for handles, for the reasons I stated above.

@thomcc

thomcc commented Mar 29, 2022

If it may hold a pointer, it must be a pointer. It's okay if it's sometimes an integer, there's no rule against representing arbitrary integers as pointers (12341234 as *const () is safe after all)

@Gankra
Author

Gankra commented Mar 29, 2022

Yeah basically if a type is *mut | int, then the correct union of these types is *mut, not int.

@kennykerr
Collaborator

The Windows metadata - which as Ryan pointed out is what drives windows-rs - distinguishes between handles that are opaque and explicitly not dereferenceable, and those that may just be pointers. If you find there is a handle type that has the wrong underlying type, feel free to open an issue on the metadata repo: https://github.com/microsoft/win32metadata

@riverar
Collaborator

riverar commented Mar 29, 2022

If it may hold a pointer, it must be a pointer.

I'm not aware of any cases where a returned handle is designed to be used as a pointer by the developer. The developer isn't meant to know these handle integers are pointers. As is the case with most abstractions, I recognize they leak sometimes. 🙃

@kennykerr
Collaborator

Also keep in mind that Windows has many different handle types. The discussion here and in #1622 (rust-lang/rust#87074) seems to imply that Windows only has two handle types, which is far from the truth.

@Gankra
Author

Gankra commented Mar 29, 2022

Just to be clear: It is also a problem if the user is expected to take a pointer and cast it to an (integer) HANDLE that the kernel or whatever else will proceed to read/write through. This is because under the strictest interpretations of provenance the compiler will not view this as "exposing" the pointer, so it may assume the pointer is unaliased and miscompile your code.

(This is indeed a very strict model, but we are hoping to encourage developers to follow it whenever they can, because it means their code will trivially work under any memory model, and it is Very Simple to understand and validate.)
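The pattern in question might look like the following portable sketch. No real Windows API is named in this thread, so `IntHandle`, `api_write_through` and `write_through_demo` are hypothetical stand-ins.

```rust
// Hypothetical integer-typed handle, as in the isize representation
// discussed in this thread.
type IntHandle = isize;

/// Stand-in for an API that receives an integer handle and writes
/// through it as if it were a pointer.
///
/// # Safety
/// `h` must hold the address of a live, writable u32.
unsafe fn api_write_through(h: IntHandle, value: u32) {
    // Integer-to-pointer cast: the callee rebuilds a pointer from the bits.
    let p = h as *mut u32;
    *p = value;
}

/// Caller side: a pointer is flattened into an integer handle and handed
/// to the stand-in API, which then writes through it.
fn write_through_demo() -> u32 {
    let mut slot = 0u32;
    // Pointer-to-integer cast: the type system no longer records that
    // `h` refers to `slot`.
    let h = &mut slot as *mut u32 as IntHandle;
    unsafe { api_write_through(h, 42) };
    slot
}
```

This sketch uses `as` casts, which Rust's pointer documentation describes as exposing the pointer's provenance. The concern above applies to stricter readings, or to conversions that only take the address without exposing it, where the write through the handle would not be visible to the compiler as an alias of `slot`.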

@riverar
Collaborator

riverar commented Mar 29, 2022

Just to be clear: It is also a problem if the user is expected to take a pointer and cast it to an (integer) HANDLE that the kernel or whatever else will proceed to read/write through.

(emphasis mine) Don't think this is the case for any Windows APIs. If you have one in mind, please do share.

@Gankra
Author

Gankra commented Mar 29, 2022

If that indeed never happens, then I am totally happy! I am just not very familiar with this type hierarchy and am getting thrown off by all the different things that "are" "handles". Thanks for your help!

@Diggsey

Diggsey commented Mar 29, 2022

I'm still trying to find it, so apologies that I can't be more specific, but I'm sure I've seen MFC code before which directly casts a pointer (to some kind of static resource) into a corresponding handle type via a C macro. Maybe I am just misremembering...

@riverar
Collaborator

riverar commented Mar 29, 2022

I'm still trying to find it, so apologies that I can't be more specific, but I'm sure I've seen MFC code before which directly casts a pointer (to some kind of static resource) into a corresponding handle type via a C macro. Maybe I am just misremembering...

One commonly abused handle (and a great example of a handle implementation changing) is the module handle returned by GetModuleHandle. In Windows 1.0, this was a non-pointer integer used as an index into a handle table. This got optimized out and the base address got used instead. Developers were supposed to be none the wiser but, well, you know how that story goes. So now instead of calling GetModuleInformation to get the base address, folks shortcut and reinterpret the handle:

auto baseAddress = reinterpret_cast<std::uintptr_t*>(GetModuleHandle("foo"));
// Do something with this pointer, like scribble in process memory 🎉💀

@scottmcm

If the pointer-ness or integer-ness isn't supposed to ever be used, I wonder if it would make sense for raw::HANDLE to also be a repr(transparent) newtype like BorrowedHandle is?

Obviously that wouldn't stop people from mem::transmuteing them the same way they reinterpret_cast them in C++, but it'd at least encourage using them opaquely even in raw calls.
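A minimal sketch of what such a newtype could look like. `RawHandle` and its methods are hypothetical and are not the actual definition in std or windows-rs. The choice of sentinels in `is_invalid` is an assumption, since which value means "invalid" depends on the API that produced the handle.

```rust
use std::ffi::c_void;

/// Hypothetical opaque wrapper with the same layout as the raw pointer.
#[repr(transparent)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct RawHandle(*mut c_void);

impl RawHandle {
    /// Wraps a raw value received from an FFI call.
    pub fn from_raw(raw: *mut c_void) -> Self {
        RawHandle(raw)
    }

    /// Unwraps the raw value for passing back across FFI.
    pub fn as_raw(self) -> *mut c_void {
        self.0
    }

    /// Assumption: treats both null and all-bits-set (the bit pattern of
    /// INVALID_HANDLE_VALUE) as invalid.
    pub fn is_invalid(self) -> bool {
        self.0.is_null() || self.0 as usize == usize::MAX
    }
}
```

Because of `repr(transparent)`, the wrapper can appear directly in `extern` signatures in place of the raw pointer, while callers have to go through `as_raw` to treat the value as a pointer or an integer.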

@Gankra
Author

Gankra commented Mar 30, 2022

I filed rust-lang/rust#95490 upstream(?) for discussing if Rust should do anything to match windows-sys here.

@MarijnS95
Contributor

For anyone subscribed to this thread, it looks like the types were changed back to pointers recently:

microsoft/win32metadata#1924
microsoft/win32metadata@b4dfd2f

Looks like this on a regen: 96b9a27

9 participants