-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unsafe: allow conversion of uintptr to unsafe.Pointer when it points to non-Go memory #58625
Comments
Overall, this looks good to me. A few minor questions:
Do we need this requirement? I don't think the GC will follow non-Go pointers, so I think this should be fine as long the application doesn't dereference the pointer when it is invalid. (I am imagining a library that temporarily unmaps (or more likely, protects PROT_NONE) pages during transient periods when they should not be accessed).
How do we implement this? Add an unconditional dereference of the pointer? Does a PROT_NONE-protected page count as "valid"? If so, a dereference isn't sufficient, you'd need to either try to map over the page, or parse /proc/self/maps. |
I think the Plus, if the non-Go allocator does actually unmap the address (instead of mapping it I would be ok with weakening the requirement to “must remain mapped”, but I'm not sure that I want to try to define “mapped” portably. 😅 |
Heavy emphasis on the “may”? I was thinking we would call |
At first glance updating vet for this means simply permitting all conversions from |
"This address must remain valid" seems both too strong and not strong enough. What actually matters is that the address is never allocated by the Go runtime. I'm not sure how to phrase that, though. |
I agree that's the simplest change to make to it. Part of the intent is to shift the burden of enforcement from a relatively noisy static check ( It is possible in theory for Also, the large number of |
Maybe: |
Tying these two things together worries me. Having a program be incorrect if it leaves behind a pointer it never dereferences to some memory region that was unmapped and remapped by the Go runtime feels too subtle to me. I'm also not sure how to implement a dynamic check for this case since there's no way the runtime could identify that a pointer just sitting there previously referred to non-Go memory. This is something of an insight from the arena proposal: it's really not safe to just unmap memory regions that Go pointers reference. The arena implementation waits until the GC can't find a Go pointer to the region before allowing that address space to be reused. One way we can make this OK is to support GC-managed memory regions, where the GC will let you know (via a finalizer or similar mechanism) that it's really OK to unmap a region. (Side note: this is actually fairly easy to implement today, if we require/guarantee that memory regions are aligned to 8 KiB, and are at least 8 KiB in size. Unfortunately our most popular platforms all have 4 KiB physical pages...) |
That's why it's
If the GC rate is high enough, eventually the pointer will be unallocated when the GC scans it and trip the
Agreed, but I expect that in the vast majority of cases, the addresses converted from On the other hand, to the extent that it is a problem under this proposal, it is already a problem under |
(CC @dominikh @timothy-king) |
Haha, yeah I suppose so. :)
That's probably true. There are a few different failure modes and outcomes possible in today's GC (unclear what it might look like under different GC implementations). There's just varying levels of unsafety here and since this is one of those things that relies on pointers existing, so it's especially hard to track down the issue from just the error you get. Maybe that's just what we're willing to say is what happens with
Also probably true. I guess at the very least I'd argue for some strongly worded advice pointing out that this is really the only case that's straightforward to reason about, until we have some way for the GC to notify code that some memory region is no longer referenced.
That's a good point. Though, I'd argue we should just offer a solution to that problem (independently of this proposal). Lastly, I'm paranoid that we're codifying a constraint on GC implementations with this. It's ensuring that the GC always has to be somewhat tolerant of what actually is in a pointer slot. The runtime definitely relies on this property internally in the current implementation, but how does it restrict future implementations? And maybe I'm wrong and we've already codified this property, making it a moot point? I don't know yet either way. |
FWIW, as far as I can tell this is exactly analogous to the cgo pointer-passing rules: “C code may not keep a copy of a Go pointer after the call returns.” It just happens to run in the other direction too: if a C-like program passes a pointer to a Go function, the Go code may not keep a copy of the C pointer either. Fair's fair, after all! 🙃
Between It may be that we could get by with requiring that the |
…ed again. The solution can be found here: golang/go#58625
If we say "uintptr to unsafe.Pointer is OK for non-Go pointers" then I think there's almost no way that vet could tell with any precision when to report an error. Any pointer being converted to uintptr and back might have come from C. The only clear case to flag would be: x := uintptr(unsafe.Pointer(&v)) but that doesn't seem very common. Is there something we can do to flag bad code in the vet check? Perhaps we should run the unsafeptr vet check across the whole open source ecosystem? |
Agreed — for
I think we should more-or-less give up on the More importantly: it should be straightforward to implement a much more precise |
This proposal has been added to the active column of the proposals project |
Fwiw, I worked around this in the past with a module that just has
Then in my main module I have Perhaps the go stdlib could provide something like |
Use the workaround suggested in golang/go#58625 This reverts commit 2a46726.
Another way to solve the issue would be some kind of address space or memory tagging that makes sure that Go can still be a memory safe language and more of such hacks to make it even less memory safe can be avoided. That would allow using normal structs again and track the data path in static verification tools and maybe dynamic runtime parts to support this. I am aware that this might lead to a counter proposal, but want to know if that line of thought has been considered as I was unable to find that in this discussion. |
@nightlyone, sorry, but I don't see how that's relevant to this proposal. Existing system calls and existing C functions may return pointers to parts of the address space outside of what is managed by the Go runtime. No change to the Go runtime or toolchain can change that fact. The |
I'm not convinced this should fail. Whether an address is valid is a dynamic property that can change over the lifetime of the address space, whereas the same Go pointer value could remain live across those changes. It seems like a semantic mismatch to force validation at the initial conversion to unsafe.Pointer, if we're not going to re-validate them on address space changes.
FWIW, I believe this is what checkptr does already, unless I misunderstand what you mean: https://github.com/golang/go/blob/master/src/runtime/checkptr.go#L50 |
Here are the results of running the unsafeptr check on the latest version of all public modules stored on proxy.golang.org that are required by >= 1 other modules (this filters out many forks). https://gist.github.com/rsc/d56cf40e010112f67339340101f92f17 A few observations:
Seeing these results, it seems to me that fixing the unsafeptr check to be precise is a hopeless cause. We simply cannot decide statically. However, among the many false positives, there are real problems unsafeptr turns up, like converting an integer off the network to a channel or even subtle misuses of reflect that store uintptr-typed pointers onto the stack where the GC will not see them. We want to preserve some way to find and report these. We have a way to do this today, namely -d=checkptr=1, which checks unsafe.Pointer(u) conversions and is enabled automatically in -race and -msan builds. Because it is a dynamic check, it can permit code to synthesize non-Go pointers but then block the exact same code and call stack from synthesizing a Go pointer. If we can make that unsafe.Pointer(u) check very cheap, say under 10X the cost of an array bounds check, then perhaps it would make sense to enable it always, not just with -d=checkptr=1. Based on discussion with @mknyszek, @mdempsky, @aclements, and others, it seems like we probably can make the check cheap enough to run in ordinary production use. I think we should explore that. Specifically, I suggest:
|
For this proposal issue, I think all we need to agree on is (1) and (2) in my previous comment, namely the additions to the spec's rules for unsafe. |
Based on the discussion above, this proposal seems like a likely accept. |
I'm not sure I understand this part, or at least how we're going to implement it. Or should we disallow conversion like |
I think we should do that. (I thought |
In general we can't implement it. We can implement the specific case of |
I agree that we shouldn't instrument |
If I understand correctly, this means that if Caveat: This isn't actually always safe today with cmd/compile. For example, compiling https://go.dev/play/p/SWa3KGxqKPk with
Can you elaborate what you mean that Note: An expression like
Do you have specific expectations of what |
No change in consensus, so accepted. 🎉 |
This ended up in the compiler/runtime triage queue because the proposal was accepted, but we have some questions about what exactly the actions here are. I think we're clear on what has to happen for (1), but we're less clear on the requirements for the rest (i.e. how is the compiler implementation restricted, and what should |
SGTM thanks.
By "not OK" here I just meant it's an invalid program. I did not mean that the compiler has to catch it or enable the runtime to catch it. We could imagine disallowing the (*[1<<63-1]byte)(unsafe.Pointer(1)) because it spans the Go address space, but I don't think that's terribly useful since you can still do
This program is invalid but I don't think we should instrument unsafe.Add to look for transitions from "not Go" argument to "Go" result. unsafe.Add should probably still stay a simple add instruction. We can't actually catch all misuse of unsafe. If someone is determined, they will find a way to write something that defeats checkptr. For example the checkptr implementation is not going to catch you doing:
which probably works most of the time. :-)
Understood. It's not an issue because it's not a valid program.
The main thing is to catch honest mistakes, like:
or similar code with struct offsets added to u, say. (The mistake is that u has type uintptr so it's not going to be updated if p moves or stop p from being GC'ed.) Whatever can be done without false positives and at extremely low overhead sounds good. -d=checkptr=1 can still do more if there's something important to keep doing with more overhead. The low overhead one would be enabled for -d=checkptr=0. |
Since upstream is unwilling to drop mmapper or desensitize unsafeptr, I decided to take matters into my own hands by switching to even more "unsafe" conversions to get rid of the useless warning. Kind of ironic, isn't it? golang/go#58625
Background
The documentation for
unsafe.Pointer
currently lists six valid conversion patterns:“Conversion of a
*T1
toPointer
to*T2
… [p]rovided thatT2
is no larger thanT1
and that the two share an equivalent memory layout.”“Conversion of a
Pointer
to auintptr
(but not back toPointer
).”“Conversion of a
Pointer
to auintptr
and back, with arithmetic … in the same expression, with only the intervening arithmetic between them.”“Conversion of a
Pointer
to auintptr
when callingsyscall.Syscall
… [or] in the argument list of a call to a function implemented in assembly.” (Compare proposal: unsafe: clarify unsafe.Pointer rules for package syscall #34684.)“Conversion of the result of
reflect.Value.Pointer
orreflect.Value.UnsafeAddr
fromuintptr
toPointer
… immediately after making the call, in the same expression.”“Conversion of a
reflect.SliceHeader
orreflect.StringHeader
Data
field to or fromPointer
… when interpreting the content of an actual slice or string value.”The
unsafeptr
check provided bycmd/vet
warns about uses that do not follow the above patterns. However, it currently flags many violations inx/sys/unix
(see #41205) when addresses returned by system calls such asmmap
are converted to Go pointers, and in code generated by ebitengine/purego (see #56487) when hard-coded addresses provided by the operating system or linker are converted to Go pointers. In both cases, the program is attempting to create Go pointers that refer to known-valid addresses that are not managed by the Go runtime; notably, the compiler's-d=checkptr
mode does not flag them as invalid at run-time.Ideally, the
unsafe.Pointer
documentation, theunsafeptr
check incmd/vet
, the-d=checkptr
mode incmd/compile
, and the real-world usage insyscall
,x/sys
, and similar low-level libraries should all agree on what is valid. This proposal aims to narrow that gap.Proposal
I propose that we add another allowed case in the
unsafe.Pointer
documentation:The
unsafeptr
check incmd/vet
would be changed to allow the new case, resolving the warnings inx/sys
andebitengine/purego
.The compiler's
-d=checkptr
mode would check conversions fromuintptr
tounsafe.Pointer
. The conversion may throw at run-time if:uintptr
refers to an address managed by Go that is not explicitly pinned (runtime: provide Pinner API for object pinning #46787), ortheuintptr
refers to an invalid (unmapped) nonzero address.[edit: per unsafe: allow conversion of uintptr to unsafe.Pointer when it points to non-Go memory #58625 (comment), unmapped addresses should be allowed as long as they cannot become Go addresses]
Alternatives
The
vet
warning can be worked around today by relying on a liberal reading of the “equivalent memory layout” rule, rewriting(https://go.dev/play/p/ZWZxv6URqTW) as
(https://go.dev/play/p/Wh5f8k0_yyR), on the theory that the memory layout of a
uintptr
must be in some sense “equivalent” to the memory layout ofunsafe.Pointer
. However, I believe that such a rewrite does not capture the intent of the code as accurately, and should not be necessary.(CC @golang/runtime)
The text was updated successfully, but these errors were encountered: