Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deallocating the storage of a value without running its destructor is UB #693

Closed
wants to merge 1 commit into from

Conversation

gnzlbg
Copy link
Contributor

@gnzlbg gnzlbg commented Oct 15, 2019

This is required for Pin to be sound, but I couldn't find it documented anywhere.

cc @RalfJung @Centril

@RalfJung
Copy link
Member

Nope, this is not UB. With this clause, mem::forget would be unsound.

It is also not correct that this is required for Pin. Pin imposes a library contract that memory pointed to by a pinned pointer must be dropped before being deallocated. But this requirement is specific to pinned pointers and does not extend to other pointers. It is also not something the compiler or the Abstract Machine needs to know about.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 16, 2019

Makes sense. So to check that IIUC, deallocating a stack frame containing a Pin is not UB, it just violates the library contract of Pin - and what would be UB would be if, due to this contract violation, some other thread ends up doing a read-after-free or similar, right ?

@gnzlbg gnzlbg closed this Oct 16, 2019
@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 16, 2019

@RalfJung one way in which mem::forget is special, is that mem::forget owns the memory it deallocates without calling destructors.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 16, 2019

OTOH a longjmp deallocates memory it does not own without calling destructors, that's UB, but I think this kind of UB is already guaranteed, since in a sense this violates ownership, right?

@RalfJung
Copy link
Member

Makes sense. So to check that IIUC, deallocating a stack frame containing a Pin is not UB, it just violates the library contract of Pin - and what would be UB would be if, due to this contract violation, some other thread ends up doing a read-after-free or similar, right ?

Yes.

OTOH a longjmp deallocates memory it does not own without calling destructors, that's UB, but I think this kind of UB is already guaranteed, since in a sense this violates ownership, right?

There is potential UB here from use-after-free if e.g. another thread had pointers to this stack frame. I don't think we need a new error condition in the Abstract Machine for this.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 16, 2019

I don't think we need a new error condition in the Abstract Machine for this.

No, I don't think so either. Do you know if miri has a model for unwinding?

My mental model is that the Rust stack is a Vec<StackFrame>, where each StackFrame is a enum with different fields representing the different states of this frame, and this enum implements Drop. Dropping a stack frame, drops its fields. The current thread of execution owns its own vector of stack frames, and advances its states as execution proceeds. When unwinding occurs, it just checks whether the current stack frame stops the unwind, and if so, advances it to the state in which this happens. If the current frame does not stop unwinding, it just drops the StackFrame, and that drops the appropriate fields.

What a setjmp then does, is save a StackFrame at some state somewhere. Then when a longjmp happens, the frames up to and including the one with the setjmp are mem::forgetted, and the previously saved frame is repushed.

There are some abstractions, like Pin, Thread, etc. that allow passing references to some variable in the stack frame to, e.g., a different thread of execution. These abstractions are safe in the presence of unwinding only if destructors are run, because they use the destructor to make sure that those references do not dangle when the StackFrame is dropped.

Since longjmp mem::forgets the StackFrame, it breaks the safety invariant of these abstractions., and that's why longjmp is unsafe.

Does something like this sounds about right?

@RalfJung
Copy link
Member

Do you know if miri has a model for unwinding?

It doesn't. There are some PRs though that are waiting to be reviewed by me. It's just, some people keep bringing up so much stuff in the UCG, I just don't have time for code review any more.^^

What a setjmp then does, is save a StackFrame at some state somewhere. Then when a longjmp happens, the frames up to and including the one with the setjmp are mem::forgetted, and the previously saved frame is repushed.

Doesn't setjmp just remember the stack heigh at the time it is called, and longjmp is the same as Vec::set_len.

Does something like this sounds about right?

Mostly. One detail is that really every local variable is its own "allocated object" (we don't allow ptr arithmetic between them).

You can incorporate that into your model by saying that StackFrame consists of pointers to the local variables. These are Box pointers, so the drop story still works out the same.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 16, 2019

Doesn't setjmp just remember the stack heigh at the time it is called, and longjmp is the same as Vec::set_len.

In most platforms, yes. In some, it is implemented with unwinding (catch_unwind + panic).

@RalfJung
Copy link
Member

I am talking about the abstract spec, not the implementation.

Why the duplication of the entire frame? That's observably different if there are pointers to other parts of this frame that get mutated between setjmp and longjmp.

(Also I feel a closed incorrect issue is the wrong place to discuss this. Could you find a better place?)

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 17, 2019

(Also I feel a closed incorrect issue is the wrong place to discuss this. Could you find a better place?)

I think it might be better to discuss this in the miri repo, as part of how to implement unwinding there maybe ?

Why the duplication of the entire frame? That's observably different if there are pointers to other parts of this frame that get mutated between setjmp and longjmp.

I was only talking here about the case where, if such pointers exist, they are still safe to use afterwards - only the allocations in the stack for which such pointers do not exist and for which safety invariants are not violated would be "ok" to deallocate.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 17, 2019

That is, there are certain parts of those frames that's ok to just leak, and parts that aren't, so maybe the realization is that it does not make sense to consider longjmp to work "at the frame level", but to look at it as deallocating each of the allocations within a frame, and only if all the allocations are deallocated, then the frame is deallocated, and longjmp move to the next frame.

@RalfJung
Copy link
Member

RalfJung commented Oct 17, 2019

I think it might be better to discuss this in the miri repo, as part of how to implement unwinding there maybe ?

I don't see how your model is related to that. Unwinding is being implemented at rust-lang/rust#60026 and rust-lang/miri#693. But the details are far removed from what we discussed here -- there's no question here about what happens when you unwind, right? The operational behavior implemented by Rust is clear. You were just describing a more abstract mental model.

That is, there are certain parts of those frames that's ok to just leak, and parts that aren't, so maybe the realization is that it does not make sense to consider longjmp to work "at the frame level", but to look at it as deallocating each of the allocations within a frame, and only if all the allocations are deallocated, then the frame is deallocated, and longjmp move to the next frame.

Sure. I don't see any fundamental difference between the two. A stack frame in Miri has a list of locals that longjmp would all just kill (deallocate) without running any drop. This is code that Miri already has to implement popping a stack frame:

https://github.com/rust-lang/rust/blob/a16dca337de610986252bb800953e57bf395863f/src/librustc_mir/interpret/eval_context.rs#L580-L582

Notice that while unwinding is in the process of being implemented, that's just the normal panic-induced unwinding (that's what I assumed you referred to). This just means running cleanup blocks. There's nothing like SJLJ in Miri.

I now feel like you might have said "unwind" to refer to SJLJ, which is confusing me a lot.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 17, 2019

I now feel like you might have said "unwind" to refer to SJLJ, which is confusing me a lot.

Yes, I was saying "unwind" in the context of SJLJ.

Notice that while unwinding is in the process of being implemented, that's just the normal panic-induced unwinding (that's what I assumed you referred to). This just means running cleanup blocks. There's nothing like SJLJ in Miri.

What I was discussing is a more general framework for talking about unwinding, where the only difference between "panic" unwinding and SJLJ unwinding is that when "panic" unwinding calls drop on a stack frame field, SJLJ just calls forget instead.

@RalfJung
Copy link
Member

In Miri terms, that's just popping a stack frame without running any of the cleanup or unwind code.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 17, 2019

Notice that, opposed to catch_unwind, a C setjmp at least doesn't take a closure, so the frame where the setjmp happens is only partially unwound. A Rust API for this might look differently though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants