Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop impl of Vec is not idempotent and this is not documented #60822

Closed
gnzlbg opened this issue May 14, 2019 · 33 comments · Fixed by #67559
Closed

Drop impl of Vec is not idempotent and this is not documented #60822

gnzlbg opened this issue May 14, 2019 · 33 comments · Fixed by #67559
Labels
A-collections Area: `std::collection` A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@gnzlbg
Copy link
Contributor

gnzlbg commented May 14, 2019

#60611 shows that the Drop impl of Vec is not idempotent. That is, if Drop fails, the Vec is left in an "un-droppable" state, and trying to re-drop the vector invokes undefined behavior - that is, the vector must be leaked.

It might be possible to make it idempotent without adding significant overhead [0], but I don't know whether we should do this. I think we should be clearer about whether the Drop impl of a type is idempotent or not, since making the wrong assumption can lead to UB, so I believe we should document this somewhere.

We could document this for the Vec type, but maybe this can also be something that can be documented at the top of the libcore/liballoc/libstd crates for all types (e.g. Drop impls of standard types are not idempotent).


[0] Maybe setting the vector len field to zero before dropping the slice elements and having the Drop impl of RawVec set the capacity to zero before exiting is enough to avoid trying to re-drop the elements or trying to deallocate the memory (which is always freed on the first drop attempt).

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 14, 2019

cc @rust-lang/libs

@jonas-schievink jonas-schievink added A-collections Area: `std::collection` A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. C-enhancement Category: An issue proposing an enhancement or a PR with one. labels May 14, 2019
@hanna-kruppe
Copy link
Contributor

It might be possible to make it idempotent without adding significant overhead [0], but I don't know whether we should do this. I think we should be clearer about whether the Drop impl of a type is idempotent or not, since making the wrong assumption can lead to UB, so I believe we should document this somewhere.

"It's not idempotent" should be the safe default assumption, and I know of no case so far where we've documented that a drop glue is idempotent. We might want to mention the general principle in suitable places (e.g., nomicon, UCG) but I don't think we should repeat "as usual, double-dropping this type causes UB" again on all specific types. We should only document the exceptions, if any.

(I am also quite unsure whether going through the effort of guaranteeing that certain types' drop glue is idempotent is a good use of time, as it seems extremely niche and I don't know any use cases for that. But that just means I personally won't invest time in that discussion.)

@SimonSapin
Copy link
Contributor

I think this is not an issue wit Vec. Rather, any unsafe code needs to be careful about panic safety, especially if it is also generic.

FWIW I’ve always assumed that double drop could be as much a memory safety issue as use-after-free. (“One and a half” drop even more so.) I don’t think Drop::drop idempotency should be expected for any type, but that sounds like a decision for @rust-lang/wg-unsafe-code-guidelines more than @rust-lang/libs.

@RalfJung
Copy link
Member

"It's not idempotent" should be the safe default assumption, and I know of no case so far where we've documented that a drop glue is idempotent. We might want to mention the general principle in suitable places (e.g., nomicon, UCG) but I don't think we should repeat "as usual, double-dropping this type causes UB" again on all specific types. We should only document the exceptions, if any.

Fully agreed.

The best you can hope for, IMO, is that a type will make an effort to drop as much as possible even when there's a panic during dropping. Vec actually does that by using the slice drop glue, which, if dropping one element panics, will keep dropping the other elements. This seems more useful than allowing idempotent dropping (it minimizes leakage even without any catch_unwind being involved), and it makes idempotent dropping unnecessary (if you catch a panic, the involved types already did everything they can to drop as much as possible, so there's no point in calling drop again).

I think it makes more sense to improve panic-resiliance of our drop impls than to make them idempotent. For example, AFAIK VecDeque drops two Vec's; if dropping the first panics, the second one will be leaked. This could be improved.

One interesting question I see here is the interaction with the pinning drop guarantee. Does Vec deallocate the backing store if dropping one of the elements panicked? If yes, is that a violation of the drop guarantee (assuming we extended Vec with pinning projections -- fn pin_get(Pin<&mut Vec<T>>, idx: usize) -> Option<Pin<&mut T>>)?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 14, 2019

Today double-drops are not a form of undefined behavior. They could, however, lead to undefined behavior depending on the types involved and how Drop is implemented for those types.

As part of the UCGs one could try to make double-drops UB per se. That would mean that unsafe code needs to make sure that double-drops don't happen, period. That might be a breaking change. If we don't do that, then we have some types for which double-drops are ok, and some types for which they are not, and the only way to tell is either by reading the documentation, or inspecting the type's source code.

I'd rather not recommend people to rely on what the source code does. It suffices that one crate starts relying on some Drop impl being "accidentally idempotent" in libstd today, for us to not be able to evolve that impl in the future without breaking code.

A "Unless stated otherwise, Drop impls of standard library types are not guaranteed to be idempotent, and if they are, we reserve the right to change that without maintaining backwards compatibility" note somewhere might be enough to prevent that.


The best you can hope for, IMO, is that a type will make an effort to drop as much as possible even when there's a panic during dropping. Vec actually does that by using the slice drop glue, which, if dropping one element panics, will keep dropping the other elements.

I expect that to happen in an idempotent Drop impl for Vec as well. Right now, the issue is that all elements that can be dropped are dropped, the backing allocation of the Vec is deallocated, but the len of the Vec is not set to zero, so a double drop will try to double drop the elements again (and then the drop impl of RawVec will be invoked, whose capacity has not been set to zero, which will try to double-free memory).

Does Vec deallocate the backing store if dropping one of the elements panicked?

Yes. When drop panics, the destructors of the fields are invoked. For Vec this means that the contained RawVec is dropped, which deallocates the backing storage of the Vec without setting the capacity of the Vec to zero.

If yes, is that a violation of the drop guarantee (assuming we extended Vec with pinning projections -- fn pin_get(Pin<&mut Vec>, idx: usize) -> Option<Pin<&mut T>>)?

AFAICT, no. If you have a Vec<Pin<Box<T>>> the only thing that will be deallocated without running destructors when that Vec is dropped is the storage of the Pin<Box<T>>. Since the destructor of the Pin<Box<T>> does not run, the destructor of the Box<T> does not run, and the backing allocation of the Box<T> gets leaked, which is fine (some other thread could still write to it).

@SimonSapin
Copy link
Contributor

https://rust-lang.github.io/rfcs/0320-nonzeroing-dynamic-drop.html (both the proposal that is now implemented an the description of the status quo before that) is relevant to the efforts that the language makes to not let double-Drop happen in safe Rust. This guarantee has existed for years, since before 1.0, so writers of unsafe code rely on it.

Although calling Drop::drop twice is not inherently UB in the Rust language, many unsafe libraries are written with the assumption that it never happens. Therefore, writers of generic unsafe code should assume that double drop of a value of a type parameter can cause UB.

Calling Vec::drop twice with the Unix system allocator causes a double free, which is UB. But Vec is only an example.

@RalfJung
Copy link
Member

I'd rather not recommend people to rely on what the source code does. It suffices that one crate starts relying on some Drop impl being "accidentally idempotent" in libstd today, for us to not be able to evolve that impl in the future without breaking code.

Drop is not the only operation that has such issues; pretty much any function you call can be UB or not under certain circumstances depending on implementation details. One example is e.g. relying on Vec::push not to reallocate -- if we did a reserve before, is that guaranteed? What if we did reserve and then pop? And so on.

If we really consider such details to be stable just because they are observable by "this program was not UB but now it is", we'd have to bake an abstract model of all of these data structures into our operational semantics, just to make some more code UB. I think that is a bad approach. UB is for enabling compiler optimizations. It is not for catching clients that exploit unstable details of the current implementation of some library. Conflating these two problems makes the definition of UB much, much more complicated, I think that's a mistake.

Just because some interaction with Vec is not UB currently, doesn't mean we are not allowed to ever change it to be UB. There is no implicit stabilization of the full set of UB-free interactions. I agree catching such "overfit" clients is an important problem, but let's keep that library-level discussion separate from the language-level discussion of what is and is not UB.


I expect that to happen in an idempotent Drop impl for Vec as well. Right now, the issue is that all elements that can be dropped are dropped, the backing allocation of the Vec is deallocated, but the len of the Vec is not set to zero, so a double drop will try to double drop the elements again (and then the drop impl of RawVec will be invoked, whose capacity has not been set to zero, which will try to double-free memory).

I don't understand why you'd even want to call drop again, if we already agree that the first drop should do the maximal amount of dropping that it can. You are basically saying "make Vec's drop idempotent by making the second drop a NOP". I don't see the point.

Seems like you want to write generic code that drops stuff again if the first drop panicked. But what would be a situation where that is ever a good idea? From all I can see, the only place where this can help is if the second drop drops stuff that the first drop "missed" because of the panic. I am saying, if that is the problem, then fix drop to not "miss" stuff.

AFAICT, no. If you have a Vec<Pin<Box>> the only thing that will be deallocated without running destructors when that Vec is dropped is the storage of the Pin<Box>. Since the destructor of the Pin<Box> does not run, the destructor of the Box does not run, and the backing allocation of the Box gets leaked, which is fine (some other thread could still write to it).

This misses the point, because Box<T>: Unpin.

The interesting case is a Vec<IntrusiveListElement>. Then we could pin the Vec, and from our Pin<&mut Vec<IntrusiveListElement>> get a Pin<&mut IntrusiveListElement>, and insert that into the list. Now we are in a situation where if the Vec's backing store gets deallocated without dropping the IntrusiveListElement, we have a safety violation. But it seems to me if IntrusiveListElement::drop can guarantee that it itself does not panic, then we are okay: if an earlier element in the list panics while dropping (maybe it's a heterogeneous list through an enum or trait objects), we know we still get dropped properly.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 14, 2019

Now we are in a situation where if the Vec's backing store gets deallocated without dropping the IntrusiveListElement, we have a safety violation.

How can you drop the Vec<IntrusiveListElement> while Pin<&mut IntrusiveListElement>s into the vector are still live ?

One example is e.g. relying on Vec::push not to reallocate -- if we did a reserve before, is that guaranteed? What if we did reserve and then pop? And so on.

If we really consider such details to be stable just because they are observable by "this program was not UB but now it is", we'd have to bake an abstract model of all of these data structures into our operational semantics, just to make some more code UB.

The documentation of Vec guarantees these details, e.g., see Vec's capacity-and-reallocation section, so AFAICT Rust unsafe code can rely on these details, and breaking that code would be an API breaking change.

@RalfJung
Copy link
Member

How can you drop the Vec while Pin<&mut IntrusiveListElement>s into the vector are still live ?

I was talking about the implicit drop that happens when the vector goes out of scope.

The documentation of Vec guarantees these details, e.g., see Vec's capacity-and-reallocation section, so AFAICT Rust unsafe code can rely on these details, and breaking that code would be an API breaking change.

I am aware. When libraries decide to document such details, clients may of course rely on them. VecDeque has no such section even though many similar questions apply; thus, clients may not rely on the same properties for VecDeque even if they happen to be true currently.

This matches the situation for idempotent drop: clients may rely on this if and only if the library decides to document this as a guarantee.

I don't think we should change our language spec to detect clients relying on VecDeque implementation details, and similarly, I don't think we should change our language spec to detect clients that rely on some drop being idempotent even though that is not documented (e.g., double-drop of an empty Vec that had shrink_to_fit called on it).

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 14, 2019

https://rust-lang.github.io/rfcs/0320-nonzeroing-dynamic-drop.html (both the proposal that is now implemented an the description of the status quo before that) is relevant to the efforts that the language makes to not let double-Drop happen in safe Rust. This guarantee has existed for years, since before 1.0, so writers of unsafe code rely on it.

@SimonSapin The problem is that safe Rust can be called from unsafe Rust, and it isn't clear to me from that RFC whether writers of unsafe code can rely on other unsafe code not performing a double-drop. It isn't clear either whether unsafe code can assume that double-dropping something is ok.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 14, 2019

I am aware. When libraries decide to document such details, clients may of course rely on them. VecDeque has no such section even though many similar questions apply; thus, clients may not rely on the same properties for VecDeque even if they happen to be true currently.

Are you saying that this is analogous to whether users should be able to rely on double-drops invoking / not invoking UB? These data-structure properties feel quite obvious to me, but I have no idea how users can today learn that, at least when using the libstd types, they should always assume that Drop is not idempotent unless a type guarantees otherwise. AFAICT they can just try it for some type, and if it works, deduce that it is ok. Then they publish their crate, and some time later we break their code. Saying that "we did not guarantee that it worked anywhere" does not change the fact that the code now is broken, and we are not warning them about this either. So even if we might be right in that technically we are allowed to break that code, it might turn out that in practice, now we cannot, and that user has somehow managed to, by accident, specify that double-drops for some type must be ok. We'd have a stronger case if we explicitly call this out in the docs, warn when users do this (or panic or similar), etc.

@RalfJung
Copy link
Member

RalfJung commented May 14, 2019

These data-structure properties feel quite obvious to me, but I have no idea how users can today learn that, at least when using the libstd types, they should always assume that Drop is not idempotent unless a type guarantees otherwise.

Both seem equally obvious to me. We also don't document that you cannot push to a vector after dropping it. drop leaves the data structure is a "deinitialized" state in which no further operation can be performed on it. I don't understand why you or anyone else might think that dropping would be special in the sense of being the only operation that you may call on such a "deinitialized" data structure. I mean, nobody thinks that you can free an already free'd piece of the heap either, right?

Surely we don't have to exhaustively list everything we do not guarantee. That could get exhausting indeed...

I'm all for being more explicit, and in that sense I am okay with noting this explicitly somewhere. But in no way does that mean that other undocumented assumptions can randomly be conjured. (This is how the discussion at hand feels to me: coming up with a random assumption and then arguing that just because nothing says that you can not assume this, hence you can. I understand you do not consider it random, but I see nothing distinguishing this from any other not-documented-not-to-hold assumption that people might come up with.)

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 14, 2019 via email

@Centril
Copy link
Contributor

Centril commented May 15, 2019

We'd have a stronger case if we explicitly call this out in the docs, warn when users do this (or panic or similar), etc.

Surely we don't have to exhaustively list everything we do not guarantee. That could get exhausting indeed...

I mostly agree with @RalfJung, especially wrt. to turning "unstated non-guarantees" through random assumptions into guarantees. However, I think if there are reasonable assumptions that could be made that we don't want to guarantee, it's a practical idea to disabuse users of those assumptions.

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented May 15, 2019

Note however that @gnzlbg themselves demonstrated that currently double-drops (of Vecs) don't work -- elements get double-dropped on the second drop. That means if a hypothetical users actually tries to determine this behavior experimentally, they'd quickly find double frees and similar issues if they are at all diligent.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 15, 2019

@rkruppe that holds as long as users try it with types for which this is the case.


I mostly agree with @RalfJung, especially wrt. to turning "unstated non-guarantees" through random assumptions into guarantees.

@Centril maybe you can explain to me how this assumption feels random to you (and maybe the others).

Rust allows fallible drop, that is, drop can fail. I don't understand how from this it follows that, because drop failed, one cannot attempt to drop the value again. The only idea I have about how this could be the case is if drop is infallible, that is, even if drop fails, the value is dropped, but then I don't understand why we allow drops to fail in the first place.

AFAICT, drop failed, therefore the value was not dropped, therefore I can try to drop it again, and that is not a double-drop because the value wasn't dropped in the first place. It is just a second attempt to drop a value, and this issue is about whether attempting to drop should be idempotent until drop succeeds (once the value is dropped, you don't longer own it, so attempting to drop further is a use-after-move kind of UB AFAICT).

@comex
Copy link
Contributor

comex commented May 15, 2019

If this is the case, to me this imply that a drop that fails actually succeeds, such that drops are infallible, which makes me wonder why do we allow drop to panic at all?

My impression is that it's been considered bad form in general to panic in a Drop implementation, if only because it causes an abort if the drop occurred during unwinding. It's allowed just because the alternative (always aborting when a Drop impl panics) is worse.

@comex
Copy link
Contributor

comex commented May 15, 2019

...however, this could certainly be documented better. Currently the documentation for Drop::drop has an oddly worded note:

Panics

Given that a panic! will call drop as it unwinds, any panic! in a drop implementation will likely abort.

I think it's trying to explain that double panic = abort, but as worded it doesn't really make sense.

It should probably explain that more clearly, and also explicitly state that panicking in drop is discouraged.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 15, 2019

It should probably explain that more clearly, and also explicitly state that panicking in drop is discouraged.

It should also cover the case in which the double panic does not happen, and suggest what users should assume if the details of a particular drop implementation are not guaranteed.

For example, that a dropped value for which drop fails must, in general, be leaked, because it is, in general, left in an unspecified partially-dropped state which is neither "initialized" nor "deinitialized" and which does not, in general, support an attempt to re-drop the value or any other operation.


It's allowed just because the alternative (always aborting when a Drop impl panics) is worse.

How bad would it be to make drop impls nounwind?

@RalfJung
Copy link
Member

You seem to be arguing that a drop that fails (panics) actually deinitializes the value and therefore trying to re-drop isn’t a logical operation to perform.

Well, at least I am arguing that that might be the case. You have no way to tell.

Would you expect it to be legal to push to a vector after drop panicked? I guess not. Same for calling drop again.

Rust allows fallible drop, that is, drop can fail. I don't understand how from this it follows that, because drop failed, one cannot attempt to drop the value again.

"panic" doesn't mean "I failed in a clean way and nothing happened", it means "something went wrong half-way through this operation and any part of it might or might not have happened".

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 15, 2019

Would you expect it to be legal to push to a vector after drop panicked? I guess not.

Why not? If the value was not dropped, then I'd expect that to work just like pushing into an empty vector would.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 15, 2019

"panic" doesn't mean "I failed in a clean way and nothing happened", it means "something went wrong half-way through this operation and any part of it might or might not have happened".

For Vec, on panic, the memory is deallocated, but the length and capacity of the vector are not set to 0. Are you saying that it is ok to leave safe abstractions over unsafe code in such an invalid state when a panic! happens?

@RalfJung
Copy link
Member

RalfJung commented May 15, 2019

Why not? If the value was not dropped, then I'd expect that to work just like pushing into an empty vector would.

"panic during drop" is very different from "not dropped". This is true for any operation. I don't understand why you would assume that an operation that panicked had no effect and thus can be considered as having not happened.

You keep just changing my words "drop panicked" into "not dropped". Please don't. :)

For Vec, on panic, the memory is deallocated, but the length and capacity of the vector are not set to 0. Are you saying that it is ok to leave safe abstractions over unsafe code in such an invalid state when a panic! happens?

I think it is okay to leave safe abstrcations in an invalid state when drop completes (whether that is return or unwind), because we can rely on this being the last operation. This is actually an advantage of Rust over C++, we don't need run-time tracking for this as we have proper static move tracking.

drop would be the only operation where the "return" and "unwind" ways of completing would have widely different guarantees. I see no grounds for assuming it is special.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 15, 2019

"panic during drop" is very different from "not dropped". This is true for any operation. I don't understand why you would assume that an operation that panicked had no effect

Where are you getting the impression that I assume that? How could you make Vec::drop have no effect? AFAICT you would need to revert the drop of the already dropped elements of the vector which is not possible.

because we can rely on this being the last operation.

How can we rely on this? The example I show using drop_in_place corrupts memory at run-time, and is neither rejected statically, nor at run-time (e.g. via a proper run-time error).

@RalfJung
Copy link
Member

RalfJung commented May 15, 2019

Where are you getting the impression that I assume that?

You keep saying "not dropped" for "drop panicked", making it sound as if a drop-that-panicked was the same thing as if drop had never been called. I am sorry if I misunderstood you here, but I also don't know any other way to interpret your words.

How can we rely on this? The example I show using drop_in_place corrupts memory at run-time, and is neither rejected statically, nor at run-time (e.g. via a proper run-time error).

The Rust compiler ensures that drop is called exactly once. It inserts boolean variables to track this at run-time if necessary. Calling Drop::drop manually from safe code is impossible (even though its signature looks like it should be possible).

Sure, unsafe code can circumvent this. Needless to say, unsafe code can circumvent any compiler-enforced restriction. But "drop is called exactly once" has been deeply entrenched in the guarantees and considerations around dropping in Rust as far back as I can remember. There is plenty of unsafe code in libstd that relies on this. This is as close to a guarantee as we get without formal proofs.

The docs for drop_in_place say "using the pointed-to value after calling drop_in_place can cause undefined behavior". It doesn't have an exception for when this operation panicked.

@RalfJung
Copy link
Member

RalfJung commented May 15, 2019

I am repeating myself here, but just to be clear: I am all for improving the docs. But I find it implausible to assume that doing double-drops could ever be legal (in generic code). There is plenty of evidence for code relying on no double drops happening, there is plenty of mechanism to make sure this doesn't happen in safe code, drop_in_place says so in the docs. I don't even know of a drop in libstd that would be safe to call twice (probably there are some). As a random example, MutexGuard::drop is also not idempotent and might release a lock not even held by the current thread if it gets called a second time.

The run-time tracking of the "move" state that the compiler does would literally be useless if drop was idempotent! Heck, before we had proper move tracking we even had a compiler-implemented mechanism to make drop idempotent (overwriting memory with some bit pattern, I forgot the details) and we got rid of it because of its unnecessary run-time cost.

It feels to me like you are altering the deal here, and the deal has been set and clear for many years. We should discuss how to improve the docs because clearly it is still possible to assume that drop might be idempotent , but there is no point discussing the deal itself.

Unfortunately, I do not understand at all where you got the idea from that drop might be idempotent, so I don't know where the docs could be improved to fix this. This issue has nothing to do with Vec::drop, it is more general. Where would you expect such things to be said? In drop_in_place? We basically already say that there but maybe the case where drop panics could be called out more explicitly. Or maybe the Nomicon should have a chapter on dropping?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 15, 2019

You keep saying "not dropped" for "drop panicked", making it sound as if a drop-that-panicked was the same thing as if drop had never been called.

By "not dropped" I precisely mean "a value was not dropped". A drop-that-panics is a drop that fails, which is not the same as if drop was never called (how could it panic if it was never called?).

Unfortunately, I do not understand at all where you got the idea from that drop might be idempotent, so I don't know where the docs could be improved to fix this. This issue has nothing to do with Vec::drop, it is more general. Where would you expect such things to be said? In drop_in_place? We basically already say that there but maybe the case where drop panics could be called out more explicitly. Or maybe the Nomicon should have a chapter on dropping?

I assumed a couple of things. First, that a drop that fails does not successfully drop a value, and since the value has not been dropped, code is allowed to try to drop it again. Again, this does not mean that I thought that a failed call to Drop::drop would have no effect here. It can have an effect, but that effect is not "the value is dropped" if the call fails - the effect is just "trying to drop the value failed", which is different from "drop was never called". Nothing nowhere taught me otherwise. For me, it just feel obvious that a value that has not been dropped (which is what AFAICT happens if calling drop for it fails), can be dropped just like any other value (unless these values are special somehow).

Second, I assumed that correct safe abstractions over unsafe code made sure this works. My own abstractions do this, e.g., SliceDeque is left in a valid but unspecified state if Drop::drop fails (len = capacity = 0, ptr = null, just like SliceDeque::new()). This means that, after Drop::drop for SliceDeque fails, you can call SliceDeque::push or drop the value without invoking undefined behavior. Last weekend, I added one of the Vec test to SliceDeque last week, and this test ran slower. The reason was that Vec does not put itself into a valid state after Drop fails, and this results in UB in the example shown, while SliceDeque was putting itself into a valid state in a sub-optimal way (enough to be noticeable, and easy to fix).

I then asked on Discord why was that, and whether the Drop impl for Vec has a bug, or whether Drop impls never have to account for that, and the discussion was that Drop for Vec is not idempotent, but whether this is by design or a bug was not clear, so I filled this bug.

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented May 15, 2019

@gnzlbg It seems, in other words, that you assumed the drop glue is exception safe in a fairly strong way (even when it fails, it upholds the safety invariants necessary for dropping). Of course, as you might guess from my phrasing, that seems like a pretty big and unwarranted assumption to me. We don't generally expect any Rust code to be very exception safe (that's why we have marker traits like UnwindSafe). Some basic exception safety is required from unsafe code to ensure that safe code mustn't be able to cause UB just by catching a panic, but no more. Yet to run the drop glue again after the drop glue fails, you have to dip into unsafe code, so that principle doesn't apply.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 15, 2019

@rkruppe The whole safe Vec API is exception safe in such a strong way (e.g. Vec::truncate). AFAICT, the only part of the API for which this does not hold is the impl of Drop for Vec, so I suppose the assumption might have come from there.

@hanna-kruppe
Copy link
Contributor

The entire safe API, yes, by necessity (and only as exception safe as required to maintain memory safety -- e.g., it will happily leak memory). Though there don't seem to be any unsafe APIs (or methods at least) that have to worry about panics, so I guess it's true that distinction might not be apparent just from looking at Vec.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented May 15, 2019

From the point-of-view of safe code, it does not matter whether Drop::drop is exception safe, because it can only be called once.

From the POV of unsafe code, I knew that double-drops cannot happen, but what I did not know is that what also cannot happen is Drop::drop getting called twice. I thought, of course that can happen, if unsafe code calls drop_in_place, and that panics, and then Drop::drop is called again, that will call Drop::drop twice, but is not a double-drop, because the first time the value failed to drop. When @RalfJung mentions that the issue is not double-drops, but calling Drop::drop more than once, that's news to me.

@Mark-Simulacrum
Copy link
Member

I would like to close this issue. We have rust-lang/reference#348 open to document the exact guarantees around drop and panicking, and it seems like that would be the place for this; I agree with others on this thread that I don't really see how one could assume that initiating drop twice in any way (whether it be due to unwinding or otherwise) is safe.

Specifically, any method which conceptually has a signature of T -> () or so, if it panics, shouldn't be called again assuming that you still have a T; that seems like it should be fairly easy to understand, unless I'm missing something, as the T is almost guaranteed to have dropped already (so it would be odd to assume that you can drop it again, as that calls something on that T -- you've lost ownership already).

I knew that double-drops cannot happen, but what I did not know is that what also cannot happen is Drop::drop getting called twice

This sentence is confusing to me -- double-drops cannot happen sounds equivalent to Drop::drop getting called twice; Drop::drop is the drop operation?

Specifically, I think the confusion lies here: " I thought, of course that can happen, if unsafe code calls drop_in_place, and that panics, and then Drop::drop is called again, that will call Drop::drop twice, but is not a double-drop, because the first time the value failed to drop" -- this is a double drop. When drop_in_place panics, it cannot assumed to have done nothing, so calling drop again means that you're violating the safety contract of drop_in_place. This is why ManuallyDrop or equivalents are commonly needed when calling drop_in_place, as you need to be sure that the value you're dropping in place is not dropped again (during unwinding or otherwise), otherwise your unsafe code is incorrect.

Presuming there's not further discussion here I will aim to close this issue in a week or so.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Dec 23, 2019

This sentence is confusing to me -- double-drops cannot happen sounds equivalent to Drop::drop getting called twice; Drop::drop is the drop operation?

In the context of this discussion, the question that needs to be properly answered somewhere in the reference is: If a call to Drop::drop fails, was the value dropped? If the answer is "no", then calling Drop::drop again isn't a "double-drop" because the value has not been dropped.

Looking at the comments here there is a certain amount of consensus that a Drop::drop call that fails "succeeds" in dropping the value, which is why even if Drop::drop fails, calling it again on a value drops the value twice and that's a double drop, which is UB. That's what should be documented somewhere in the reference.

Presuming there's not further discussion here I will aim to close this issue in a week or so.

Which part of the reference issue summarizes / super-seeds this discussion ? If there isn't any, could you post a summary of this discussion there before closing this, so that documenting this still gets tracked somewhere?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-collections Area: `std::collection` A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants