-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarification about exception safety #16
Comments
Thanks for reaching out, that's a good question and should be clarified. A bit of history to put this into context. I've mostly looked at algorithmic and code-gen details of the various C, C++ and Rust implementations I investigated over the last 1.5ish years. And have mostly been writing and refining sort implementations in Rust. In contrast to C++, Rust has a destructive move and deep copies are behind the Given this, one could reasonably make a distinction between the different kinds of effects exceptions have for those two categories of types. However I'm not convinced doing so is in-line with the bigger picture here. The analysis explores what happens if users make mistakes and or misuse an interface. You say "which is not amenable to UB if due diligence is taken", the "if due diligence is taken" part is already incompatible with that perspective. Even assuming only types following C++ best practices, just sorting numbers may inadvertently also lead to UB in otherwise sound code. The example I provide here would conceptually be the exact same in C++. A memory arena with a destructor based on indices. Of course one can argue that the code is wrong or badly constructed, and that's exactly the point. The analysis asks what happens if users make mistakes, which they invariably do given enough code. Me opening a bug for it should be indicative of my opinion that this is a bug in the implementation, no matter Rust or C++. One could also argue that users should not be throwing exceptions in the comparison function. But ensuring the absence of transitive unmarked exceptions reliably over time as code evolves is exceptionally difficult. In my opinion giving people a language with unwinding and then applying arbitrary rules, they lack any language level tooling to follow, like "but don't use it here" is an untenable position for users. I've created a PR #17, please let me know if that addresses your concerns. |
Hello, Thank you for the detailed and rapid response, I appreciate you dedicating time to the issue I raised. As a full time Rust developer that switched from C++ out of concern for the issues created by the lack of memory safety of the latter language, rest assured that I fully understand what you mean when you speak of the importance of handling developer errors gracefully. In short, "make the API hard to misuse", which I believe is an important tenet of Rust. That being said, reusing the "exception safety" terminology has the unfortunate side-effect of leading people already aware of that terminology to believe that the article claims that some of the C++ implementations of the std provide no exception safety whatsoever. People in this category that don't simultaneously have a good knowledge of the guarantees provided by the C++ std may disseminate that claim that is stronger than what the article intended. People that know of both the exception safety terminology and the guarantees of the std on the other hand, might dismiss the article on that basis, which would be a shame because it contains good information and is the result, as you said, of a lot of work. I took a look at the PR you opened, and while it makes it clearer that you're not using the common definition of exception safety, I think this obscures a little whether the individual algorithms that fail this property suffer from item duplications (insta UB in C++) or "merely" from moved out items (surprising but recoverable). I agree that in spirit, the lack of this form of exception safety is already an issue that should be mentioned, but I also think it should definitely be separated from the lack of the other form of exception safety. There exists accepted terminology in C++ to differentiate between forms of exception safety:
Here's a source explaining the terminology: https://arne-mertz.de/2015/12/levels-of-exception-safety/, it is also featured on the wikipedia article for exception safety. So I feel that the shortest path to fully address my concern would be to:
Optionally, I would love knowing which algorithms in the list uphold "no exception safety", vs "basic exception safety", vs "strong exception safety", but that's additional work to determine which is which, so that's not needed for the purpose of clarifying what is said on exception safety, only for the purpose of making the article even more informative. Thank you for your first PR trying to address my concern, I wrote this message in the hope of fully clarifying the matter. PS:
Fully agreed, but I think the argument that strong exception safety is needed, actually (especially in a language that is not memory safe, ignoring The fact that strong exception safety does not impede performance for sorting compared with basic exception safety is a key take-away of your article that could be used in such a piece (although I'm unsure if it can be generalized to all situations). |
I agree with the points you make. Especially re-using existing terminology to avoid confusion. I'm open to split the table into basic and strong exception safety. I'm not entirely sure how to test the sort implementations that don't support move only types. Maybe a temporary change to the |
Yes I think it would make for an interesting footnote or even column in the table.
Re-reading the section on the second kind of observation safety, I'm not 100% sure how it would be defined in the case where strong exception safety is missing. I see two possibilities:
Actually, your questions raise a good point on the definition of strong exception safety I shared in my previous message. Strong exception safety in C++ is often understood as "transactional semantics", where a failure due to an exception restores the object to its original state. For the input, it would mean that the original order would have been preserved and the elements would have been untouched by side effects of the comparison function. That's even stronger that the guarantee you've been using, that all the original items are still in the container, but now in unspecified order and with the side-effects of the comparison functions applied to the items that were compared. So in this hyper strong definition of exception safety, the one you've been using is merely a special case of "basic exception safety" (where the input is in an arbitrary state; here, a specific one). I still think your definition have value because "have all the original items in an unspecified order" is what I would intuitively expect of an interrupted sort routine. I'm just unsure how we should label it now... Ack, are we back to the drawing board? |
I'm not sure that's true. You want the side-effects applied, otherwise you get direct UB emilk/drop-merge-sort#23. To add a bit onto that, at least from a Rust perspective that's the way it must be, given in the linked code there is zero |
@dureuill as a short update. I've asked Klaus Iglberger for his opinion on the topic and am still waiting for feedback from him. I'd appreciate ideas from your side how to address the issue in the writeup. The original suggestion seems problematic to me in light of the ongoing discussion. |
Ah apologies @Voultapher, I ran out of time to work on this. I'll try to suggest something during the week. I think we can both be more explicit on what guarantees we consider and why, and keep the additions constrained. |
In the absence of suggestions, I took it upon myself to update the section on exception safety. Feel free to open a PR if you disagree with the current wording. |
Hello, Apologies for leaving you hanging there. I put in the time to update the section in a way that fully resolves my concern in #24 Please feel free to merge it or ask for modifications. |
Hello,
Thank you for this article, it is a very interesting read that demonstrates very thorough work.
Unfortunately there's a claim that I'm unable to independently reproduce.
The article rates various implementations for a property called "exception safety", and states that implementations either leave the input unmodified, or may leave duplicated objects in the input.
In particular, the sort implementations of the C++ std are marked as not exception safe, which implies that they cause object duplication. This is a surprising result that the standard library would do this, because a tenet of C++ is to use move and copy operators when available, and avoid untyped bitwise copies.
Trying to replicate that result, this is not what I'm seeing for
cpp_std_libcxx_stable
at least.I do see that the input is not preserved; it now contains "moved-out" items (
nullptr
s in the case of aunique_ptr
), but it doesn't contain duplicated objects. Here's a link to a Godbolt with my experiment.These two issues are very distinct, because the one I observe leaves the input in a "valid, but unspecified state", which is not amenable to UB if due diligence is taken, while object duplication of the input would necessarily cause UB by way of double free etc. "Exception safety" is generally understood as "an exception at a certain point in the program causes UB", not the stronger variant of "the input is left in its original state".
Unfortunately, as far as I can understand, the code of the benchmark only uses types that are trivially copyable; in C++ this can cause specialized codepaths to be used, which may duplicate elements of the input: these particular results are not generalizable to types that are not trivially (or at all) copyable.
Would you mind to augment the benchmark with results on non copyable types that would demonstrate the object duplication, or clarify in the text that no object duplication can occur in the sort implementations of the C++ standard library?
Additionally, in the interest of clarity, if the C++ standard library does not cause object duplication (for objects that are non-copyable), can you either enrich the rating for "exception safety" to distinguish between "will lead to UB" and the weaker "may not preserve the input", and also clarify that the article is using a stronger-than-usual definition of exception safety?
Apologies if there's anything I misunderstood or missed. Also, I understand that my quick attempt at reproducing your results might be insufficient, and that UB is an elusive beast that is not always easy to observe. Hence why I'm asking for a more reliable reproduction.
Again, thank you for your work on this article, which is in no way invalidated by this clarification request.
The text was updated successfully, but these errors were encountered: