Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarification about exception safety #16

Closed
dureuill opened this issue Oct 6, 2023 · 9 comments · Fixed by #17
Closed

Clarification about exception safety #16

dureuill opened this issue Oct 6, 2023 · 9 comments · Fixed by #17
Assignees

Comments

@dureuill
Copy link
Contributor

dureuill commented Oct 6, 2023

Hello,

Thank you for this article, it is a very interesting read that demonstrates very thorough work.

Unfortunately there's a claim that I'm unable to independently reproduce.

The article rates various implementations for a property called "exception safety", and states that implementations either leave the input unmodified, or may leave duplicated objects in the input.

In particular, the sort implementations of the C++ std are marked as not exception safe, which implies that they cause object duplication. This is a surprising result that the standard library would do this, because a tenet of C++ is to use move and copy operators when available, and avoid untyped bitwise copies.

Trying to replicate that result, this is not what I'm seeing for cpp_std_libcxx_stable at least.
I do see that the input is not preserved; it now contains "moved-out" items (nullptrs in the case of a unique_ptr), but it doesn't contain duplicated objects. Here's a link to a Godbolt with my experiment.

These two issues are very distinct, because the one I observe leaves the input in a "valid, but unspecified state", which is not amenable to UB if due diligence is taken, while object duplication of the input would necessarily cause UB by way of double free etc. "Exception safety" is generally understood as "an exception at a certain point in the program causes UB", not the stronger variant of "the input is left in its original state".

Unfortunately, as far as I can understand, the code of the benchmark only uses types that are trivially copyable; in C++ this can cause specialized codepaths to be used, which may duplicate elements of the input: these particular results are not generalizable to types that are not trivially (or at all) copyable.

Would you mind to augment the benchmark with results on non copyable types that would demonstrate the object duplication, or clarify in the text that no object duplication can occur in the sort implementations of the C++ standard library?

Additionally, in the interest of clarity, if the C++ standard library does not cause object duplication (for objects that are non-copyable), can you either enrich the rating for "exception safety" to distinguish between "will lead to UB" and the weaker "may not preserve the input", and also clarify that the article is using a stronger-than-usual definition of exception safety?

Apologies if there's anything I misunderstood or missed. Also, I understand that my quick attempt at reproducing your results might be insufficient, and that UB is an elusive beast that is not always easy to observe. Hence why I'm asking for a more reliable reproduction.

Again, thank you for your work on this article, which is in no way invalidated by this clarification request.

@Voultapher
Copy link
Owner

Voultapher commented Oct 6, 2023

Thanks for reaching out, that's a good question and should be clarified.

A bit of history to put this into context. I've mostly looked at algorithmic and code-gen details of the various C, C++ and Rust implementations I investigated over the last 1.5ish years. And have mostly been writing and refining sort implementations in Rust. In contrast to C++, Rust has a destructive move and deep copies are behind the Clone trait, while bit level copy semantics are expressed via the Copy trait. The standard library sort interface in Rust is defined as requiring the type T to implement the Ord trait. If you are not familiar with traits, think of them as C++ concepts in this context. It requires neither Copy nor Clone. But clearly somehow it has to swap values. If you have a mutable slice/span/view as input, you can safely swap elements by creating a temporary bitwise copy of the value. Having spent so much time with this idiom, I incorrectly assumed that would also apply to the C++ standard library sort interface. But the C++ standard library documentation states "The type of dereferenced RandomIt must meet the requirements of MoveAssignable and MoveConstructible.". Rust sort implementation authors that came before me and who's work my work is based on had to solve the problem of only being allowed to do bit level copies, even for types where doing so is semantically not a copy and must never be observed by the user. Here is an example, MergeHole is what is know as a drop guard that patches up the state on normal exit or unwind. As such there is no immediate distinction between trivially copyable types and those that are not. I hope this helps explain the origin of my mistake.

Given this, one could reasonably make a distinction between the different kinds of effects exceptions have for those two categories of types. However I'm not convinced doing so is in-line with the bigger picture here. The analysis explores what happens if users make mistakes and or misuse an interface. You say "which is not amenable to UB if due diligence is taken", the "if due diligence is taken" part is already incompatible with that perspective. Even assuming only types following C++ best practices, just sorting numbers may inadvertently also lead to UB in otherwise sound code. The example I provide here would conceptually be the exact same in C++. A memory arena with a destructor based on indices. Of course one can argue that the code is wrong or badly constructed, and that's exactly the point. The analysis asks what happens if users make mistakes, which they invariably do given enough code. Me opening a bug for it should be indicative of my opinion that this is a bug in the implementation, no matter Rust or C++. One could also argue that users should not be throwing exceptions in the comparison function. But ensuring the absence of transitive unmarked exceptions reliably over time as code evolves is exceptionally difficult. In my opinion giving people a language with unwinding and then applying arbitrary rules, they lack any language level tooling to follow, like "but don't use it here" is an untenable position for users.

I've created a PR #17, please let me know if that addresses your concerns.

@dureuill
Copy link
Contributor Author

dureuill commented Oct 6, 2023

Hello,

Thank you for the detailed and rapid response, I appreciate you dedicating time to the issue I raised.

As a full time Rust developer that switched from C++ out of concern for the issues created by the lack of memory safety of the latter language, rest assured that I fully understand what you mean when you speak of the importance of handling developer errors gracefully. In short, "make the API hard to misuse", which I believe is an important tenet of Rust.

That being said, reusing the "exception safety" terminology has the unfortunate side-effect of leading people already aware of that terminology to believe that the article claims that some of the C++ implementations of the std provide no exception safety whatsoever. People in this category that don't simultaneously have a good knowledge of the guarantees provided by the C++ std may disseminate that claim that is stronger than what the article intended. People that know of both the exception safety terminology and the guarantees of the std on the other hand, might dismiss the article on that basis, which would be a shame because it contains good information and is the result, as you said, of a lot of work.

I took a look at the PR you opened, and while it makes it clearer that you're not using the common definition of exception safety, I think this obscures a little whether the individual algorithms that fail this property suffer from item duplications (insta UB in C++) or "merely" from moved out items (surprising but recoverable).

I agree that in spirit, the lack of this form of exception safety is already an issue that should be mentioned, but I also think it should definitely be separated from the lack of the other form of exception safety.

There exists accepted terminology in C++ to differentiate between forms of exception safety:

  • "lack of exception safety": for code that invoke undefined behaviour in the presence of an exception
  • "basic exception safety": for code that does not invoke undefined behaviour, does not create memory leaks, and leaves all inputs in a valid, but unspecified state ("moved out" objects are in at least a valid, but unspecified state). The std::sort implementations fall in this bucket.
  • "strong exception safety": for code that has basic + leaves all inputs in their original state. This is the exception safety you're discussing.

Here's a source explaining the terminology: https://arne-mertz.de/2015/12/levels-of-exception-safety/, it is also featured on the wikipedia article for exception safety.

So I feel that the shortest path to fully address my concern would be to:

  1. Update the text on exception safety to clarify that you're considering strong exception safety, that basic exception safety also exists (the part about unique ptrs being duplicated occurs for code that don't uphold at least basic exception safety) and why.
  2. Update the table to specify "strong exception safety"

Optionally, I would love knowing which algorithms in the list uphold "no exception safety", vs "basic exception safety", vs "strong exception safety", but that's additional work to determine which is which, so that's not needed for the purpose of clarifying what is said on exception safety, only for the purpose of making the article even more informative.

Thank you for your first PR trying to address my concern, I wrote this message in the hope of fully clarifying the matter.


PS:

In my opinion giving people a language with unwinding and then applying arbitrary rules, they lack any language level tooling to follow, like "but don't use it here" is an untenable position for users.

Fully agreed, but I think the argument that strong exception safety is needed, actually (especially in a language that is not memory safe, ignoring UnwindSafety in Rust is less of a concern and we can in general be satisfied with regular "panic safety"), will be better articulated in its own piece, although the arena example you linked is definitely one piece of evidence.

The fact that strong exception safety does not impede performance for sorting compared with basic exception safety is a key take-away of your article that could be used in such a piece (although I'm unsure if it can be generalized to all situations).

@Voultapher
Copy link
Owner

Voultapher commented Oct 7, 2023

I agree with the points you make. Especially re-using existing terminology to avoid confusion. I'm open to split the table into basic and strong exception safety. I'm not entirely sure how to test the sort implementations that don't support move only types. Maybe a temporary change to the FFIStringCPP type that is only used for this analysis, because a functional one would be un-fair in the comparison benchmarks in my eyes, as it would force deep copies. Although that is a real consideration users would need to be aware of, that for example ips4o sorting std::string will create deep copies as it goes along. Alternatively one could add both kinds of strings, one with POD/string_view semantics and one with owning semantics. Alas doing so adds a lot of code I'm not too keen on supporting. Another question that arises when splitting into basic and strong exception safety, how does that affect the second kind of observation safety?

@dureuill
Copy link
Contributor Author

dureuill commented Oct 7, 2023

Although that is a real consideration users would need to be aware of, that for example ips4o sorting std::string will create deep copies as it goes along.

Yes I think it would make for an interesting footnote or even column in the table.

Another question that arises when splitting into basic and strong exception safety, how does that affect the second kind of observation safety?

Re-reading the section on the second kind of observation safety, I'm not 100% sure how it would be defined in the case where strong exception safety is missing. I see two possibilities:

  1. take into account that some elements are moved, and apply observation safety only to elements that were in the original input. In that case my hunch is that the C++ std implementations would meet that requirement. Or,
  2. forbid "moved out" elements for this property. Then the C++ std implementations, that don't have strong exception safety, fail these trivially as before

Actually, your questions raise a good point on the definition of strong exception safety I shared in my previous message.

Strong exception safety in C++ is often understood as "transactional semantics", where a failure due to an exception restores the object to its original state. For the input, it would mean that the original order would have been preserved and the elements would have been untouched by side effects of the comparison function. That's even stronger that the guarantee you've been using, that all the original items are still in the container, but now in unspecified order and with the side-effects of the comparison functions applied to the items that were compared. So in this hyper strong definition of exception safety, the one you've been using is merely a special case of "basic exception safety" (where the input is in an arbitrary state; here, a specific one).

I still think your definition have value because "have all the original items in an unspecified order" is what I would intuitively expect of an interrupted sort routine. I'm just unsure how we should label it now...

Ack, are we back to the drawing board?

@Voultapher
Copy link
Owner

Voultapher commented Oct 7, 2023

That's even stronger that the guarantee you've been using

I'm not sure that's true. You want the side-effects applied, otherwise you get direct UB emilk/drop-merge-sort#23.

To add a bit onto that, at least from a Rust perspective that's the way it must be, given in the linked code there is zero unsafe which means the implementation must make the modification visible. For C++ it's less clear what the semantics should be.

@Voultapher Voultapher self-assigned this Oct 22, 2023
@Voultapher
Copy link
Owner

Voultapher commented Oct 22, 2023

@dureuill as a short update. I've asked Klaus Iglberger for his opinion on the topic and am still waiting for feedback from him. I'd appreciate ideas from your side how to address the issue in the writeup. The original suggestion seems problematic to me in light of the ongoing discussion.

@dureuill
Copy link
Contributor Author

Ah apologies @Voultapher, I ran out of time to work on this. I'll try to suggest something during the week. I think we can both be more explicit on what guarantees we consider and why, and keep the additions constrained.

@Voultapher
Copy link
Owner

In the absence of suggestions, I took it upon myself to update the section on exception safety. Feel free to open a PR if you disagree with the current wording.

@dureuill
Copy link
Contributor Author

dureuill commented Nov 5, 2023

Hello,

Apologies for leaving you hanging there. I put in the time to update the section in a way that fully resolves my concern in #24

Please feel free to merge it or ask for modifications.
Please do as you prefer 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants