Suggest new wording for section on exception safety #24

Merged (2 commits) on Nov 6, 2023
17 changes: 14 additions & 3 deletions writeup/sort_safety/text.md
@@ -80,7 +80,7 @@ If the sort operation is understood as a series of swaps, C, D, E and F can all

### Exception safety

C++ and Rust are both languages with scope-based destructors (RAII) and stack unwinding. Together they provide a powerful abstraction for manual memory management. At the same time, they can make implementing generic code more complex. Every point in the sort implementation that calls the user-provided comparison function must assume that the call may return via an exception in C++ or a panic in Rust.
Exception safety encompasses the various guarantees one can make about the program state in the presence of exceptions. In languages with stack-unwinding-based exceptions like C++ and Rust, the concern for sort implementations is the state in which the input is left when the user-provided comparison function raises an exception.

C++:

@@ -106,7 +106,18 @@ data.sort_by(|a, b| {
});
```

In practice, a lack of exception safety manifests itself in variants C and/or D described in the section about Ord safety. In C++, types are considered either trivially copyable by the type system or not. For example, `uint64_t` is and `std::string` isn't. In essence, the question is whether copying the bits of the type suffices to produce a meaningful new value, or whether a user-defined copy or move operation must be called. Some of the tested C++ implementations use this property to specialize their implementations, and the logic changes accordingly. Assuming the user is using types that follow C++ best practices, this helps avoid direct UB. For example, the tested `std::sort` implementations leave `std::string` values in a moved-from state, which is safe to destroy, avoiding a potential double-free. This analysis does not consider this enough to mark an implementation as exception safe. The general theme is one of analyzing behavior in the presence of user mistakes, and types that don't follow C++ best practices are a common mistake in C++. In addition, there are situations where users have no alternative but to interact with third-party libraries and/or C code with limited or broken RAII semantics. Even assuming a world filled exclusively with C++ types following best practices, where duplicating integers will not directly lead to UB, duplication can still easily break adjacent assumptions that a sort operation only rearranges elements and does not duplicate them, as shown [here](https://github.com/google/crumsort-rs/issues/1). The exception safety tested for fits neither the concept of basic nor strong exception safety. Leaving the input in some unspecified but safe-to-destroy state, as required by basic exception safety, can be surprising and lead to adjacent UB. Returning the input to the original state, as required by strong exception safety, fails to account for mutation during the comparison that must be observed, as shown [here](https://github.com/emilk/drop-merge-sort/issues/23).
The weakest guarantee is that an exception does not directly lead to undefined behavior, which is often referred to as "basic exception safety". The strongest kind of exception safety, known as "strong exception safety", models transactional semantics. With strong exception safety, the input must be returned to the state it was in before the operation was started.

For the purpose of this analysis, the guarantee being analyzed will be denoted as "intuitive exception safety". It encompasses the behavior assumed to be the intuitive result of interrupting a routine that rearranges elements. Once stack unwinding reaches user code, the input may be partially reordered, as if the sorting procedure had been interrupted. An implementation does not provide intuitive exception safety if the input after stack unwinding contains new duplicates or values that were not previously seen as part of the input. In the case of C++, this includes leaving some elements in the "moved out" state.
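The property described above can be checked mechanically. The following is a minimal sketch, not the test harness used for this analysis: it interrupts the standard library's `sort_by` with a panicking comparison function and then checks that the slice still holds the original values. The helper names `is_permutation` and `sort_with_panicking_cmp` are hypothetical and exist only for illustration.

```rust
use std::panic::{self, AssertUnwindSafe};

/// Returns true if `after` holds exactly the same multiset of values as `before`.
/// (Hypothetical helper, named here for illustration only.)
fn is_permutation(before: &[u64], after: &[u64]) -> bool {
    let mut a = before.to_vec();
    let mut b = after.to_vec();
    a.sort_unstable();
    b.sort_unstable();
    a == b
}

/// Sorts `data` with a comparison function that panics on its `panic_at`-th call.
/// Returns true if the sort was interrupted by the panic.
/// (Hypothetical helper, named here for illustration only.)
fn sort_with_panicking_cmp(data: &mut [u64], panic_at: usize) -> bool {
    let mut calls = 0usize;
    let result = panic::catch_unwind(AssertUnwindSafe(|| {
        data.sort_by(|a, b| {
            calls += 1;
            if calls == panic_at {
                panic!("simulated comparison failure");
            }
            a.cmp(b)
        });
    }));
    result.is_err()
}
```

A caller would clone the input, run `sort_with_panicking_cmp` on it, and treat `is_permutation(&original, &data) == false` as a violation of intuitive exception safety. For `Copy` integers this check catches duplicates and lost values; it does not model C++ moved-out states.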

Failure to uphold intuitive exception safety can indirectly lead to undefined behavior in user code by violating the invariants enforced in the rest of the code. Examples include:

- The user is sorting a `std::vector` of move-only types like `std::unique_ptr`. If the user's code guarantees that, by construction, the pointers in that vector are not null, then it is permissible in user code to rely on this invariant and omit null checks. However, this is fraught in the presence of a sort implementation that doesn't uphold the intuitive exception safety property. The input may unexpectedly contain null pointers after the comparison function throws.
- The user is sorting a container of numbers that serve as indices in a graph-like structure. If the user's code guarantees that, by construction, the indices in the container are never repeated, then it is permissible in user code to rely on this invariant and delete the node associated with each index without checking if it was previously deleted. However, this is fraught in the presence of a sort implementation that doesn't uphold the intuitive exception safety property. The input may unexpectedly contain duplicated indices after the comparison function throws. This can also affect Rust implementations, even ones implemented with zero lines of `unsafe`, as shown [here](https://github.com/google/crumsort-rs/issues/1).
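To illustrate how duplicated indices can arise without any `unsafe`, here is a deliberately naive sketch. It is not taken from any of the tested implementations; `naive_insertion_sort` and `sort_interrupted_at` are hypothetical names. The sort shifts `Copy` values by assignment, so the slice briefly holds a duplicate while the comparison function is being called.

```rust
use std::cmp::Ordering;
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical insertion sort in safe Rust for `Copy` values.
/// While an element is being shifted, two adjacent slots hold the same value.
/// A panic in `cmp` at that moment leaves the duplicate in the input.
fn naive_insertion_sort<F>(data: &mut [u32], mut cmp: F)
where
    F: FnMut(&u32, &u32) -> Ordering,
{
    for i in 1..data.len() {
        let tmp = data[i];
        let mut j = i;
        while j > 0 && cmp(&tmp, &data[j - 1]) == Ordering::Less {
            data[j] = data[j - 1]; // slots j - 1 and j now hold the same value
            j -= 1;
        }
        data[j] = tmp; // never reached for this element if `cmp` panicked
    }
}

/// Runs the naive sort with a comparison function that panics on its
/// `panic_at`-th call. Returns true if the sort was interrupted.
fn sort_interrupted_at(data: &mut [u32], panic_at: usize) -> bool {
    let mut calls = 0usize;
    panic::catch_unwind(AssertUnwindSafe(|| {
        naive_insertion_sort(data, |a, b| {
            calls += 1;
            if calls == panic_at {
                panic!("simulated comparison failure");
            }
            a.cmp(b)
        });
    }))
    .is_err()
}
```

With the input `[2, 3, 1]` and a panic on the third comparison, the slice is left as `[2, 3, 3]`: index `1` is lost and index `3` appears twice, which is exactly the situation in which "delete each index once" logic in user code breaks.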

Failure to uphold intuitive exception safety might also directly lead to UB in C++ for types that, by mistake, don't follow C++ best practices for modeling types. This includes types from third-party libraries that the user has no control over, as well as interaction with owning C library types without appropriate RAII wrappers.

Intuitive exception safety differs from strong exception safety in a couple of key ways. Firstly, it is impossible to roll back arbitrary side effects caused by the user-provided comparison function. In Rust, doing so would even lead to UB in purely safe code in combination with interior mutability, as shown [here](https://github.com/emilk/drop-merge-sort/issues/23). In addition, restoring the original state would require auxiliary memory to either track the original order of elements or to store a copy of them, which is impossible to acquire in `no_std`/freestanding environments.
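For contrast, a transactional sort in the spirit of strong exception safety could be sketched as follows. This is an assumption-laden illustration, not something any tested implementation does; `sort_transactional` is a hypothetical name. It makes the auxiliary allocation visible and requires `T: Clone`.

```rust
use std::cmp::Ordering;

/// Hypothetical "strong exception safety" wrapper: sort a copy, commit on success.
/// Requires a heap allocation of the same size as the input and `T: Clone`.
fn sort_transactional<T, F>(data: &mut [T], cmp: F)
where
    T: Clone,
    F: FnMut(&T, &T) -> Ordering,
{
    // Auxiliary memory: a full copy of the input.
    let mut scratch: Vec<T> = data.to_vec();
    // If `cmp` panics here, `scratch` is dropped and `data` is untouched.
    scratch.sort_by(cmp);
    // Commit step, reached only when sorting completed without a panic.
    data.clone_from_slice(&scratch);
}
```

Note that the comparison function only ever sees the copies in `scratch`. Any modification it makes through interior mutability before a panic is discarded together with `scratch`, and side effects outside the elements are not rolled back at all, which matches the limitations described above.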

### Observation safety

@@ -201,7 +212,7 @@ Properties:
- **Functional**: Does the implementation successfully pass the test suite of different input patterns and supported types?
- **Generic**: Does the implementation support arbitrary user-defined types?
- **Ord safety**: What happens if the user-defined type or comparison function does not implement a strict weak ordering, e.g. a C++ comparison function such as `[](const auto& a, const auto& b) { return a.x <= b.x; }`? O == unspecified order but original elements, E == exception/panic and unspecified order but original elements, U == Undefined Behavior, usually out-of-bounds read and write, D == unspecified order with duplicates. Only O and E are safe.
- **Exception safety**: What happens if the user-provided comparison function throws an exception/panic? ✅ means it retains the original input set in an unspecified order, 🚫 means it may have duplicated elements in the input.
- **Exception safety**: What happens if the user-provided comparison function throws an exception/panic? ✅ means it retains the original input set in an unspecified order, upholding the intuitive exception safety property outlined above. 🚫 means it may have duplicated or moved-out elements in the input.
- **Observable comp**: If the type has interior mutability, will every modification caused by calling the user-defined comparison function with const/shared references be visible in the input after the sort function returns (1) normally or (2) via panic? If exception safety is not given, it is practically impossible to achieve 2.
- **Miri**: Does the test-suite pass if run under [Miri](https://github.com/rust-lang/Miri)? S: using the Stacked Borrows aliasing model. T: using the Tree Borrows aliasing model.
