-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ABA-problem with pointer provenance in lockless queues #121950
Comments
Since this is mac-specific why not use a DWCAS? Afaik they don't have any hardware that doesn't have those. |
It's not mac-specific, though. Also, this is a provenance-only problem, so I'd like to avoid anything that makes the machine code less efficient, because in terms of what is actually generated, the current implementation is perfectly bug-free. It "just" violates the memory model. |
And using the return value from the CAS (the |
No, because the old pointer is stored inside the node, and we can't retroactively update the pointer, because a traversal operation could access the old one immediately after the |
I don't think LLVM implements expose semantics in any sense, so I suspect doing the expose dance here wouldn't change LLVM's view of the code. (Happy to be corrected on this) |
LLVM's pointer aliasing rules mention that the pointer resulting from |
That is assuming that we don't have optimizations that make full use of the provenance model. Or have you inspected the generated assembly to be sure? This looks extremely related to rust-lang/unsafe-code-guidelines#480 to me. I wonder if the CAS semantics I proposed here would help. |
I inspected the assembly of a simplified version (I removed the |
The age of an implementation is not an indicator of its soundness with respect to provenance rules. If anything, perhaps the correlation runs the other direction. I'm willing to review codegen tests for suspicious things like this, if someone can find a way to write them that's both effective and doesn't make them break often. I don't think we usually check in codegen tests for structures as complex as this. |
WG-prioritization assigning priority (Zulip discussion). @rustbot label -I-prioritize +P-medium |
Just recording here that another possible solution has been sketched, also spelled out here. |
rwlock: disable 'frob' test in Miri on macOS Due to rust-lang#121950, Miri will sometimes complain about this test on macOS. Better disable the test, as otherwise it can fail for unrelated PRs. r? `@joboet`
rwlock: disable 'frob' test in Miri on macOS Due to rust-lang#121950, Miri will sometimes complain about this test on macOS. Better disable the test, as otherwise it can fail for unrelated PRs. r? ``@joboet``
Rollup merge of rust-lang#128640 - RalfJung:rwlock-macos-miri, r=joboet rwlock: disable 'frob' test in Miri on macOS Due to rust-lang#121950, Miri will sometimes complain about this test on macOS. Better disable the test, as otherwise it can fail for unrelated PRs. r? ``@joboet``
ignore an occasionally-failing test in Miri This is like rust-lang#128640, the test [sometimes](rust-lang#133189) fails due to rust-lang#121950.
Rollup merge of rust-lang#133200 - RalfJung:miri-rwlock-test, r=tgross35 ignore an occasionally-failing test in Miri This is like rust-lang#128640, the test [sometimes](rust-lang#133189) fails due to rust-lang#121950.
ignore an occasionally-failing test in Miri This is like rust-lang/rust#128640, the test [sometimes](rust-lang/rust#133189) fails due to rust-lang/rust#121950.
…=tgross35 miri: disable test_downgrade_observe test on macOS Due to rust-lang#121950, this test can fail on Miri. The test is also quite slow on Miri (taking more than 30s) due to the high iteration count (a total of 2000), so let's reduce that a little. Fixes rust-lang#133421
…=tgross35 miri: disable test_downgrade_observe test on macOS Due to rust-lang#121950, this test can fail on Miri. The test is also quite slow on Miri (taking more than 30s) due to the high iteration count (a total of 2000), so let's reduce that a little. Fixes rust-lang#133421
…=tgross35 miri: disable test_downgrade_observe test on macOS Due to rust-lang#121950, this test can fail on Miri. The test is also quite slow on Miri (taking more than 30s) due to the high iteration count (a total of 2000), so let's reduce that a little. Fixes rust-lang#133421
Rollup merge of rust-lang#133435 - RalfJung:test_downgrade_observe, r=tgross35 miri: disable test_downgrade_observe test on macOS Due to rust-lang#121950, this test can fail on Miri. The test is also quite slow on Miri (taking more than 30s) due to the high iteration count (a total of 2000), so let's reduce that a little. Fixes rust-lang#133421
Both lockless queue implementations inside
std
forOnce
andRwLock
suffer from the ABA problem, resulting in unsoundness.In both cases, the enqueue operation roughly looks like the following:
This code is problematic, because it allows the following to occur:
next
pointer of node 2.Now, any traversal operation on the queue that attempts to dereference the
next
pointer of node 2 exhibits UB, as the pointer, while having the right address, only has provenance to node 1, which is no longer valid, but may not access node 3.This is of little practical concern, as it is extremely unlikely that LLVM will perform a problematic optimization; but nevertheless the code is unsound, both under stacked-borrows and LLVM's semantics.
This is the cause of #121626, where this exact situation led to a (correct) UB report from miri.
@rustbot label +T-libs +I-unsound
Possible solutions
The only solution I see at the moment is to use fuzzy provenance and use
from_exposed_provenance
inside the traversal operation to make miri/LLVM guess the right provenance.The text was updated successfully, but these errors were encountered: