Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miscompilation on release profile when std::ptr::read is used to cast byte primitive to some tuples in unreachable code paths #127286

Closed
zheland opened this issue Jul 3, 2024 · 4 comments · Fixed by #127364
Assignees
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@zheland
Copy link

zheland commented Jul 3, 2024

std::ptr::read which is used to cast u8, i8 or bool to two-element tuples with u8, i8, u16 or i16 types leads to incorrect code generation when located in unreachable code paths (and not supposed to be called).

Replacing std::ptr::read to transmute_copy fixes the issue. Making conditions more simple can also fix the issue. Building with debug profile fixes the issue. Regressed since 1.70.0 (see Affected Rust versions section below).

This issue can lead to UB in creates which use std::ptr::read with generics for runtime casting/specialization.

Reproducible example

fn main() {
    assert!(core::any::TypeId::of::<u8>() != core::any::TypeId::of::<(u8, u8)>()); // OK
    assert!(!can_cast_u8_to_tuple_u8_u8()); // FAIL
}

// Optimized to `movb $1, %al; retq;` on release profile.
// Optimized to `xorl %eax, %eax; retq;` if `transmute_copy` is used instead.
pub fn can_cast_u8_to_tuple_u8_u8() -> bool {
    fn cast_u8_to_tuple_u8_u8(value: u8) -> Result<(u8, u8), u8> {
        if core::any::TypeId::of::<u8>() == core::any::TypeId::of::<(u8, u8)>() {
            Ok(unsafe { std::ptr::read(&value as *const u8 as *const (u8, u8)) })
        } else {
            Err(value)
        }
    }

    cast_u8_to_tuple_u8_u8(123_u8).is_ok()
}

Started with: cargo run --release
Expected: no output
Result: assertion failed

Reproducible example variations

False conditions with which the issue is reproduced (failure):

  • if core::any::TypeId::of::<u8>() == core::any::TypeId::of::<(u8, u8)>() {,
  • if core::any::TypeId::of::<u8>() == core::any::TypeId::of::<String>() {,
  • if core::any::type_name::<u8>() == core::any::type_name::<u64>() {,
  • if core::hint::black_box(false) { (but generate less optimized assembly),

False conditions with which the issue is NOT reproduced (ok):

  • if false {,
  • if core::convert::identity(false) {,

Сasts (actually unreachable) with which the issue is reproduced (failure):

  • Ok(unsafe { std::ptr::read(&value as *const u8 as *const (u8, u8)) })
  • Ok(unsafe { std::ptr::read(&value as *const u8 as *const (i8, u16)) })

Casts (actually unreachable) with which the issue is NOT reproduced (ok):

  • Ok(unsafe { std::ptr::read(&value as *const u8 as *const (u8, u8, u8)) })
  • Ok(unsafe { std::ptr::read(&value as *const u8 as *const u16) })
  • Ok(unsafe { core::mem::transmute_copy::<u8, (u8, u8)>(&value) })

Cloned implementation of transmute_copy with assert! removed will also lead
to the error, i.e. it's the assert that helps to avoid the issue when using transmute_copy:

#[inline]
#[must_use]
#[track_caller]
pub const unsafe fn affected_variant_of_transmute_copy<Src, Dst>(src: &Src) -> Dst {
    // assert!(
    //     std::mem::size_of::<Src>() >= std::mem::size_of::<Dst>(),
    //     "cannot transmute_copy if Dst is larger than Src"
    // );

    if std::mem::align_of::<Dst>() > std::mem::align_of::<Src>() {
        unsafe { std::ptr::read_unaligned(src as *const Src as *const Dst) }
    } else {
        unsafe { std::ptr::read(src as *const Src as *const Dst) }
    }
}

Affected Rust versions

  • 1.68.0 - not reproducible.
  • 1.69.0 - not reproducible.
  • 1.70.0 - reproducible.
  • 1.71.0 - reproducible.
  • Stable (rustc 1.79.0 (129f3b996 2024-06-10)) - reproducible.
  • Beta (rustc 1.80.0-beta.4 (64a1fe671 2024-06-21))- reproducible.
  • Nightly (rustc 1.81.0-nightly (6292b2af6 2024-07-02)) - reproducible.
$ rustc --version --verbose
rustc 1.79.0 (129f3b996 2024-06-10)
binary: rustc
commit-hash: 129f3b9964af4d4a709d1383930ade12dfe7c081
commit-date: 2024-06-10
host: x86_64-unknown-linux-gnu
release: 1.79.0
LLVM version: 18.1.7

Reproducible in Rust playground as well: https://play.rust-lang.org/?version=stable&mode=release&edition=2021&gist=7a2847de3ac47d87a157a8644e7fd668

Quick bisection between 1.69.0 and 1.70.0 with MRE:

$ cargo bisect-rustc --start 2023-03-03 --end=2023-04-14 -- run --release
...
RESULT: nightly-2023-03-03, ===> Successfully compiled
...
RESULT: nightly-2023-04-14, ===> Compile error
...
RESULT: nightly-2023-03-15, ===> Successfully compiled
...
RESULT: nightly-2023-03-16, ===> Compile error
...
********************************************************************************
Regression in nightly-2023-03-16
********************************************************************************
...
looking for regression commit between 2023-03-15 and 2023-03-16
fetching (via remote github) commits from max(1716932743a7b3705cbf0c34db0c4e070ed1930d, 2023-03-13) to ab654863c3d50482f260cf862647f1fe0ff5e010
ending github query because we found starting sha: 1716932743a7b3705cbf0c34db0c4e070ed1930d
get_commits_between returning commits, len: 6
  commit[0] 2023-03-14: Auto merge of #109130 - matthiaskrgr:rollup-dm3jza6, r=matthiaskrgr
  commit[1] 2023-03-15: Auto merge of #107376 - aliemjay:remove-givens, r=lcnr
  commit[2] 2023-03-15: Auto merge of #109089 - compiler-errors:opt_rpitit_info-follow-up, r=spastorino
  commit[3] 2023-03-15: Auto merge of #109035 - scottmcm:ptr-read-should-know-undef, r=WaffleLapkin,JakobDegen
  commit[4] 2023-03-15: Auto merge of #109164 - Dylan-DPC:rollup-0bwxwos, r=Dylan-DPC
  commit[5] 2023-03-15: Auto merge of #109169 - bjorn3:sync_cg_clif-2023-03-15, r=bjorn3
ERROR: no CI builds available between 1716932743a7b3705cbf0c34db0c4e070ed1930d and ab654863c3d50482f260cf862647f1fe0ff5e010 within last 167 days

Known affected crates

castaway

Casting some byte primitives to some two-element tuples unexpectedly succeeds on release builds and definitely leads to UB.
These assertions will success on debug builds, but will fail on release builds:

assert!(castaway::cast!(12_u8, (u8, u8)).is_err()); // assertion failed
assert!(castaway::cast!(23_u8, (i32, char)).is_err()); // Illegal instruction (core dumped)
assert!(castaway::cast!(34_i8, (usize, f64)).is_err()); // Illegal instruction (core dumped)
assert!(castaway::cast!(false, (bool, u16)).is_err()); // Illegal instruction (core dumped)

Castaway crate uses std::ptr::read to transmute between types when type ids are matched.

The source and target cast types which leads to issue

Unexpected cast success is reproduced for casts from T1 to (T2, T3)
where

  • T1 is u8, i8 or bool,
  • T2 is u8, i8, u16 or i16,
  • T3 is u8, i8, u16 or i16,

or where

  • T1 is u16, i16,
  • T2 is u8, i8, u16 or i16,
  • T3 is u8, i8, u16 or i16 but has different byte size than T2

Check code:

fn main() {
    test1();
}

fn test1() {
    macro_rules! repeat { ( $( $ty:ty ),* ) => { $( test2::<$ty>(); )* }; }
    repeat!(u8, u16, u32, u64, u128, usize, i8, i16, i32, i64, i128, isize, f32, f64, bool, char);
}

fn test2<T1: 'static + Default>() {
    macro_rules! repeat { ( $( $ty:ty ),* ) => { $( test3::<T1, $ty>(); )* }; }
    // Adding `bool` or `char` here will result in `Illegal instruction (core dumped)` runtime error.
    repeat!(u8, u16, u32, u64, u128, usize, i8, i16, i32, i64, i128, isize, f32, f64);
}

fn test3<T1: 'static + Default, T2>() {
    macro_rules! repeat { ( $( $ty:ty ),* ) => { $( test4::<T1, T2, $ty>(); )* }; }
    repeat!(u8, u16, u32, u64, u128, usize, i8, i16, i32, i64, i128, isize, f32, f64, bool, char);
}

fn test4<T1: 'static + Default, T2, T3>() {
    if can_cast_t1_to_tuple_t2_t3::<T1, T2, T3>() {
        println!(
            "unexpected cast success from {:5} to ({:5}, {:5})",
            core::any::type_name::<T1>(),
            core::any::type_name::<T2>(),
            core::any::type_name::<T3>()
        );
    }
}

#[inline(never)]
fn can_cast_t1_to_tuple_t2_t3<T1: Default, T2, T3>() -> bool {
    fn cast<T1: Default, T2, T3>(value: T1) -> Result<(T2, T3), T1> {
        if core::any::TypeId::of::<u8>() == core::any::TypeId::of::<(u8, u8)>() {
            Ok(unsafe { std::ptr::read(&value as *const T1 as *const (T2, T3)) })
        } else {
            Err(value)
        }
    }

    cast::<T1, T2, T3>(T1::default()).is_ok()
}
@zheland zheland added the C-bug Category: This is a bug. label Jul 3, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jul 3, 2024
@nikic nikic added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Jul 3, 2024
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jul 3, 2024
@nikic
Copy link
Contributor

nikic commented Jul 3, 2024

From a brief glance, this looks like a SROA bug to me: https://llvm.godbolt.org/z/o1Poz5vzn We should not be folding to poison.

(Alive2 doesn't flag this, but I believe that's a bug in alive2. Filed AliveToolkit/alive2#1067.)

@scottmcm
Copy link
Member

scottmcm commented Jul 3, 2024

If this showed up in 1.70, it was probably exposed by #109035 since that changed ptr::read from memcpy to a normal load.

(EDIT: doh, you'd already bisected to the rollup with that PR.)

@nikic
Copy link
Contributor

nikic commented Jul 4, 2024

Upstream issue: llvm/llvm-project#97702

@apiraino
Copy link
Contributor

apiraino commented Jul 4, 2024

WG-prioritization assigning priority (Zulip discussion).

@rustbot label -I-prioritize +P-high

@rustbot rustbot added P-high High priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jul 4, 2024
@nikic nikic added the llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes label Jul 4, 2024
@nikic nikic self-assigned this Jul 4, 2024
sagebind pushed a commit to sagebind/castaway that referenced this issue Jul 4, 2024
…#25)

Currently some casts from byte primitives to two element tuple lead to
miscompilation on **release** builds on Rust `>=1.70.0`:
- `castaway::cast!(123_u8, (u8, u8))` unexpectedly returns `Ok(...)`
that leads to **UB**.
- `castaway::cast!(false, (bool, u16))` leads to `SIGILL: illegal
instruction` runtime error.

Upstream issues:
- Rust: rust-lang/rust#127286
- LLVM: llvm/llvm-project#97702

I suggest considering adding a safe "workaround" to fix the issue in
this crate without having to wait for the upstream fixes. This way we
will have this fixed in older Rust versions as well.

This PR adds size eq `assert` to `transmute_unchecked`. This workaround
was found while preparing an MRE for an upstream issue.

Checked locally with `cargo test --release` for Rust `1.38`, `1.68.0`,
`1.69.0`, `1.70.0`, `1.71.0`, `1.72.0`, `stable`, `beta`, `nightly`.
Generated assembly for other tests cases for the release build seems the
same (checks and casts are optimized away).

Btw: it might also be a good idea to run tests in `--release` mode as
well since the crate relies heavily on optimizing the casts to
zero-cost.
@bors bors closed this as completed in 5c08cc7 Jul 6, 2024
bors-ferrocene bot added a commit to ferrocene/ferrocene that referenced this issue Aug 13, 2024
838: Add tests for backports r=skade a=Veykril

- `tests/ui/ferrocene/llvm/no-segfault.rs` tests [issue#127260](rust-lang/rust#127260) which was fixed by the backported [PR#127364](rust-lang/rust#127364) 
- `tests/ui/ferrocene/consts/storage-live-on-live.rs` tests [issue#119366](rust-lang/rust#119366) which was fixed by the backported [PR#126154](rust-lang/rust#126154)
- `tests/codegen/ferrocene/miscompile_127286.rs` tests [issue#127286](rust-lang/rust#127286) which was fixed by the backported [PR#127364](rust-lang/rust#127364) 

Co-authored-by: Lukas Wirth <[email protected]>
bors-ferrocene bot added a commit to ferrocene/ferrocene that referenced this issue Aug 13, 2024
838: Add tests for backports r=skade a=Veykril

- `tests/ui/ferrocene/llvm/no-segfault.rs` tests [issue#127260](rust-lang/rust#127260) which was fixed by the backported [PR#127364](rust-lang/rust#127364) 
- `tests/ui/ferrocene/consts/storage-live-on-live.rs` tests [issue#119366](rust-lang/rust#119366) which was fixed by the backported [PR#126154](rust-lang/rust#126154)
- `tests/codegen/ferrocene/miscompile_127286.rs` tests [issue#127286](rust-lang/rust#127286) which was fixed by the backported [PR#127364](rust-lang/rust#127364) 

Co-authored-by: Lukas Wirth <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness llvm-fixed-upstream Issue expected to be fixed by the next major LLVM upgrade, or backported fixes P-high High priority T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants