accounts-db: test_hash_stored_account: Avoid UB. #33083
Can you explain why there is UB in the first place?
I thought that producing a value from bytes for types with Rust representation is undefined behavior, because it writes into padding. So, it only leaves the undefined behavior of construction of an invalid
Is the UB on the bool what causes us to have different hashes for debug vs release? I think I'm fine with this PR if we can remove that annoyance as well. In append_vec we sanitize
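To illustrate what sanitizing a flag byte can look like, here is a minimal sketch. It is not the actual append_vec code; the helper name `executable_from_byte` is hypothetical. The idea is to inspect the raw byte first and only then produce a `bool`, so that an invalid bit pattern is rejected rather than reinterpreted.

```rust
/// Hypothetical helper (not from the codebase): interpret a raw byte as a
/// `bool` only if it holds a valid `bool` bit pattern (0 or 1).
/// Any other value is rejected instead of being reinterpreted.
fn executable_from_byte(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

fn main() {
    // Valid bit patterns map to the corresponding bool.
    assert_eq!(executable_from_byte(0), Some(false));
    assert_eq!(executable_from_byte(1), Some(true));
    // Anything else is reported as invalid rather than turned into a bool.
    assert_eq!(executable_from_byte(2), None);
    println!("byte 2 -> {:?}", executable_from_byte(2));
}
```

Because the check happens on a `u8`, no invalid `bool` value is ever constructed.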
I think so. After At first, I thought it might be that something deeper does not have a proper layout specifier. I've extracted all the relevant types, and you can run it in the playground now: It shows identical layout between debug and release. Considering that when I remove

    if executable {
        hasher.update(&[1u8; 1]);
    } else {
        hasher.update(&[0u8; 1]);
    }

This can be written differently by the debug and release compilation passes.

Here goes a detailed explanation of the UB for My understanding is that this UB is there so that the compiler could write assembly, always assuming that a bool is either 0 or 1. For example

    pub fn f(three_not_two: bool, mut val: u64) -> u64 {
        if three_not_two {
            val += 3;
        } else {
            val += 2;
        }
        val
    }

with

    example::f:
        mov eax, edi
        add rax, rsi
        add rax, 2
        ret

https://godbolt.org/z/4h3eerG11

If this rule is broken, combined with optimizations, it can lead to very strange results.
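The effect of that assumption can be sketched in safe Rust. The function `asm_model` below is a hypothetical, simplified model of the listed assembly that takes the flag as a raw byte. It matches `f` for the valid values 0 and 1, and shows what the same instructions would compute if the register held 2.

```rust
/// The function from the comment above.
pub fn f(three_not_two: bool, mut val: u64) -> u64 {
    if three_not_two {
        val += 3;
    } else {
        val += 2;
    }
    val
}

/// Hypothetical, simplified model of the listed assembly: the flag is added
/// to `val` as a number, then 2 is added. No branch is involved.
pub fn asm_model(raw_flag: u8, val: u64) -> u64 {
    (raw_flag as u64).wrapping_add(val).wrapping_add(2)
}

fn main() {
    for val in [0u64, 10, 1_000] {
        // For valid bool bit patterns the model agrees with `f`.
        assert_eq!(asm_model(1, val), f(true, val));
        assert_eq!(asm_model(0, val), f(false, val));
    }
    // A raw flag of 2 is not a valid bool. The model returns val + 4,
    // which matches neither branch of `f`.
    println!("{}", asm_model(2, 10));
}
```

The model only works with a `u8`, so the sketch itself never creates an invalid `bool`.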
In my mind, this is exactly the reason not to disable compiler checks with
Rebased and added the lost
lgtm. Previously the hashes relied on UB because of the executable bool. Using manually specified bytes for the non-bool fields and setting the bool explicitly to false addresses this. No more need for different hashes between debug and release.
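A rough sketch of that approach, under stated assumptions: the struct `Meta`, its field names other than `executable`, and the helper `hash_input` are hypothetical stand-ins and not the types used in the PR. The test value is built field by field instead of being transmuted from raw bytes, and the bool is turned into an explicit byte before it reaches the hasher.

```rust
/// Hypothetical, simplified stand-in for the account metadata in the test.
struct Meta {
    lamports: u64,
    rent_epoch: u64,
    owner: [u8; 32],
    executable: bool,
}

/// Hypothetical helper: collects the bytes that would be fed to a hasher.
/// The bool is written as an explicit 0 or 1 byte.
fn hash_input(meta: &Meta) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + 8 + 32 + 1);
    out.extend_from_slice(&meta.lamports.to_le_bytes());
    out.extend_from_slice(&meta.rent_epoch.to_le_bytes());
    out.extend_from_slice(&meta.owner);
    out.push(if meta.executable { 1u8 } else { 0u8 });
    out
}

fn main() {
    // Every field is given explicitly; no transmute is needed, and the bool
    // is always a valid value.
    let meta = Meta {
        lamports: 0x0807_0605_0403_0201,
        rent_epoch: 0x1817_1615_1413_1211,
        owner: [0x22; 32],
        executable: false,
    };
    let bytes = hash_input(&meta);
    assert_eq!(bytes.len(), 49);
    assert_eq!(bytes[48], 0);
    println!("{} bytes of hash input", bytes.len());
}
```

Since every input byte is fully determined by the source code, the bytes fed to the hasher should not depend on the build profile.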
oh, thanks for the good write-up. yeah, that aligns with my understanding.
lgtm with nits; thanks for writing a proper test, rebasing, and providing the write-ups to convince the team.
I appreciate your willingness to remove `unsafe`s or UBs. :)
`unsafe { transmute }` in the test, as written, is undefined behavior. And, I think, we actually see the compiler generating different code for it as we are changing the generated hashes when we upgrade the compiler. It happened during upgrade to 1.67.1: #29947 And the hashes changed again in 1.71.
Added underscores and corrected the missing
What
lgtm
this one: https://rust-analyzer.github.io/thisweek/2023/07/10/changelog-189.html#new-features and lightly touched in our learning-corner as well: https://discord.com/channels/428295358100013066/977244255212937306/1146863043578445967
Same as #33082, but numbers are sequential.