Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

accounts-db: test_hash_stored_account: Avoid UB. #33083

Merged
merged 1 commit into from
Sep 6, 2023
Merged

accounts-db: test_hash_stored_account: Avoid UB. #33083

merged 1 commit into from
Sep 6, 2023

Conversation

ilya-bobyr
Copy link
Contributor

unsafe { transmute } in the test, as written, is undefined behavior. And, I think, we actually see the compiler generating different code for it as we are changing the generated hashes when we upgrade the compiler. It happened during upgrade to 1.67.1:

#29947

And the hashes changed again in 1.71.


Same as #33082, but numbers are sequential.

@apfitzge
Copy link
Contributor

Can you explain why there is UB in the first place?
Is the UB coming from the byte of our executable bool not being 0 or 1?

@ilya-bobyr
Copy link
Contributor Author

ilya-bobyr commented Sep 1, 2023

Can you explain why there is UB in the first place? Is the UB coming from the byte of our executable bool not being 0 or 1?

I thought that producing a value from bytes for types with Rust representation is undefined behavior. Because it writes into padding.
While it is based on some valid assumptions, this is an incorrect conclusion.
It is only an undefined behavior to write into padding through a pointer.
And transmute explicitly says that it may or may not populate the padding part - and that is OK.

So, it only leaves the undefined behavior of construction of an invalid bool value.

@apfitzge
Copy link
Contributor

apfitzge commented Sep 1, 2023

Is the UB on the bool what causes us to have different hashes for debug vs release?

I think I'm fine with this PR if we can remove that annoyance as well. In append_vec we sanitize executable to ensure higher 7-bits are 0 on loading, so testing that here is unnecessary. maybe we shouldn't have had a bool in the first place and just had a u8 bit-flags instead so we didn't even have to think about all this weirdness around bool.

@ilya-bobyr
Copy link
Contributor Author

Is the UB on the bool what causes us to have different hashes for debug vs release?

I think so.

After repr(C) was added, the layout should be consistent, otherwise it is no good for ABIs.
It is not something that can be optimized :)

At first, I thought, it might be that something deeper does not have a proper layout specifier.
Manual inspection showed no offenders.
Then I found there is a compiler attribute that makes the compiler print the layout information.

I've extracted all the relevant types, and you can run it in the playground now:

https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=aa9bcb4af338cb621195bbfbc0a33d88

It shows identical layout between debug and release.

Considering that when I remove unsafe, results are consistent between debug and release, I really do not have any explanations, except for the bool.

hash_account_data() actually uses the executable value:

        if executable {
            hasher.update(&[1u8; 1]);
        } else {
            hasher.update(&[0u8; 1]);
        }

This can be written differently by the debug and release compilation passes.
And as we pass in a value that is not 0 or 1, we are probably adding this value directly into the hash, rather than passing in 0 or 1.


Here goes a detailed explanation of the UB for bool.
Just to make sure we are on the same page :)

My understanding is that this UB is there so that the compiler could write assembly, always assuming that a bool is either 0 or 1.
I do not know the exact patters it uses.
But I can imagine a number of cases where a different value would cause all kinds of issues.

For example

pub fn f(three_not_two: bool, mut val: u64) -> u64 {
    if three_not_two {
        val += 3;
    } else {
        val += 2;
    }
    val
}

with -O compiles to

example::f:
        mov     eax, edi
        add     rax, rsi
        add     rax, 2
        ret

https://godbolt.org/z/4h3eerG11

If this rule is broken, combined with optimizations, it can lead to very strange results.
Another strange result can happen, if we use the resulting value as a jump offset.

I think I'm fine with this PR if we can remove that annoyance as well. In append_vec we sanitize executable to ensure higher 7-bits are 0 on loading, so testing that here is unnecessary. maybe we shouldn't have had a bool in the first place and just had a u8 bit-flags instead so we didn't even have to think about all this weirdness around bool.

In my mind, this is exactly the reason not to use disable compiler checks with unsafe.
If there is a way to do it that requires less mental load, in most cases, it will save time for everyone involved.

@ilya-bobyr
Copy link
Contributor Author

Rebased and added the lost 0x6a.

apfitzge
apfitzge previously approved these changes Sep 1, 2023
Copy link
Contributor

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, previously relying on UB for the hashes because of the executable. Use manually specified bytes for non-bools and set the bool explicitly to false addresses this. No more need for different hashes between debug and release

@ryoqun
Copy link
Member

ryoqun commented Sep 3, 2023

Then I found there is a compiler attribute that makes the compiler print the layout information.

oh, #[rustc_layout(debug)] in #![feature(rustc_attrs)] sounds interesting. :) maybe, this is what rust-analyzer is using internally.

Here goes a detailed explanation of the UB for bool.
...

thanks for good write up. yeah, that aligns with my understanding.

ryoqun
ryoqun previously approved these changes Sep 3, 2023
Copy link
Member

@ryoqun ryoqun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with nits; thanks for writing a proper test, rebasing and some write-ups to convince the team.

I appreciate your willingness for removing unsafes or UBs. :)

`unsafe { transmute }` in the test, as written, is undefined behavior.
And, I think, we actually see the compiler generating different code for
it as we are changing the generated hashes when we upgrade the compiler.
It happened during upgrade to 1.67.1:

  #29947

And the hashes changed again in 1.71.
@ilya-bobyr
Copy link
Contributor Author

Added underscores and corrected the missing 6f byte.
Rebased on top of the latest master.

@ilya-bobyr
Copy link
Contributor Author

Then I found there is a compiler attribute that makes the compiler print the layout information.

oh, #[rustc_layout(debug)] in #![feature(rustc_attrs)] sounds interesting. :) maybe, this is what rust-analyzer is using internally.

What rust-analyzer functionality are you referring to?

Copy link
Contributor

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ilya-bobyr ilya-bobyr merged commit d077b13 into solana-labs:master Sep 6, 2023
@ilya-bobyr ilya-bobyr deleted the pr/accounts-db-hash_account-no-ub-sequential branch September 6, 2023 03:30
@ryoqun
Copy link
Member

ryoqun commented Sep 7, 2023

What rust-analyzer functionality are you referring to?

this one: https://rust-analyzer.github.io/thisweek/2023/07/10/changelog-189.html#new-features

and lightly touched in our learning-corner as well: https://discord.com/channels/428295358100013066/977244255212937306/1146863043578445967

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants