accounts-db: test_hash_stored_account: Avoid UB. #33082

ilya-bobyr · 2023-08-31T04:37:59Z

`unsafe { transmute }` in the test, as written, is undefined behavior. I also think we are actually seeing the compiler generate different code for it, since the generated hashes change when we upgrade the compiler. It happened during the upgrade to 1.67.1:

#29947

And the hashes changed again in 1.71.

Here is a version where numbers in the test are sequential, rather than completely random: #33083
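To illustrate the concern, here is a minimal sketch. It assumes the problem is a `bool` field being filled with an arbitrary byte; `Meta`, `make_meta` and `bool_from_byte` are hypothetical names for illustration and not the real accounts-db types. Producing a `bool` whose byte is anything other than 0 or 1 is undefined behavior in Rust, while plain field assignment cannot produce one.

```rust
// Hypothetical stand-in for a metadata struct with a `bool` field;
// this is not the real accounts-db `AccountMeta`.
#[derive(Debug, PartialEq)]
struct Meta {
    lamports: u64,
    rent_epoch: u64,
    executable: bool,
}

/// Build the test fixture with plain field assignment.
/// Every field holds a valid value for its type, so no `unsafe` is needed.
fn make_meta() -> Meta {
    Meta {
        lamports: 0x0102_0304_0506_0708,
        rent_epoch: 0x1112_1314_1516_1718,
        executable: true,
    }
}

/// Safely decode a byte into a `bool`, rejecting anything other than 0 or 1.
/// Transmuting a byte such as 0x2A into `bool` would instead be undefined behavior.
fn bool_from_byte(b: u8) -> Option<bool> {
    match b {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

fn main() {
    let meta = make_meta();
    println!("{meta:?}");
    println!("{:?}", bool_from_byte(0x2A));
}
```

The design choice is simply to let the type system reject invalid values instead of relying on how a particular compiler version happens to treat them.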


ryoqun · 2023-08-31T13:39:17Z

oh, i just noticed this pr (and #33083) after writing mine #33085...

your approach is quite good. i was just lazy to spell out all fields.

one minor concern is that this approach can't detect little-endian <=> big-endian differences. these types are mmaped with the assumption of little endian (i.e. amd64). so, if we ever wanted to support big endian (hence minor concern), this test should fail because of abi/endian differences, which it can do only if we use transmute() (the mmap equivalent), prompting the future developer to act on those platform differences. put differently, pure safe rust and proper hashing code are agnostic of these differences, which is almost always desirable, but not in this particular case.

i know this is very much hypothetical.
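A small sketch of the endianness point, using only standard integer conversions. The helper names `read_native` and `read_little_endian` are hypothetical; the first stands in for what a transmute or mmap-style read would observe, the second for layout-independent safe code.

```rust
/// Interpret 8 raw bytes the way an mmap'ed or transmuted read would:
/// using the byte order of the machine running the code.
fn read_native(bytes: [u8; 8]) -> u64 {
    u64::from_ne_bytes(bytes)
}

/// Interpret the same bytes with an explicit little-endian layout,
/// which gives the same answer on every platform.
fn read_little_endian(bytes: [u8; 8]) -> u64 {
    u64::from_le_bytes(bytes)
}

fn main() {
    let bytes = [0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08];
    let native = read_native(bytes);
    let le = read_little_endian(bytes);
    println!("native = {native:#018x}, little-endian = {le:#018x}");
    // The two reads agree only on little-endian targets.
    assert_eq!(native == le, cfg!(target_endian = "little"));
}
```

A test built on the native read would change its result on a big-endian target, which is the kind of signal described above; a test built on the explicit little-endian read would not.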

for one thing, my #33085 seems immune to the past 1.66 => 1.67 change. anyway, i don't feel strongly about keeping #33085's code forever. ;) what do you think? do you still hate UB-guaranteed unsafe? ;)

apfitzge

AppendVecStoredAccountMeta is supposed to be read from a memmapped file, current append-vec format uses a contiguous chunk of mem. This test appears to be replicating that intended use. Not married to that, since there's no requirement on the AppendVecStoredAccountMeta other than having references to the fields somewhere in memory, and the proposed tiered storage won't have contiguous chunks iirc.

That said, this change feels more complicated than necessary to me. If we're just going to assign our variables on the stack directly and reference them, why are we specifying each byte? Why not just give them a value i.e.:

let slot: Slot = 10;

t-nelson · 2023-08-31T16:50:46Z

afaik the only "improvement" we have here is valid boolean values, which the test doesn't actually care about afaik.

agreed that this is a bit less scrutable. i was able to deduce the intent and intuit a possible cause fairly quickly, whereas if we hit the same issue with this code, it'd probably take me quite a bit longer to figure out why we have all of these explicitly declared "random" inputs

ilya-bobyr · 2023-09-01T01:15:56Z

> AppendVecStoredAccountMeta is supposed to be read from a memmapped file, current append-vec format uses a contiguous chunk of mem. This test appears to be replicating that intended use. Not married to that, since there's no requirement on the AppendVecStoredAccountMeta other than having references to the fields somewhere in memory, and the proposed tiered storage won't have contiguous chunks iirc.

My assumption is that it is better not to use unsafe when there is a way to write the same code without it.
I still think that, in this particular case, it does not matter how AppendVecStoredAccountMeta is constructed in the rest of the code.
AppendVecStoredAccountMeta contains references.
It is not a block of memory that is then hashed.

In the test, transmute is used only as a way of populating the initial state that is then referenced from AppendVecStoredAccountMeta.
It does not matter whether this initial state was written via a transmute or assigned directly.
The hashing logic in AccountsDb::hash_account() uses accessors that return the corresponding meta, account_meta, and other objects pointed to by AppendVecStoredAccountMeta.
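As a rough sketch of this argument: the types below (`Meta`, `AccountMeta`, `StoredView`) and the function `hash_view` are simplified, hypothetical stand-ins, and `DefaultHasher` is used only to keep the example std-only; it is not the hasher that `hash_account()` uses. The point is that a hash computed through accessors depends on the referenced values, not on where or how they were placed in memory.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Hypothetical, simplified stand-ins for the referenced metadata;
// these are not the real accounts-db types.
struct Meta {
    pubkey: [u8; 32],
}

struct AccountMeta {
    lamports: u64,
    executable: bool,
}

/// A view that only holds references, like the struct under discussion.
struct StoredView<'a> {
    meta: &'a Meta,
    account_meta: &'a AccountMeta,
    data: &'a [u8],
}

impl<'a> StoredView<'a> {
    fn pubkey(&self) -> &'a [u8; 32] {
        &self.meta.pubkey
    }
    fn lamports(&self) -> u64 {
        self.account_meta.lamports
    }
    fn executable(&self) -> bool {
        self.account_meta.executable
    }
    fn data(&self) -> &'a [u8] {
        self.data
    }
}

/// Hash through the accessors only. `DefaultHasher` is a stand-in chosen
/// to keep the sketch std-only.
fn hash_view(view: &StoredView<'_>) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(&view.lamports().to_le_bytes());
    hasher.write(view.data());
    hasher.write(&[view.executable() as u8]);
    hasher.write(view.pubkey());
    hasher.finish()
}

/// Hash the same field values twice: once referenced from separate stack
/// locals, once from separate heap allocations.
fn demo_hashes() -> (u64, u64) {
    let meta = Meta { pubkey: [7; 32] };
    let account_meta = AccountMeta { lamports: 42, executable: true };
    let data = [1u8, 2, 3];
    let on_stack = StoredView { meta: &meta, account_meta: &account_meta, data: &data };

    let heap_meta = Box::new(Meta { pubkey: [7; 32] });
    let heap_account_meta = Box::new(AccountMeta { lamports: 42, executable: true });
    let heap_data = vec![1u8, 2, 3];
    let on_heap = StoredView {
        meta: &heap_meta,
        account_meta: &heap_account_meta,
        data: &heap_data,
    };

    (hash_view(&on_stack), hash_view(&on_heap))
}

fn main() {
    let (stack_hash, heap_hash) = demo_hashes();
    println!("stack: {stack_hash:#x}, heap: {heap_hash:#x}");
}
```

Both views hash to the same value because the accessors return equal field values in both cases.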

> That said, this change feels more complicated than necessary to me. If we're just going to assign our variables on the stack directly and reference them, why are we specifying each byte? Why not just give them a value i.e.:
>
> let slot: Slot = 10;

I thought there was a reason we want all the bytes to be non-zero.
I am not sure if it is really necessary.
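For comparison, a tiny sketch of the difference between the two styles of test input. The helper `all_bytes_nonzero` is hypothetical, and the motivation in its comment (that a bug ignoring some bytes could go unnoticed with small values) is my own assumption about why non-zero bytes might be wanted, not something established in this thread.

```rust
/// Returns true when every byte of the value is non-zero.
/// Assumed motivation: with such inputs, hashing code that accidentally
/// skipped some bytes of a field would be more likely to change the result.
fn all_bytes_nonzero(value: u64) -> bool {
    value.to_le_bytes().iter().all(|&b| b != 0)
}

fn main() {
    // A small value such as 10 has seven zero bytes.
    println!("{}", all_bytes_nonzero(10));
    // A value spelled out byte by byte can avoid zero bytes entirely.
    println!("{}", all_bytes_nonzero(0x0102_0304_0506_0708));
}
```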

> afaik the only "improvement" we have here is valid boolean values, which the test doesn't actually care about afaik.
>
> agreed that this is a bit less scrutable. i was able to deduce the intent and intuit a possible cause fairly quickly, whereas if we hit the same issue with this code, it'd probably take me quite a bit longer to figure out why we have all of these explicitly declared "random" inputs

I think that any code with unsafe is worse than equivalent code without it.
In particular, I would imagine that, on average, people have more experience writing code without unsafe.
It seems that the pattern where a test assigns random values that are then processed by some algorithm is very common.

If both of you do not like this change, I can close the PR.

ilya-bobyr · 2023-09-01T22:26:46Z

Closing in favor of #33083.

ilya-bobyr requested review from t-nelson and apfitzge August 31, 2023 04:37

ilya-bobyr mentioned this pull request Aug 31, 2023

accounts-db: test_hash_stored_account: Avoid UB. #33083

Merged

ryoqun mentioned this pull request Aug 31, 2023

Add #[repr(C)] for more future-proof byte mangling #33085

Merged

apfitzge reviewed Aug 31, 2023

ilya-bobyr closed this Sep 1, 2023

ilya-bobyr deleted the pr/accounts-db-hash_account-no-ub branch September 1, 2023 22:27
