Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

accounts-db: test_hash_stored_account: Avoid UB. #33082

Closed
wants to merge 1 commit into from
Closed

accounts-db: test_hash_stored_account: Avoid UB. #33082

wants to merge 1 commit into from

Conversation

ilya-bobyr
Copy link
Contributor

@ilya-bobyr ilya-bobyr commented Aug 31, 2023

unsafe { transmute } in the test, as written, is undefined behavior. And, I think, we actually see the compiler generating different code for it as we are changing the generated hashes when we upgrade the compiler. It happened during upgrade to 1.67.1:

#29947

And the hashes changed again in 1.71.


Here is a version where numbers in the test are sequential, rather than completely random: #33083

`unsafe { transmute }` in the test, as written, is undefined behavior.
And, I think, we actually see the compiler generating different code for
it as we are changing the generated hashes when we upgrade the compiler.
It happened during upgrade to 1.67.1:

  #29947

And the hashes changed again in 1.71.
@ryoqun
Copy link
Member

ryoqun commented Aug 31, 2023

oh, i just noticed this pr (and #33083) after writing mine #33085...

your approach is quite good. i was just lazy to spell out all fields.

one minor concern is that this approach can't detect little-endian <=> big-endian. these types are mmaped with the assumption of little endian (i.e. amd64). so, if we wanted to support big endian ever (hence minor concern), this test should fail because of abi/endian differences, which can be do so only if we do transmute() (mmap equivalent), prompting the future developer to act on that platform differences. put differently, pure safe rust and proper hashing code are agnostic of these differences, which is almost always desirable, but not in this particular case.

i know this is very much of hypothetical.

for one thing, my #33085 seems immune to the past 1.66 => 1.67 change. anyway, I'm not feeling strong for keeping #33085's code eternally. ;) what do you think? do you still hate UB guaranteed unsafe? ;)

Copy link
Contributor

@apfitzge apfitzge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AppendVecStoredAccountMeta is supposed to be read from a memmapped file, current append-vec format uses a contiguous chunk of mem. This test appears to be replicating that intended use. Not married to that, since there's no requirement on the AppendVecStoredAccountMeta other than having references to the fields somewhere in memory, and the proposed tiered storage won't have contiguous chunks iirc.

That said, this change feels more complicated than necessary to me. If we're just going to assign our variables on the stack directly and reference them, why are we specifying each byte? Why not just give them a value i.e.:

let slot: Slot = 10;

@t-nelson
Copy link
Contributor

afaik the only "improvement" we have here is valid boolean values, which the test doesn't actually care about afaik.

agreed that this is a bit less scrutable. i was able to deduce the intent and intuit a possible cause fairly quickly, whereas if we hit the same issue with this code, it'd probably take me quite a bit longer to figure out why we have all of these explicitly declared "random" inputs

@ilya-bobyr
Copy link
Contributor Author

ilya-bobyr commented Sep 1, 2023

AppendVecStoredAccountMeta is supposed to be read from a memmapped file, current append-vec format uses a contiguous chunk of mem. This test appears to be replicating that intended use. Not married to that, since there's no requirement on the AppendVecStoredAccountMeta other than having references to the fields somewhere in memory, and the proposed tiered storage won't have contiguous chunks iirc.

My assumption is that it is better not to use unsafe when there is a way to write the same code without it.
I still think that in this particular case, it does not matter now AppendVecStoredAccountMeta is constructed in the rest of the code.
AppendVecStoredAccountMeta contains references.
It does not a block of memory that is then hashed.

In the test, transmute is used only as a way of populating the initial state that is then referenced from AppendVecStoredAccountMeta.
It does not matter if this initial state was written via a transmute or assigned directly.
Hashing logic in AccountsDb::hash_account() is using accessors, that return corresponding meta, account_meta and other objects pointed to by AppendVecStoredAccountMeta.

That said, this change feels more complicated than necessary to me. If we're just going to assign our variables on the stack directly and reference them, why are we specifying each byte? Why not just give them a value i.e.:

let slot: Slot = 10;

I thought, there is a reason we want all the bytes to be non-zero.
Not sure if it is really necessary.


afaik the only "improvement" we have here is valid boolean values, which the test doesn't actually care about afaik.

agreed that this is a bit less scrutable. i was able to deduce the intent and intuit a possible cause fairly quickly, whereas if we hit the same issue with this code, it'd probably take me quite a bit longer to figure out why we have all of these explicitly declared "random" inputs

I think that any unsafe is worse compared to code that does not have it.
In particular, on average, I would imagine, people should have more experience writing code without unsafe.
It seems that a pattern when test assigns random values that are then processed via some algorithm is a very common.


If both of you do not like this change, I can close the PR.

@ilya-bobyr
Copy link
Contributor Author

Closing in favor of #33083.

@ilya-bobyr ilya-bobyr closed this Sep 1, 2023
@ilya-bobyr ilya-bobyr deleted the pr/accounts-db-hash_account-no-ub branch September 1, 2023 22:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants