Remove key_size() method from Column trait #34021

steviez · 2023-11-10T20:56:29Z

Problem

This helper simply called std::mem::size_of::<Self::Index>(). However, all of the underlying functions that create keys manually copy fields into a byte array. The fields are copied in end-to-end, whereas size_of() might include alignment bytes.

That is, a (u64, u32) only has 12 bytes of "data", but it would have size 16 due to the 4 alignment padding bytes that would be added to get the u32 (size 4) aligned with the u64 (size 8).
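A minimal sketch of the difference, using only the standard library. The function names are hypothetical and chosen for illustration. Rust does not guarantee a particular tuple layout, but on common 64-bit targets the tuple below occupies 16 bytes while its fields add up to 12.

```rust
use std::mem::size_of;

/// Bytes of actual data in a (u64, u32) key when the fields are copied
/// end-to-end into a byte array: 8 + 4.
pub const fn packed_len() -> usize {
    size_of::<u64>() + size_of::<u32>()
}

/// In-memory size of the tuple, which may include padding so that the
/// total size is a multiple of the tuple's alignment.
pub const fn in_memory_len() -> usize {
    size_of::<(u64, u32)>()
}

/// Number of padding bytes the compiler added to the tuple.
/// The tuple must hold both fields, so this never underflows.
pub const fn padding_len() -> usize {
    in_memory_len() - packed_len()
}
```

Comparing packed_len() with in_memory_len() on a given target shows how far key_size() could drift from the real key length.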

Summary of Changes

The helper could be useful, but in its current state it is incorrect and dangerous to leave around, since someone might make the incorrect assumption above regarding alignment bytes.

Also, here is a Rust playground link to demonstrate that std::mem::size_of::<(u64, u32)>() == 16:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=b1848a2e974119930cc7e59c0b662274

We iterate through key-value pairs anyways, so just get the key size from there.

codecov · 2023-11-17T07:20:44Z

Codecov Report

Merging #34021 (4dc7f09) into master (eb35a5a) will decrease coverage by 0.1%.
Report is 3 commits behind head on master.
The diff coverage is n/a.

Additional details and impacted files

@@            Coverage Diff            @@
##           master   #34021     +/-   ##
=========================================
- Coverage    81.9%    81.9%   -0.1%     
=========================================
  Files         818      818             
  Lines      219939   219936      -3     
=========================================
- Hits       180219   180163     -56     
- Misses      39720    39773     +53

CriesofCarrots · 2023-11-17T17:46:56Z

The helper could be useful

Would it actually be useful to have? If so, we could probably rework the default implementation to be correct by constructing an actual key and getting the len.

yhchiang-sol · 2023-11-17T18:06:59Z

Can we instead keep the key_size() API but just remove the default implementation?

And for each column-family, we implement its key_size() manually? (like std::mem::size_of::<u64>() + std::mem::size_of::<u32>() for slot-based column-families?)

steviez · 2023-11-17T18:07:18Z

Would it actually be useful to have? If so, we could probably rework the implementation to be correct by constructing an actual key and getting the len.

One spot for sure would be the hard-coded key length here:

solana/ledger/src/blockstore_db.rs

Line 1121 in 45290c4

let mut key = vec![0; 16];

We had a near miss in another PR where the index was updated for a new column, but the size of the key array was not, and the key vector had extra bytes before getting fixed. It was here if you're interested: #33979 (comment)

As we were talking it through, we agreed that it'd be nice to have a way to avoid that. If the key_size() method were accurate, then the vector could be initialized like:

let mut key = vec![0; Self::key_size()];

That's a good point that doing something like this would allow us to compute the value:

let key_len = Self::key(Self::as_index(0)).len();

However, I don't think we can compute this at compile time, and I think this would add overhead to what is a pretty fundamental function in creating a key. And, creating a key and computing the length would not have prevented the bug described above, where the key vector was created with extra bytes.
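A rough sketch of that runtime approach, assuming a simplified stand-in for the Column trait. ExampleColumn and the exact trait shape here are hypothetical, not the real blockstore definitions.

```rust
/// Simplified stand-in for the Column trait (hypothetical shape).
pub trait Column {
    type Index;

    fn key(index: Self::Index) -> Vec<u8>;
    fn as_index(slot: u64) -> Self::Index;

    /// Derive the key length by building a throwaway key.
    /// This allocates a Vec on every call and cannot be evaluated at
    /// compile time. It also only reports whatever length key() produces,
    /// so it cannot tell whether key() itself is oversized.
    fn key_size() -> usize {
        Self::key(Self::as_index(0)).len()
    }
}

/// Hypothetical column keyed by (slot, u32).
pub struct ExampleColumn;

impl Column for ExampleColumn {
    type Index = (u64, u32);

    fn key((slot, index): Self::Index) -> Vec<u8> {
        // Fields are appended end-to-end in big-endian order: 8 + 4 bytes.
        let mut key = Vec::with_capacity(12);
        key.extend_from_slice(&slot.to_be_bytes());
        key.extend_from_slice(&index.to_be_bytes());
        key
    }

    fn as_index(slot: u64) -> Self::Index {
        (slot, 0)
    }
}
```

Because key_size() is derived from key(), using it to size the buffer inside key() would be circular, which is part of why this does not address the near miss.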

Solutions that came to mind were something like:

- A std::mem::size_of() function that excludes alignment bytes (does not exist as far as I can tell)
- Manually create a constant that sums std::mem::size_of() calls (ie let SIZE = std::mem::size_of::<Pubkey>() + std::mem::size_of::<Slot>() + ...)
  - I think something like this came up when you re-keyed the transaction metadata columns, but we decided not to do anything as this could also be error-prone
- A macro that runs through the fields of a struct / tuple, and sums up the std::mem::size_of() of each field
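For illustration, the macro idea could look roughly like the sketch below. The macro name packed_size!, the constants, and the Slot and Pubkey aliases are assumptions made for this example, not existing code.

```rust
/// Sum `size_of` over a list of field types, ignoring any padding the
/// compiler would add if the types were combined into a tuple or struct.
macro_rules! packed_size {
    ($($field:ty),+ $(,)?) => {
        0usize $(+ ::std::mem::size_of::<$field>())+
    };
}

// Stand-ins for the real types (assumptions for this sketch).
pub type Slot = u64;
pub type Pubkey = [u8; 32];

// size_of is a const fn, so the sums are computed at compile time.
pub const SLOT_U32_KEY_LEN: usize = packed_size!(Slot, u32);
pub const PUBKEY_SLOT_KEY_LEN: usize = packed_size!(Pubkey, Slot);
```

Note that the type list passed to the macro is still written by hand, separately from the Index type, so it can go stale in the same way as a hard-coded number.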

So, in the absence of a solution that could calculate key_size() in a way that was not prone to programmer error, I decided to rip out the faulty key_size() method. Ie, if we're going to manually maintain constants, there's no reason to maintain two copies.

yhchiang-sol · 2023-11-17T18:07:54Z

ledger/src/blockstore_db.rs

@@ -719,10 +719,6 @@ impl Rocks {
 pub trait Column {
    type Index;

-    fn key_size() -> usize {
-        std::mem::size_of::<Self::Index>()


I think we should just simply remove this problematic default implementation.

CriesofCarrots

we decided not to do anything as this could also be error-prone

Yeah, all of the alternatives do have gaps, and none of them would really help the case you linked, where a key decreases in size.

As such, I'm fine with this.

steviez · 2023-11-17T18:13:54Z

Can we instead keep the key_size() API but just remove the default implementation?

And for each column-family, we implement its key_size() manually? (like std::mem::size_of::<u64>() + std::mem::size_of::<u32>() for slot-based column-families?)

Ha, you beat me to posting by a couple seconds. I thought of doing this in the past; however, the argument against it is that I could still see it being prone to the same bug from Ashwin's PR where:

1. Column originally created with type Index = (Slot, u64) (16 bytes)
2. KEY_LEN constant defined as size_of::<Slot>() + size_of::<u64>()
3. Index updated to type Index = (Slot, u32) (12 bytes)
4. Forget to update KEY_LEN constant after updating the index

Any unit test that checked the length of KEY_LEN would seemingly also have a 16 hard-coded in the test, so I go back to this not really getting us much. Or, if the test called Column::key() and compared against Column::KEY_LEN, they'd both be incorrect (16 bytes instead of 12 since one is built on the other) so the unit test still wouldn't fail
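To make that concrete, here is a small sketch of the stale-constant scenario. ExampleColumn, KEY_LEN, and key_len_matches_constant() are hypothetical names used only for this illustration.

```rust
use std::mem::size_of;

pub type Slot = u64;

/// Hypothetical column whose index was changed from (Slot, u64) to (Slot, u32).
pub struct ExampleColumn;

impl ExampleColumn {
    /// Stale: still describes the old (Slot, u64) index, so this is 16, not 12.
    pub const KEY_LEN: usize = size_of::<Slot>() + size_of::<u64>();

    /// Builds the key for the new (Slot, u32) index, but sizes the buffer
    /// from the stale constant.
    pub fn key((slot, index): (Slot, u32)) -> Vec<u8> {
        let mut key = vec![0u8; Self::KEY_LEN];
        key[..8].copy_from_slice(&slot.to_be_bytes());
        key[8..12].copy_from_slice(&index.to_be_bytes());
        key
    }
}

/// The kind of check a unit test might make. Both sides derive from the
/// same stale constant, so the check passes even though the key is 4 bytes
/// too long.
pub fn key_len_matches_constant() -> bool {
    ExampleColumn::key((1, 2)).len() == ExampleColumn::KEY_LEN
}
```

The check returns true, which is the problem: nothing in it refers to the actual width of the index fields.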

yhchiang-sol · 2023-11-17T18:44:59Z

Any unit test that checked the length of KEY_LEN would seemingly also have a 16 hard-coded in the test

I think I must be missing something.

So if we write 8+4 bytes but read 8+8 bytes, and we compare what we wrote with what we read, will there be a mismatch, or will there not be one because it's based on Index?

steviez · 2023-11-17T19:20:27Z

Any unit test that checked the length of KEY_LEN would seemingly also have a 16 hard-coded in the test

I think I must be missing something.

The code was originally something like this:

impl Column for columns::MerkleRootMeta {
    type Index = (Slot, /*fec_set_index:*/ u64);

    fn index(key: &[u8]) -> Self::Index {
        let slot = BigEndian::read_u64(&key[..8]);
        let fec_set_index = BigEndian::read_u64(&key[8..]);

        (slot, fec_set_index)
    }

    fn key((slot, fec_set_index): Self::Index) -> Vec<u8> {
        let mut key = vec![0; 16];
        BigEndian::write_u64(&mut key[..8], slot);
        BigEndian::write_u64(&mut key[8..], fec_set_index);
        key
    }
}

Note that the second element of the Index is a u64; the size of the Index is 8+8 = 16 bytes. Then, it was updated to a u32.

impl Column for columns::MerkleRootMeta {
    type Index = (Slot, /*fec_set_index:*/ u32);

    fn index(key: &[u8]) -> Self::Index {
        let slot = BigEndian::read_u64(&key[..8]);
        let fec_set_index = BigEndian::read_u32(&key[8..]);

        (slot, fec_set_index)
    }

    fn key((slot, fec_set_index): Self::Index) -> Vec<u8> {
        let mut key = vec![0; 16];
        BigEndian::write_u64(&mut key[..8], slot);
        BigEndian::write_u32(&mut key[8..], fec_set_index);
        key
    }
}

The bug was that the following line was not updated:

        // Buggy
        let mut key = vec![0; 16];
        // Proper
        let mut key = vec![0; 12];

So if we write 8+4 bytes but read 8+8 bytes, and we compare what we wrote with what we read, will there be a mismatch, or will there not be one because it's based on Index?

Within key(), 12 bytes would have been written, but it was still a 16-byte buffer that was 0-initialized. When we read back in index(), we'll read out the first 12 bytes. These 12 bytes will match the Index that we wrote, but the fact is there are still an extra 4 bytes in the buffer.
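A standalone sketch of that round trip, using the standard library's to_be_bytes / from_be_bytes in place of the byteorder crate. The free functions buggy_key() and index() are simplified stand-ins for the trait methods above.

```rust
use std::convert::TryInto;

/// Build a key the way the buggy version did: a 16-byte zeroed buffer
/// with only the first 12 bytes written.
pub fn buggy_key(slot: u64, fec_set_index: u32) -> Vec<u8> {
    let mut key = vec![0u8; 16];
    key[..8].copy_from_slice(&slot.to_be_bytes());
    key[8..12].copy_from_slice(&fec_set_index.to_be_bytes());
    key
}

/// Read the index back, looking only at the first 12 bytes of the key.
pub fn index(key: &[u8]) -> (u64, u32) {
    let slot = u64::from_be_bytes(key[..8].try_into().unwrap());
    let fec_set_index = u32::from_be_bytes(key[8..12].try_into().unwrap());
    (slot, fec_set_index)
}
```

A round-trip check such as index(&buggy_key(7, 3)) == (7, 3) succeeds, so only an explicit check on the key length would notice the 4 trailing zero bytes.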

yhchiang-sol · 2023-11-17T19:34:35Z

Within key(), 12 bytes would have been written, but it was still a 16-byte buffer that was 0-initialized. When we read back in index(), we'll read out the first 12 bytes. These 12 bytes will match the Index that we wrote, but the fact is there are still an extra 4 bytes in the buffer.

Let me see if I understand it correctly. So it's the mismatch between the length of the returned key() and Index, but we don't have a smart way to correctly compute the length of Index. Using a manually implemented key_size() should work in this particular case, but if Index is updated but both the array and key_size() are not updated, then we won't be able to detect it again.

Is my understanding correct?

steviez · 2023-11-17T22:24:45Z

Let me see if I understand it correctly. So it's the mismatch between the length of the returned key() and Index, but we don't have a smart way to correctly compute the length of Index. Using a manually implemented key_size() should work in this particular case, but if Index is updated but both the array and key_size() are not updated, then we won't be able to detect it again.

Is my understanding correct?

Yep, you got it. We could implement key_size() manually, but there is no way to enforce it is accurate (ie unit-test or compile time assert) without putting another hard-coded value in.

In this case, our unit test would contain the same value as the actual constant in source code. The unit test will only fail if you've updated the source code constant. But if you already updated the source code constant, then you remembered to do the right thing, and the unit test didn't help you remember to update the source code value.

yhchiang-sol

Let's remove it, given that this function doesn't provide much value and isn't always consistent with the actual code.

steviez force-pushed the bstore_rm_footgun branch 2 times, most recently from f371161 to 50acbbc Compare November 17, 2023 05:58

Remove key_size() method from ledger-tool analyze_storage()

4dc7f09

We iterate through key-value pairs anyways, so just get the key size from there.
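The thread does not show the analyze_storage() change itself, so the following is only a sketch of the idea under assumed types. StorageTotals and analyze() are hypothetical names, and the iterator item type is an assumption.

```rust
/// Running totals for one column (hypothetical shape, for illustration).
#[derive(Debug, Default, PartialEq)]
pub struct StorageTotals {
    pub entries: usize,
    pub key_bytes: usize,
    pub value_bytes: usize,
}

/// Walk the key-value pairs once and take each key's length from the key
/// itself, rather than from a per-column key_size() helper.
pub fn analyze<I>(pairs: I) -> StorageTotals
where
    I: IntoIterator<Item = (Box<[u8]>, Box<[u8]>)>,
{
    let mut totals = StorageTotals::default();
    for (key, value) in pairs {
        totals.entries += 1;
        totals.key_bytes += key.len();
        totals.value_bytes += value.len();
    }
    totals
}
```

Measuring the serialized key directly means the totals reflect what is actually stored, regardless of how the Index type is laid out in memory.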

steviez force-pushed the bstore_rm_footgun branch from 50acbbc to 4dc7f09 Compare November 17, 2023 06:22

steviez marked this pull request as ready for review November 17, 2023 15:49

steviez requested review from yhchiang-sol and CriesofCarrots November 17, 2023 15:49

yhchiang-sol reviewed Nov 17, 2023

CriesofCarrots approved these changes Nov 17, 2023

yhchiang-sol approved these changes Nov 18, 2023

steviez merged commit 9a7b681 into solana-labs:master Nov 20, 2023
32 checks passed

steviez deleted the bstore_rm_footgun branch November 20, 2023 05:05

willhickey mentioned this pull request Mar 28, 2024

v1.18 commits - please ignore anza-xyz/agave#475
