Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TieredStorage] Write owners block for HotAccountStorage #34927

Merged
merged 3 commits into from
Jan 26, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
41 changes: 30 additions & 11 deletions accounts-db/src/tiered_storage/hot.rs
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ use {
index::{AccountIndexWriterEntry, AccountOffset, IndexBlockFormat, IndexOffset},
meta::{AccountMetaFlags, AccountMetaOptionalFields, TieredAccountMeta},
mmap_utils::{get_pod, get_slice},
owners::{OwnerOffset, OwnersBlockFormat},
owners::{OwnerOffset, OwnersBlockFormat, OwnersTable},
readable::TieredReadableAccount,
StorableAccounts, StorableAccountsWithHashesAndWriteVersions, TieredStorageError,
TieredStorageFormat, TieredStorageResult,
Expand Down Expand Up @@ -495,6 +495,7 @@ impl HotStorageWriter {
fn write_account(
&self,
lamports: u64,
owner_offset: OwnerOffset,
account_data: &[u8],
executable: bool,
rent_epoch: Option<Epoch>,
Expand All @@ -511,6 +512,7 @@ impl HotStorageWriter {
let padding_len = padding_bytes(account_data.len());
let meta = HotAccountMeta::new()
.with_lamports(lamports)
.with_owner_offset(owner_offset)
.with_account_data_size(account_data.len() as u64)
.with_account_data_padding(padding_len)
.with_flags(&flags);
Expand All @@ -527,8 +529,8 @@ impl HotStorageWriter {
Ok(stored_size)
}

/// A work-in-progress function that will eventually implements
/// AccountsFile::appends_account()
/// Persists the specified accounts from index [`skip`] into the underlying
/// hot accounts file associated with the HotStorageWriter.
brooksprumo marked this conversation as resolved.
Show resolved Hide resolved
pub fn write_accounts<
'a,
'b,
Expand All @@ -540,8 +542,11 @@ impl HotStorageWriter {
accounts: &StorableAccountsWithHashesAndWriteVersions<'a, 'b, T, U, V>,
skip: usize,
) -> TieredStorageResult<()> {
let zero_lamport_account_owner: Pubkey = Pubkey::default();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ha! I was wondering about this the other day and how we wanted to handle it. This basically guarantees every hot storage file will have one additional owner that represents "no owner". We could also optimize this away by storing some flag, or known owner's offset to indicate this account has no owner, or always using the first owner offset, since the value itself does not matter.

For this PR, I think this impl is fine.

Please make this a constant though. (And maybe renamed? Up to you.)

        const DEFAULT_PUBKEY: Pubkey = Pubkey::default();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually tried this first, but Rust complains that I cannot do this:

   Compiling solana-accounts-db v1.18.0 (/Users/keelar/solana/yhchiang-sol/accounts-db)
error[E0015]: cannot call non-const fn `<solana_sdk::pubkey::Pubkey as std::default::Default>::default` in constants
   --> accounts-db/src/tiered_storage/hot.rs:545:40
    |
545 |         const DEFAULT_PUBKEY: Pubkey = Pubkey::default();
    |                                        ^^^^^^^^^^^^^^^^^
    |
    = note: calls in constants are limited to constant functions, tuple structs and tuple variants

Let me try the lazy_init again (I think I might have tried that before as well, forgot what was complained).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This basically guarantees every hot storage file will have one additional owner that represents "no owner". We could also optimize this away by storing some flag

I think one way to achieve this is to have one special value of OwnerOffset indicating no owner (probably OwnerOffset(u32::MAX) I think). This is probably a better way to handle it. Let me try to handle it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Damn. I need to pick some value other than u32::MAX because hot-storage stores owners offset using only 29 bits.

 struct HotMetaPackedFields {
     /// A hot account entry consists of the following elements:
     ///
     /// * HotAccountMeta
     /// * [u8] account data
     /// * 0-7 bytes padding
     /// * optional fields
     ///
     /// The following field records the number of padding bytes used
     /// in its hot account entry.
     padding: B3,
     /// The index to the owner of a hot account inside an AccountsFile.
     owner_offset: B29,
 }

In addition, a MAX value like this also becomes a special case in the boundary check. Will probably do it in a separate PR.

Copy link
Contributor Author

@yhchiang-sol yhchiang-sol Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Use const OWNER_NO_OWNER defined in owners.rs while keeping the OwnerOffset logic the same. We can continue the discussion and I will create a separate PR for how to avoid saving extra owner per file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually tried this first, but Rust complains that I cannot do this

Ah right, duh, my bad.

Damn. I need to pick some value other than u32::MAX because hot-storage stores owners offset using only 29 bits.

We can steal another bit from this struct. The owner offset doesn't need 29 bits, because that would exceed the file size limits. Is that the best use of this bit? I dunno! But it's an option, and we could change later if needed.

e.g.
16 GiB file size limit
if an account is only pubkey + owner + lamports, that's 32 + 32 + 8 = 72 bytes
16 GiB / 72 bytes per account = 238,609,294 accounts
and 2^28 = 268,435,456

So 28 bits can hold all the owners in the worst possible case (which practically cannot happen)

Copy link
Contributor Author

@yhchiang-sol yhchiang-sol Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought on this is that: I am not sure whether we might change the bits later when we already decided a const limit (for instance, 28 should be more than enough, but I am not sure whether we will squeeze it further for other purpose and change the limit to 27 bits).

If we want a const to represent the owner offset for no owner that we will never change, then 0 could be a good one? But as a downside, we also need to properly do -1 when reading any entry.

Copy link
Contributor

@brooksprumo brooksprumo Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current impl with the default Pubkey representing 'no owner' is fine. Note that the system program's address is the default pubkey, so we'll likely never be storing an unnecessary owner in the owners block.


let mut footer = new_hot_footer();
let mut index = vec![];
let mut owners_table = OwnersTable::default();
let mut cursor = 0;

// writing accounts blocks
Expand All @@ -555,20 +560,29 @@ impl HotStorageWriter {

// Obtain necessary fields from the account, or default fields
// for a zero-lamport account in the None case.
let (lamports, data, executable, rent_epoch, account_hash) = account
let (lamports, owner, data, executable, rent_epoch, account_hash) = account
.map(|acc| {
(
acc.lamports(),
acc.owner(),
acc.data(),
acc.executable(),
// only persist rent_epoch for those non-rent-exempt accounts
(acc.rent_epoch() != Epoch::MAX).then_some(acc.rent_epoch()),
Some(*account_hash),
)
})
.unwrap_or((0, &[], false, None, None));

cursor += self.write_account(lamports, data, executable, rent_epoch, account_hash)?;
.unwrap_or((0, &zero_lamport_account_owner, &[], false, None, None));

let owner_offset = owners_table.insert(owner);
cursor += self.write_account(
lamports,
owner_offset,
data,
executable,
rent_epoch,
account_hash,
)?;
index.push(index_entry);
}
footer.account_entry_count = (len - skip) as u32;
Expand All @@ -588,11 +602,13 @@ impl HotStorageWriter {
cursor += self.storage.write_pod(&0u32)?;
}

// TODO: owner block will be implemented in the follow-up PRs
// expect the offset of each block aligned.
// writing owners block
assert!(cursor % HOT_BLOCK_ALIGNMENT == 0);
footer.owners_block_offset = cursor as u64;
footer.owner_count = 0;
footer.owner_count = owners_table.len() as u32;
footer
.owners_block_format
.write_owners_block(&self.storage, &owners_table)?;

footer.write_footer_block(&self.storage)?;

Expand Down Expand Up @@ -1240,10 +1256,12 @@ pub mod tests {
/// and write_version.
fn create_test_account(seed: u64) -> (StoredMeta, AccountSharedData) {
let data_byte = seed as u8;
let owner_byte = u8::MAX - data_byte;
let account = Account {
lamports: seed + 1,
data: std::iter::repeat(data_byte).take(seed as usize).collect(),
owner: Pubkey::new_unique(),
// this will allow some test account sharing the same owner.
owner: [owner_byte; 32].into(),
executable: seed % 2 > 0,
rent_epoch: if seed % 3 > 0 {
seed
Expand Down Expand Up @@ -1319,6 +1337,7 @@ pub mod tests {
assert_eq!(stored_meta.data().len(), account.data().len());
assert_eq!(stored_meta.data(), account.data());
assert_eq!(stored_meta.executable(), account.executable());
assert_eq!(stored_meta.owner(), account.owner());
assert_eq!(stored_meta.pubkey(), address);
assert_eq!(stored_meta.hash(), hash);

Expand Down
10 changes: 10 additions & 0 deletions accounts-db/src/tiered_storage/owners.rs
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,16 @@ impl<'a> OwnersTable<'a> {

OwnerOffset(offset as u32)
}

/// Returns the number of unique owner addresses in the table.
pub fn len(&self) -> usize {
self.owners_set.len()
}

/// Returns true if the OwnersTable is empty
pub fn is_empty(&self) -> bool {
self.len() == 0
}
Comment on lines +110 to +113
Copy link
Contributor Author

@yhchiang-sol yhchiang-sol Jan 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: This function isn't used, but is required by the CI check as we have a len() function:


error: struct `OwnersTable` has a public `len` method, but no `is_empty` method
--
  | --> accounts-db/src/tiered_storage/owners.rs:102:5
  | \|
  | 102 \|     pub fn len(&self) -> usize {
  | \|     ^^^^^^^^^^^^^^^^^^^^^^^^^^
  | \|
  | = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#len_without_is_empty
  | = note: `-D clippy::len-without-is-empty` implied by `-D warnings`
  | = help: to override `-D warnings` add `#[allow(clippy::len_without_is_empty)]`

}

#[cfg(test)]
Expand Down
Loading