Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sanitizes tiered storage footer after reading from disk #34200

Merged
merged 2 commits into from
Dec 1, 2023
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion accounts-db/src/tiered_storage/error.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
use {std::path::PathBuf, thiserror::Error};
use {super::footer::SanitizeFooterError, std::path::PathBuf, thiserror::Error};

#[derive(Error, Debug)]
pub enum TieredStorageError {
Expand All @@ -16,4 +16,10 @@ pub enum TieredStorageError {

#[error("Unsupported: the feature is not yet supported")]
Unsupported(),

#[error("invalid footer size: {0}, expected: {1}")]
InvalidFooterSize(u64, u64),

#[error("footer is unsanitary: {0}")]
SanitizeFooter(#[from] SanitizeFooterError),
}
184 changes: 173 additions & 11 deletions accounts-db/src/tiered_storage/footer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3,9 +3,12 @@ use {
error::TieredStorageError, file::TieredStorageFile, index::IndexBlockFormat,
mmap_utils::get_type, TieredStorageResult,
},
bytemuck::{Pod, Zeroable},
memmap2::Mmap,
num_enum::TryFromPrimitiveError,
solana_sdk::{hash::Hash, pubkey::Pubkey},
std::{mem, path::Path},
thiserror::Error,
};

pub const FOOTER_FORMAT_VERSION: u64 = 1;
Expand All @@ -22,7 +25,7 @@ pub const FOOTER_TAIL_SIZE: usize = 24;
/// The ending 8 bytes of a valid tiered account storage file.
pub const FOOTER_MAGIC_NUMBER: u64 = 0x502A2AB5; // SOLALABS -> SOLANA LABS

#[derive(Debug, PartialEq, Eq)]
#[derive(Debug, PartialEq, Eq, Clone, Copy, Pod, Zeroable)]
#[repr(C)]
pub struct TieredStorageMagicNumber(pub u64);

Expand Down Expand Up @@ -86,7 +89,7 @@ pub enum OwnersBlockFormat {
LocalIndex = 0,
}

#[derive(Debug, PartialEq, Eq, Clone)]
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
yhchiang-sol marked this conversation as resolved.
Show resolved Hide resolved
#[repr(C)]
pub struct TieredStorageFooter {
// formats
Expand Down Expand Up @@ -144,6 +147,30 @@ pub struct TieredStorageFooter {
// pub magic_number: u64,
}

// It is undefined behavior to read/write uninitialized bytes.
// The `Pod` marker trait indicates there are no uninitialized bytes.
// In order to safely guarantee a type is POD, it cannot have any padding.
const _: () = assert!(
std::mem::size_of::<TieredStorageFooter>()
== std::mem::size_of::<AccountMetaFormat>()
+ std::mem::size_of::<OwnersBlockFormat>()
+ std::mem::size_of::<IndexBlockFormat>()
+ std::mem::size_of::<AccountBlockFormat>()
+ std::mem::size_of::<u32>() // account_entry_count
+ std::mem::size_of::<u32>() // account_meta_entry_size
+ std::mem::size_of::<u64>() // account_block_size
+ std::mem::size_of::<u32>() // owner_count
+ std::mem::size_of::<u32>() // owner_entry_size
+ std::mem::size_of::<u64>() // index_block_offset
+ std::mem::size_of::<u64>() // owners_block_offset
+ std::mem::size_of::<Pubkey>() // min_account_address
+ std::mem::size_of::<Pubkey>() // max_account_address
+ std::mem::size_of::<Hash>() // hash
+ std::mem::size_of::<u64>() // footer_size
+ std::mem::size_of::<u64>(), // format_version
"TieredStorageFooter cannot have any padding"
);

impl Default for TieredStorageFooter {
fn default() -> Self {
Self {
Expand Down Expand Up @@ -182,14 +209,19 @@ impl TieredStorageFooter {

pub fn new_from_footer_block(file: &TieredStorageFile) -> TieredStorageResult<Self> {
let mut footer_size: u64 = 0;
let mut footer_version: u64 = 0;
let mut magic_number = TieredStorageMagicNumber(0);

file.seek_from_end(-(FOOTER_TAIL_SIZE as i64))?;
file.read_type(&mut footer_size)?;
if footer_size != FOOTER_SIZE as u64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to first check the footer version here before the sanization.

If the version matches the current version, then we run the following sanitization check for the current version.

Otherwise, we run the sanitization check for that specific footer version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 251e28b.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think footer_version needs to be processed before footer_size (otherwise we actually don't know the correct size of that version). So we either:

  • move the footer_size check after footer_version check, or
  • swap the physical location of footer_size and footer_version.

Either way works for me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functions to read values from the file (read_type and get_type) are designed for reading consecutive values/addresses/offsets. This makes reading a later address and then an earlier address cumbersome.

I think we should read (and sanitize) the fields in the order they are listed. Currently that is size and then version.

I agree that we probably should check the version first. I'd argue that change is orthogonal to this PR. IOW, we can (and should) change the field ordering (and thus read/sanitization ordering) in a separate PR. Wdyt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we probably should check the version first. I'd argue that change is orthogonal to this PR.

Agree. It's needed only when we have a footer version with a different footer size.

return Err(TieredStorageError::InvalidFooterSize(
footer_size,
FOOTER_SIZE as u64,
));
}

let mut footer_version: u64 = 0;
let mut magic_number = TieredStorageMagicNumber::zeroed();
file.read_type(&mut footer_version)?;
file.read_type(&mut magic_number)?;

if magic_number != TieredStorageMagicNumber::default() {
return Err(TieredStorageError::MagicNumberMismatch(
TieredStorageMagicNumber::default().0,
Expand All @@ -200,13 +232,21 @@ impl TieredStorageFooter {
let mut footer = Self::default();
file.seek_from_end(-(footer_size as i64))?;
file.read_type(&mut footer)?;
Self::sanitize(&footer)?;

Ok(footer)
}

pub fn new_from_mmap(mmap: &Mmap) -> TieredStorageResult<&TieredStorageFooter> {
let offset = mmap.len().saturating_sub(FOOTER_TAIL_SIZE);
let (footer_size, offset) = get_type::<u64>(mmap, offset)?;
let (&footer_size, offset) = get_type::<u64>(mmap, offset)?;
if footer_size != FOOTER_SIZE as u64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. The size might be footer-format-version-dependent. Probably better to have one function handle this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 251e28b.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Check the version before the size.

return Err(TieredStorageError::InvalidFooterSize(
footer_size,
FOOTER_SIZE as u64,
));
}

let (_footer_version, offset) = get_type::<u64>(mmap, offset)?;
let (magic_number, _offset) = get_type::<TieredStorageMagicNumber>(mmap, offset)?;

Expand All @@ -217,13 +257,64 @@ impl TieredStorageFooter {
));
}

let (footer, _offset) = get_type::<TieredStorageFooter>(
mmap,
mmap.len().saturating_sub(*footer_size as usize),
)?;
let footer_offset = mmap.len().saturating_sub(footer_size as usize);
let (footer, _offset) = get_type::<TieredStorageFooter>(mmap, footer_offset)?;
Self::sanitize(footer)?;

Ok(footer)
}

/// Sanitizes the footer
///
/// Since the various formats only have specific valid values, they must be sanitized
/// prior to use. This ensures the formats are valid to interpret as (rust) enums.
fn sanitize(footer: &Self) -> Result<(), SanitizeFooterError> {
let account_meta_format_u16 =
unsafe { &*(&footer.account_meta_format as *const _ as *const u16) };
let owners_block_format_u16 =
unsafe { &*(&footer.owners_block_format as *const _ as *const u16) };
let index_block_format_u16 =
unsafe { &*(&footer.index_block_format as *const _ as *const u16) };
let account_block_format_u16 =
unsafe { &*(&footer.account_block_format as *const _ as *const u16) };

_ = AccountMetaFormat::try_from(*account_meta_format_u16)
.map_err(SanitizeFooterError::InvalidAccountMetaFormat)?;
_ = OwnersBlockFormat::try_from(*owners_block_format_u16)
.map_err(SanitizeFooterError::InvalidOwnersBlockFormat)?;
_ = IndexBlockFormat::try_from(*index_block_format_u16)
.map_err(SanitizeFooterError::InvalidIndexBlockFormat)?;
_ = AccountBlockFormat::try_from(*account_block_format_u16)
.map_err(SanitizeFooterError::InvalidAccountBlockFormat)?;

// Since we just sanitized the formats within the footer,
// it is now safe to read them as (rust) enums.
//
// from https://doc.rust-lang.org/reference/items/enumerations.html#casting:
// > If an enumeration is unit-only (with no tuple and struct variants),
// > then its discriminant can be directly accessed with a numeric cast;
//
// from https://doc.rust-lang.org/reference/items/enumerations.html#pointer-casting:
// > If the enumeration specifies a primitive representation,
// > then the discriminant may be reliably accessed via unsafe pointer casting
Ok(())
}
}

/// Errors that can happen while sanitizing the footer
#[derive(Error, Debug)]
pub enum SanitizeFooterError {
#[error("invalid account meta format: {0}")]
InvalidAccountMetaFormat(#[from] TryFromPrimitiveError<AccountMetaFormat>),

#[error("invalid owners block format: {0}")]
InvalidOwnersBlockFormat(#[from] TryFromPrimitiveError<OwnersBlockFormat>),

#[error("invalid index block format: {0}")]
InvalidIndexBlockFormat(#[from] TryFromPrimitiveError<IndexBlockFormat>),

#[error("invalid account block format: {0}")]
InvalidAccountBlockFormat(#[from] TryFromPrimitiveError<AccountBlockFormat>),
}

#[cfg(test)]
Expand Down Expand Up @@ -295,4 +386,75 @@ mod tests {
assert_eq!(offset_of!(TieredStorageFooter, footer_size), 0x90);
assert_eq!(offset_of!(TieredStorageFooter, format_version), 0x98);
}

#[test]
fn test_sanitize() {
// test: all good
{
let footer = TieredStorageFooter::default();
let result = TieredStorageFooter::sanitize(&footer);
assert!(result.is_ok());
}

// test: bad account meta format
{
let mut footer = TieredStorageFooter::default();
unsafe {
std::ptr::write(
&mut footer.account_meta_format as *mut _ as *mut u16,
0xBAD0,
);
}
let result = TieredStorageFooter::sanitize(&footer);
assert!(matches!(
result,
Err(SanitizeFooterError::InvalidAccountMetaFormat(_))
));
}

// test: bad owners block format
{
let mut footer = TieredStorageFooter::default();
unsafe {
std::ptr::write(
&mut footer.owners_block_format as *mut _ as *mut u16,
0xBAD0,
);
}
let result = TieredStorageFooter::sanitize(&footer);
assert!(matches!(
result,
Err(SanitizeFooterError::InvalidOwnersBlockFormat(_))
));
}

// test: bad index block format
{
let mut footer = TieredStorageFooter::default();
unsafe {
std::ptr::write(&mut footer.index_block_format as *mut _ as *mut u16, 0xBAD0);
}
let result = TieredStorageFooter::sanitize(&footer);
assert!(matches!(
result,
Err(SanitizeFooterError::InvalidIndexBlockFormat(_))
));
}

// test: bad account block format
{
let mut footer = TieredStorageFooter::default();
unsafe {
std::ptr::write(
&mut footer.account_block_format as *mut _ as *mut u16,
0xBAD0,
);
}
let result = TieredStorageFooter::sanitize(&footer);
assert!(matches!(
result,
Err(SanitizeFooterError::InvalidAccountBlockFormat(_))
));
}
}
}
4 changes: 2 additions & 2 deletions accounts-db/src/tiered_storage/hot.rs
Original file line number Diff line number Diff line change
Expand Up @@ -203,11 +203,11 @@ impl HotStorageReader {
pub fn new_from_path(path: impl AsRef<Path>) -> TieredStorageResult<Self> {
let file = OpenOptions::new().read(true).open(path)?;
let mmap = unsafe { MmapOptions::new().map(&file)? };
// Here we are cloning the footer as accessing any data in a
// Here we are copying the footer, as accessing any data in a
// TieredStorage instance requires accessing its Footer.
// This can help improve cache locality and reduce the overhead
// of indirection associated with memory-mapped accesses.
let footer = TieredStorageFooter::new_from_mmap(&mmap)?.clone();
let footer = *TieredStorageFooter::new_from_mmap(&mmap)?;

Ok(Self { mmap, footer })
}
Expand Down