-
Notifications
You must be signed in to change notification settings - Fork 4.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Strictly sanitize mmapped AppendVec file contents #7464
Changes from 5 commits
8af4cf8
bc956ca
a054cab
2b301a1
e234d35
b1700ed
e0aca2a
6d62daa
688742b
80877f6
dadb2e0
5785872
581ae43
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,11 +13,12 @@ use std::{ | |
sync::Mutex, | ||
}; | ||
|
||
//Data is aligned at the next 64 byte offset. Without alignment loading the memory may | ||
//Data placement should be aligned at the next boundary. Without alignment accessing the memory may | ||
//crash on some architectures. | ||
macro_rules! align_up { | ||
($addr: expr, $align: expr) => { | ||
($addr + ($align - 1)) & !($align - 1) | ||
const ALIGN_BOUNDARY_OFFSET: usize = mem::size_of::<u64>(); | ||
macro_rules! align_to_8byte { | ||
ryoqun marked this conversation as resolved.
Show resolved
Hide resolved
|
||
($addr: expr) => { | ||
($addr + (ALIGN_BOUNDARY_OFFSET - 1)) & !(ALIGN_BOUNDARY_OFFSET - 1) | ||
}; | ||
} | ||
|
||
|
@@ -70,6 +71,17 @@ impl<'a> StoredAccount<'a> { | |
hash: *self.hash, | ||
} | ||
} | ||
|
||
fn sanitize(&self) -> bool { | ||
// Sanitize executable | ||
// Use extra references to avoid value silently cramped to 1 (=true) and 0 (=false) | ||
ryoqun marked this conversation as resolved.
Show resolved
Hide resolved
|
||
// Yes, this really happens; see test_set_file_crafted_executable | ||
let executable_bool: &bool = &self.account_meta.executable; | ||
// UNSAFE: Force to interpret mmap-backed bool as u8 to ensure higher 7-bits are cleared correctly. | ||
let executable_byte: &u8 = unsafe { &*(executable_bool as *const bool as *const u8) }; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This unsafe is in production code path. But risk should have been minimized; it only reads a byte of memory with narrowest scoping. |
||
|
||
executable_byte & !1 == 0 | ||
} | ||
} | ||
|
||
#[derive(Debug)] | ||
|
@@ -187,17 +199,39 @@ impl AppendVec { | |
|
||
let map = unsafe { MmapMut::map_mut(&data)? }; | ||
self.map = map; | ||
|
||
if !self.sanitize_layout_and_length() { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This adds additional sanitization costs for the snapshot ingestion codepath. Its impact on the overall validator performance should be minimal because it's only done only once when starting a validator from snapshot. This PR intentionally didn't added these checks for the actual AppendVec write codepath for the performance concerns and its dubious merits. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also this PR didn't add these check for snapshot generation code path as well with the same reason. |
||
return Err(std::io::Error::new( | ||
std::io::ErrorKind::Other, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I know using those Errors is a bit off... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yea.. I would prefer using either a custom Result type or maybe even something like |
||
"incorrect layout/length", | ||
)); | ||
} | ||
|
||
Ok(()) | ||
} | ||
|
||
fn sanitize_layout_and_length(&self) -> bool { | ||
let mut offset = 0; | ||
|
||
while let Some((account, next_offset)) = self.get_account(offset) { | ||
ryoqun marked this conversation as resolved.
Show resolved
Hide resolved
|
||
if !account.sanitize() { | ||
return false; | ||
} | ||
offset = next_offset; | ||
} | ||
let aligned_current_len = align_to_8byte!(self.current_len.load(Ordering::Relaxed)); | ||
|
||
offset == aligned_current_len | ||
} | ||
|
||
fn get_slice(&self, offset: usize, size: usize) -> Option<(&[u8], usize)> { | ||
if offset + size > self.len() { | ||
let (next, overflow) = offset.overflowing_add(size); | ||
if overflow || next > self.len() { | ||
return None; | ||
} | ||
let data = &self.map[offset..offset + size]; | ||
//Data is aligned at the next 64 byte offset. Without alignment loading the memory may | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IMO, these comments are redundant at best; so removed them. |
||
//crash on some architectures. | ||
let next = align_up!(offset + size, mem::size_of::<u64>()); | ||
let data = &self.map[offset..next]; | ||
let next = align_to_8byte!(next); | ||
|
||
Some(( | ||
//UNSAFE: This unsafe creates a slice that represents a chunk of self.map memory | ||
//The lifetime of this slice is tied to &self, since it points to self.map memory | ||
|
@@ -207,9 +241,7 @@ impl AppendVec { | |
} | ||
|
||
fn append_ptr(&self, offset: &mut usize, src: *const u8, len: usize) { | ||
//Data is aligned at the next 64 byte offset. Without alignment loading the memory may | ||
//crash on some architectures. | ||
let pos = align_up!(*offset as usize, mem::size_of::<u64>()); | ||
let pos = align_to_8byte!(*offset); | ||
let data = &self.map[pos..(pos + len)]; | ||
//UNSAFE: This mut append is safe because only 1 thread can append at a time | ||
//Mutex<append_offset> guarantees exclusive write access to the memory occupied in | ||
|
@@ -224,19 +256,15 @@ impl AppendVec { | |
fn append_ptrs_locked(&self, offset: &mut usize, vals: &[(*const u8, usize)]) -> Option<usize> { | ||
let mut end = *offset; | ||
for val in vals { | ||
//Data is aligned at the next 64 byte offset. Without alignment loading the memory may | ||
//crash on some architectures. | ||
end = align_up!(end, mem::size_of::<u64>()); | ||
end = align_to_8byte!(end); | ||
end += val.1; | ||
} | ||
|
||
if (self.file_size as usize) < end { | ||
return None; | ||
} | ||
|
||
//Data is aligned at the next 64 byte offset. Without alignment loading the memory may | ||
//crash on some architectures. | ||
let pos = align_up!(*offset, mem::size_of::<u64>()); | ||
let pos = align_to_8byte!(*offset); | ||
for val in vals { | ||
self.append_ptr(offset, val.0, val.1) | ||
} | ||
|
@@ -389,8 +417,8 @@ impl Serialize for AppendVec { | |
S: serde::ser::Serializer, | ||
{ | ||
use serde::ser::Error; | ||
let len = std::mem::size_of::<usize>() as u64; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These casts are odd... |
||
let mut buf = vec![0u8; len as usize]; | ||
let len = std::mem::size_of::<usize>(); | ||
let mut buf = vec![0u8; len]; | ||
let mut wr = Cursor::new(&mut buf[..]); | ||
serialize_into(&mut wr, &(self.current_len.load(Ordering::Relaxed) as u64)) | ||
.map_err(Error::custom)?; | ||
|
@@ -442,6 +470,7 @@ impl<'de> Deserialize<'de> for AppendVec { | |
pub mod tests { | ||
use super::test_utils::*; | ||
use super::*; | ||
use assert_matches::assert_matches; | ||
use log::*; | ||
use rand::{thread_rng, Rng}; | ||
use solana_sdk::timing::duration_as_ms; | ||
|
@@ -523,4 +552,145 @@ pub mod tests { | |
AppendVec::get_relative_path(full_path).unwrap() | ||
); | ||
} | ||
|
||
#[test] | ||
fn test_set_file_empty_data() { | ||
let file = get_append_vec_path("test_set_file_empty_data"); | ||
let path = &file.path; | ||
let mut av = AppendVec::new(&path, true, 1024 * 1024); | ||
|
||
assert_eq!(av.accounts(0).len(), 0); | ||
|
||
let result = av.set_file(path); | ||
assert_matches!(result, Ok(_)); | ||
} | ||
|
||
#[test] | ||
fn test_set_file_crafted_data_len() { | ||
let file = get_append_vec_path("test_set_file_crafted_data_len"); | ||
let path = &file.path; | ||
let mut av = AppendVec::new(&path, true, 1024 * 1024); | ||
|
||
let crafted_data_len = 1; | ||
|
||
av.append_account_test(&create_test_account(10)).unwrap(); | ||
|
||
let accounts = av.accounts(0); | ||
let account = accounts.first().unwrap(); | ||
{ | ||
let data_len: &u64 = &account.meta.data_len; | ||
#[allow(mutable_transmutes)] | ||
// UNSAFE: cast away & (= const ref) to &mut to force to mutate append-only (=read-only) AppendVec | ||
let data_len: &mut u64 = unsafe { &mut *(data_len as *const u64 as *mut u64) }; | ||
*data_len = crafted_data_len; | ||
} | ||
assert_eq!(account.meta.data_len, crafted_data_len); | ||
|
||
// Reload accoutns and observe crafted_data_len | ||
let accounts = av.accounts(0); | ||
let account = accounts.first().unwrap(); | ||
assert_eq!(account.meta.data_len, crafted_data_len); | ||
|
||
av.flush().unwrap(); | ||
let result = av.set_file(path); | ||
assert_matches!(result, Err(ref message) if message.to_string() == *"incorrect layout/length"); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Better assertion could be possible... |
||
} | ||
|
||
#[test] | ||
fn test_set_file_too_large_data_len() { | ||
let file = get_append_vec_path("test_set_file_too_large_data_len"); | ||
let path = &file.path; | ||
let mut av = AppendVec::new(&path, true, 1024 * 1024); | ||
|
||
let too_large_data_len = u64::max_value(); | ||
av.append_account_test(&create_test_account(10)).unwrap(); | ||
|
||
let accounts = av.accounts(0); | ||
let account = accounts.first().unwrap(); | ||
|
||
{ | ||
let data_len: &u64 = &account.meta.data_len; | ||
#[allow(mutable_transmutes)] | ||
// UNSAFE: cast away &(= const ref) to &mut to force to mutate append-only (=read-only) AppendVec | ||
let data_len: &mut u64 = unsafe { &mut *(data_len as *const u64 as *mut u64) }; | ||
*data_len = too_large_data_len; | ||
} | ||
|
||
assert_eq!(account.meta.data_len, too_large_data_len); | ||
let accounts = av.accounts(0); | ||
assert_matches!(accounts.first(), None); | ||
|
||
av.flush().unwrap(); | ||
let result = av.set_file(path); | ||
assert_matches!(result, Err(ref message) if message.to_string() == *"incorrect layout/length"); | ||
} | ||
|
||
#[test] | ||
fn test_set_file_crafted_executable() { | ||
let file = get_append_vec_path("test_set_file_crafted_executable"); | ||
let path = &file.path; | ||
let mut av = AppendVec::new(&path, true, 1024 * 1024); | ||
av.append_account_test(&create_test_account(10)).unwrap(); | ||
{ | ||
let mut executable_account = create_test_account(10); | ||
executable_account.1.executable = true; | ||
av.append_account_test(&executable_account).unwrap(); | ||
} | ||
|
||
// reload accounts | ||
let accounts = av.accounts(0); | ||
let account = &accounts[0]; | ||
|
||
// ensure false is 0u8 and true is 1u8 actually | ||
{ | ||
let executable_bool: &bool = &account.account_meta.executable; | ||
// UNSAFE: Force to interpret mmap-backed bool as u8 to really read the actual memory content | ||
let executable_byte: &u8 = unsafe { &*(executable_bool as *const bool as *const u8) }; | ||
assert_eq!(*executable_byte, 0); | ||
|
||
let executable_bool: &bool = &accounts[1].account_meta.executable; | ||
// UNSAFE: Force to interpret mmap-backed bool as u8 to really read the actual memory content | ||
let executable_byte: &u8 = unsafe { &*(executable_bool as *const bool as *const u8) }; | ||
assert_eq!(*executable_byte, 1); | ||
} | ||
|
||
let crafted_executable = u8::max_value() - 1; | ||
|
||
{ | ||
let executable_ref: &bool = &account.account_meta.executable; | ||
#[allow(mutable_transmutes)] | ||
// UNSAFE: Force to interpret mmap-backed &bool as &u8 to write some crafted value; | ||
let executable_byte: &mut u8 = | ||
unsafe { &mut *(executable_ref as *const bool as *mut u8) }; | ||
*executable_byte = crafted_executable; | ||
} | ||
|
||
// reload crafted accounts | ||
let accounts = av.accounts(0); | ||
let account = accounts.first().unwrap(); | ||
|
||
// we can observe crafted value by ref | ||
{ | ||
let executable_bool: &bool = &account.account_meta.executable; | ||
// we can not use assert_eq!... | ||
// *executable_bool is true but its actual memory value is crafted_executable, not 1 | ||
assert!(*executable_bool != true); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. dark side of unsafe (part 1) xD |
||
// UNSAFE: Force to interpret mmap-backed bool as u8 to really read the actual memory content | ||
let executable_byte: &u8 = unsafe { &*(executable_bool as *const bool as *const u8) }; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this unsafe block/casting is repeated in the tests a few times, can we have a function that is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, I was a bit annoyed the repeated |
||
assert_eq!(*executable_byte, crafted_executable); | ||
} | ||
|
||
// we can NOT observe crafted value by value | ||
{ | ||
let executable_bool: bool = account.account_meta.executable; | ||
assert_eq!(executable_bool, false); | ||
// UNSAFE: Force to interpret mmap-backed bool as u8 to really read the actual memory content | ||
let executable_byte: u8 = unsafe { std::mem::transmute::<bool, u8>(executable_bool) }; | ||
assert_eq!(executable_byte, 0); // Wow, not crafted_executable! | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. dark side of unsafe (part 2) xD |
||
} | ||
|
||
av.flush().unwrap(); | ||
let result = av.set_file(path); | ||
assert_matches!(result, Err(ref message) if message.to_string() == *"incorrect layout/length"); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fairly certain
64 byte offset
is wrong description; it should be8 byte offset
or64 bit offset
if you preferbits
. Padding at 64 byte boundary would be too wasteful. I've never heard of such architecture. Also, the macro impl doesn't look like actualy aligning with 64 byte, too.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea 64-byte in the description is wrong, but some vector instructions like vmovapd can require 64-byte alignment for avx-512 moves:
https://www.felixcloutier.com/x86/movapd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course compilers will probably always emit the unaligned-tolerant versions of those instructions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, the mighty 512 bits! Yeah, 64-byte alignment will be warranted in some special cases! Thanks for the tip!