Strictly sanitize mmapped AppendVec file contents #7464
@@ -389,8 +417,8 @@ impl Serialize for AppendVec {
S: serde::ser::Serializer,
{
use serde::ser::Error;
let len = std::mem::size_of::<usize>() as u64;
These casts are odd...
if !self.sanitize_layout_and_length() {
return Err(std::io::Error::new(
std::io::ErrorKind::Other,
I know using those Errors is a bit off...
Yeah, I would prefer either a custom Result type or maybe even something like io::ErrorKind::InvalidInput:
https://doc.rust-lang.org/std/io/enum.ErrorKind.html#variant.InvalidInput
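A minimal sketch of the suggestion above. The `check_layout` helper and its parameters are hypothetical stand-ins, not code from this PR; the point is only that an `io::Error` built with `ErrorKind::InvalidInput` lets callers match on the error kind instead of comparing message strings.

```rust
use std::io;

/// Hypothetical stand-in for the PR's layout/length check (the name and
/// parameters are assumptions for illustration).
fn check_layout(current_len: u64, file_size: u64) -> io::Result<()> {
    if current_len > file_size {
        // InvalidInput signals "the data handed to us is malformed",
        // which is more specific than ErrorKind::Other.
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "incorrect layout/length",
        ));
    }
    Ok(())
}
```

A caller (or a test) could then check `err.kind() == io::ErrorKind::InvalidInput` rather than relying on the exact message text.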
// Yes, this really happens; see test_set_file_crafted_executable
let executable_bool: &bool = &self.account_meta.executable;
// UNSAFE: Force to interpret mmap-backed bool as u8 to ensure higher 7-bits are cleared correctly.
let executable_byte: &u8 = unsafe { &*(executable_bool as *const bool as *const u8) };
This unsafe is in a production code path, but the risk should be minimal: it only reads a single byte of memory, with the narrowest possible scope.
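The cast discussed above can be sketched in isolation as follows. The function names are hypothetical, and this sketch only ever passes well-formed `bool` values; in the PR the `bool` is mmap-backed and may carry crafted bytes, which is exactly why the byte is inspected.

```rust
/// Reinterpret a `&bool` as the `&u8` that backs it. `bool` and `u8` share
/// the same size and alignment, so the pointer cast itself is well-defined.
fn bool_as_byte(flag: &bool) -> &u8 {
    // UNSAFE: reads exactly one byte, scoped as narrowly as possible.
    unsafe { &*(flag as *const bool as *const u8) }
}

/// Accept only the two canonical encodings of `bool`.
fn is_sanitized_bool_byte(byte: u8) -> bool {
    byte == 0b0000_0000 || byte == 0b0000_0001
}
```

For example, `is_sanitized_bool_byte(*bool_as_byte(&true))` holds, while a raw byte such as `0b0000_0010` is rejected.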
@@ -13,11 +13,12 @@ use std::{
sync::Mutex,
};

//Data is aligned at the next 64 byte offset. Without alignment loading the memory may
I'm fairly certain "64 byte offset" is the wrong description; it should be "8 byte offset", or "64 bit offset" if you prefer bits. Padding to a 64-byte boundary would be too wasteful, and I've never heard of such an architecture. Also, the macro impl doesn't look like it actually aligns to 64 bytes, either.
Yea 64-byte in the description is wrong, but some vector instructions like vmovapd can require 64-byte alignment for avx-512 moves:
https://www.felixcloutier.com/x86/movapd
Of course compilers will probably always emit the unaligned-tolerant versions of those instructions.
avx-512 moves
Oh, the mighty 512 bits! Yeah, 64-byte alignment will be warranted in some special cases! Thanks for the tip!
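To make the 8-byte (64-bit) alignment discussed in this thread concrete, here is a small sketch. `align_up_u64` is a hypothetical helper written for illustration; it is not the macro from the PR, and the overflow check is an addition of this sketch.

```rust
use std::mem::size_of;

/// Round `offset` up to the next multiple of 8 bytes (the size of a `u64`).
/// Returns `None` if the rounding would overflow `usize`.
fn align_up_u64(offset: usize) -> Option<usize> {
    let mask = size_of::<u64>() - 1; // 0b111
    // Adding the mask and clearing the low three bits rounds up to 8.
    offset.checked_add(mask).map(|v| v & !mask)
}
```

With this rounding, offsets 1 through 8 all map to 8, and 9 maps to 16, so at most 7 padding bytes are spent per entry, versus up to 63 for a true 64-byte boundary.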
@@ -187,17 +199,39 @@ impl AppendVec {

let map = unsafe { MmapMut::map_mut(&data)? };
self.map = map;

if !self.sanitize_layout_and_length() {
This adds sanitization cost to the snapshot ingestion code path. Its impact on overall validator performance should be minimal because it's done only once, when starting a validator from a snapshot.
This PR intentionally doesn't add these checks to the actual AppendVec write code path, because of performance concerns and dubious merit.
Also, this PR doesn't add these checks to the snapshot generation code path, for the same reason.
return None;
}
let data = &self.map[offset..offset + size];
//Data is aligned at the next 64 byte offset. Without alignment loading the memory may
IMO, these comments are redundant at best, so I removed them.
av.flush().unwrap();
let result = av.set_file(path);
assert_matches!(result, Err(ref message) if message.to_string() == *"incorrect layout/length");
Better assertion could be possible...
Codecov Report
@@ Coverage Diff @@
## master #7464 +/- ##
========================================
- Coverage 80.7% 70.8% -9.9%
========================================
Files 244 245 +1
Lines 48682 55276 +6594
========================================
- Hits 39291 39170 -121
- Misses 9391 16106 +6715
runtime/src/append_vec.rs
Outdated
let executable_bool: &bool = &account.account_meta.executable;
// we can not use assert_eq!...
// *executable_bool is true but its actual memory value is crafted_executable, not 1
assert!(*executable_bool != true);
dark side of unsafe (part 1) xD
runtime/src/append_vec.rs
Outdated
assert_eq!(executable_bool, false);
// UNSAFE: Force to interpret mmap-backed bool as u8 to really read the actual memory content
let executable_byte: u8 = unsafe { std::mem::transmute::<bool, u8>(executable_bool) };
assert_eq!(executable_byte, 0); // Wow, not crafted_executable!
dark side of unsafe (part 2) xD
runtime/src/append_vec.rs
Outdated
// *executable_bool is true but its actual memory value is crafted_executable, not 1
assert!(*executable_bool != true);
// UNSAFE: Force to interpret mmap-backed bool as u8 to really read the actual memory content
let executable_byte: &u8 = unsafe { &*(executable_bool as *const bool as *const u8) };
This unsafe block/casting is repeated in the tests a few times; can we have a function like assert_eq_bool(ptr, expected_bool_value)?
Yeah, I was a bit annoyed by the repeated unsafes... Thanks for the suggestion! I've done the cleanup differently, though. How does that look to you?: 6d62daa
failures:
---- append_vec::tests::test_set_file_crafted_executable stdout ----
thread 'append_vec::tests::test_set_file_crafted_executable' panicked at 'assertion failed: `(left == right)`
left: `true`,
right: `true`', runtime/src/append_vec.rs:683:13
stack backtrace:
lgtm, @sakridge is a better reviewer for this change though so I defer approval to him 👑
lgtm
// we can observe crafted value by ref
{
    let executable_bool: &bool = &account.account_meta.executable;
    // Depending on use, *executable_bool can be truthy or falsy due to direct memory manipulation
    // assert_eq! thinks *executable_bool is equal to false but the if condition thinks it's not, contradictorily.
    assert_eq!(*executable_bool, false);
    if *executable_bool == false {
        panic!("This didn't occur if this test passed.");
    }
    assert_eq!(*account.ref_executable_byte(), crafted_executable);
}

// we can NOT observe crafted value by value
{
    let executable_bool: bool = account.account_meta.executable;
    assert_eq!(executable_bool, false);
    assert_eq!(account.get_executable_byte(), 0); // Wow, not crafted_executable!
}
here backref: anza-xyz#1485 (comment)
Problem
Currently, it's very easy to cause a DoS with a crafted AppendVec data file. That's because data_len is used directly to allocate data_len bytes (u8[]), and is used in the offset calculation without an overflow check, for example.

Also, I've carefully audited the fields in the AppendVec data file this time. Most fields, including Pubkey, Hash and lamports, can legally contain arbitrary values within their type's domain, so there isn't much to sanitize at the AppendVec layer. However, there are two exceptions: data_len and executable.

As mentioned before, data_len must be a sensible u64 for memory allocation. This is obvious and simple.

executable is a bit more subtle. It's a bool, which logically consumes 1 bit in Rust land but physically consumes 8 bits. That means the higher 7 bits are usually not touched; however, we must verify that those bits are cleared during snapshot ingestion. Otherwise it's undefined behavior, so bogus checks for executable become possible depending on some combination of runtime configuration (Rust version, compiler optimization, machine architecture, OS varieties).

After all, we should be super careful; we're among the fearless and very rare people who dare to mmap completely untrusted (= not even semi-trusted) data directly with minimal sanitization... :p We're proudly performance-obsessed. :)
Summary of Changes
unsafe {}s in both production and test code (mandatory, because we need to prepare malicious (= crafted) bytes and to guard against them).
data_len: Protected by strictly sanitizing the offset calculation. This PR doesn't explicitly impose limits on it; in combination with "Sanitize AppendVec's file_size" #7373, it effectively limits huge memory allocations, because data_len in this PR can't be greater than the AppendVec's file_size.
executable: Simply forbid any value other than 0b0000_0000 and 0b0000_0001.
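The data_len protection described above can be illustrated with a small sketch of overflow-checked offset arithmetic. `checked_data_range` and its parameters are hypothetical names chosen for this example and are not the PR's actual functions.

```rust
use std::convert::TryFrom;

/// Return the byte range `[offset, offset + data_len)` only if it fits inside
/// a file of `file_size` bytes. Every step uses checked arithmetic, so a
/// crafted `data_len` cannot wrap around or reach past the end of the file.
fn checked_data_range(offset: usize, data_len: u64, file_size: usize) -> Option<(usize, usize)> {
    // Reject lengths that do not even fit into the platform's usize.
    let len = usize::try_from(data_len).ok()?;
    // Reject offset + len overflow instead of silently wrapping.
    let end = offset.checked_add(len)?;
    if end > file_size {
        return None;
    }
    Some((offset, end))
}
```

Because the end of the range is bounded by `file_size`, no separate upper limit on data_len is needed in this sketch, which mirrors the reasoning given in the summary.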
.Part of #7167