hides implementation details of shred from its public interface #24563
Conversation
Force-pushed from 85b56ad to 38d4518
```rust
// with a bad header size
let mut bad_header_shred = shreds[0].clone();
bad_header_shred.data_header.size = (bad_header_shred.payload.len() + 1) as u16;
assert!(!should_retransmit_and_persist(
    &bad_header_shred,
    Some(bank.clone()),
    &cache,
    &me_id,
    0,
    0
));
```
moved to test_sanitize_data_shred
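A minimal sketch of how that case might read inside test_sanitize_data_shred (assumed shape; the test sits in ledger/src/shred.rs, where the header fields are visible, and shred construction is elided):

```rust
// Sketch only: a data_header.size larger than the actual payload must fail
// sanitization. Field names follow the snippet above.
let mut shred = shreds[0].clone();
assert!(shred.sanitize());
shred.data_header.size = (shred.payload.len() + 1) as u16;
assert!(!shred.sanitize());
```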
```rust
fn erasure_mismatch(shred1: &Shred, shred2: &Shred) -> bool {
    shred1.coding_header.num_coding_shreds != shred2.coding_header.num_coding_shreds
        || shred1.coding_header.num_data_shreds != shred2.coding_header.num_data_shreds
        || shred1.first_coding_index() != shred2.first_coding_index()
}
```
moved to Shred::erasure_mismatch
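As a method it plausibly keeps the same comparison while returning a Result; the Result wrapper is inferred from the erasure_mismatch(...).unwrap_or_default() call site later in this thread, and the error type is an assumption:

```rust
// Sketch: the same field comparisons as the free function above, now on Shred.
// Presumably returns Err when either shred is not a coding shred.
pub fn erasure_mismatch(&self, other: &Shred) -> Result<bool, ShredError> {
    Ok(self.coding_header.num_coding_shreds != other.coding_header.num_coding_shreds
        || self.coding_header.num_data_shreds != other.coding_header.num_data_shreds
        || self.first_coding_index() != other.first_coding_index())
}
```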
```rust
if shred.data_header.size == 0 {
    let leader_pubkey = leader_schedule
        .and_then(|leader_schedule| leader_schedule.slot_leader_at(slot, None));

    datapoint_error!(
        "blockstore_error",
        (
            "error",
            format!(
                "Leader {:?}, slot {}: received index {} is empty",
                leader_pubkey, slot, shred_index,
            ),
            String
        )
    );
    return false;
}
if shred.payload.len() > SHRED_PAYLOAD_SIZE {
```
Shred::sanitize will check for both these cases.
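For data shreds, the consolidated checks presumably look roughly like the conditions quoted later in this review (a sketch, not the exact final code; the method name and the split between data and coding arms are assumptions):

```rust
// Sketch of the data-shred arm of Shred::sanitize. A size of zero and an
// oversized payload both fall outside DATA_SHRED_SIZE_RANGE, which is what
// subsumes the two checks above.
fn sanitize_data(&self) -> bool {
    let size = self.data_header.size as usize;
    self.index() < MAX_DATA_SHREDS_PER_SLOT as u32
        && self.parent().is_ok()
        && size <= self.payload.len()
        && DATA_SHRED_SIZE_RANGE.contains(&size)
}
```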
```rust
// Corrupt shred by making it too large
let mut shred5 = shreds[5].clone();
shred5.payload.push(10);
shred5.data_header.size = shred5.payload.len() as u16;
assert!(!blockstore.should_insert_data_shred(
    &shred5,
    &slot_meta,
    &HashMap::new(),
    &last_root,
    None,
    ShredSource::Turbine
));
```
moved to test_sanitize_data_shred
I think these are still useful test cases in case somebody accidentally removes or moves the call to sanitize() in the should_insert_data_shred function.
Same as my comment here: #24563 (comment); I think if we add test-only modifiers to the shred, we could keep these test cases.
I actually think it is better if we move in the other direction; i.e.

1. Remove all these "test-only" shred methods which can invalidate a shred (set_slot, set_index, set_last_in_slot, unset_data_complete, ...):
   https://github.com/solana-labs/solana/blob/92ad76773/ledger/src/shred.rs#L577-L597
   https://github.com/solana-labs/solana/blob/92ad76773/ledger/src/shred.rs#L639-L663
2. Ensure that all methods which construct or modify a shred sanitize the shred at the end and can only return a sanitized shred (new_from_serialized_shred, new_from_data, ...); and add test coverage that each of these methods correctly returns Err if invoked with invalid arguments. new_from_serialized_shred already does sanitize before return:
   https://github.com/solana-labs/solana/blob/92ad76773/ledger/src/shred.rs#L357-L360
3. Remove the calls to shred.sanitize() from anywhere else outside of the ledger::shred module. Shred::sanitize should in fact be a private function to only Shred itself.

(1) and (2) guarantee that if you have a shred, it is already sanitized, and there is no need to worry about that anymore (see the sketch after this list). So any call to Shred::sanitize outside of shred.rs is superfluous.

Blockstore, window-service, etc. then only need to check shred properties which do not invalidate shreds by themselves; e.g. whether the shred is after the latest root, or whether it has the right shred-version.

The downside of allowing invalid shreds to fly around but then scattering shred.sanitize() all over the code-base is that:

- It is very easy to forget shred.sanitize() on some necessary call-path if invalid shreds are a possibility everywhere.
- We will end up calling shred.sanitize() (or repeating the same checks as in shred.sanitize()) on the same shred several times during its processing lifetime, which is not free; e.g.
  - once in the window service when shreds are de-serialized from the packets' payload,
  - then again in should_retransmit_and_persist:
    https://github.com/solana-labs/solana/blob/38bdb401a/core/src/window_service.rs#L193-L198
  - then again in should_insert_data_shred in blockstore:
    https://github.com/solana-labs/solana/blob/38bdb401a/ledger/src/blockstore.rs#L1446-L1479
  - ...
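A minimal sketch of the construct-then-sanitize pattern from point (2), assuming the signature of new_from_serialized_shred from the linked code; ShredError::InvalidPayload and deserialize_shred are illustrative names:

```rust
// Sketch: constructors are the only way to obtain a Shred, so sanitizing here
// guarantees every Shred in circulation is already valid.
pub fn new_from_serialized_shred(payload: Vec<u8>) -> Result<Shred, ShredError> {
    let shred = deserialize_shred(payload)?; // parse common + typed headers (elided)
    if !shred.sanitize() {
        return Err(ShredError::InvalidPayload);
    }
    Ok(shred)
}
```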
cool, that makes sense
```rust
empty_shred.data_header.size = 0;
assert!(!blockstore.should_insert_data_shred(
    &empty_shred,
    &slot_meta,
    &HashMap::new(),
    &last_root,
    None,
    ShredSource::Recovered,
));
```
tested in test_sanitize_data_shred
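That case presumably reduces to the following inside test_sanitize_data_shred (sketch; construction of empty_shred elided):

```rust
// A header claiming zero bytes must fail sanitize, since data_header.size
// always includes the headers themselves.
empty_shred.data_header.size = 0;
assert!(!empty_shred.sanitize());
```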
```rust
// Trying to insert shred with num_coding == 0 should fail
{
    let mut coding_shred = Shred::new_empty_from_header(
        shred.clone(),
        DataShredHeader::default(),
        coding.clone(),
    );
    coding_shred.coding_header.num_coding_shreds = 0;
    assert!(!Blockstore::should_insert_coding_shred(
        &coding_shred,
        &last_root
    ));
}
```
tested in test_sanitize_coding_shred
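A hedged sketch of the corresponding case in test_sanitize_coding_shred; the test lives next to Shred, so the private headers stay reachable, and the header values passed to the constructor are assumed:

```rust
// Sketch: the same scenario as above, exercised through sanitize() directly.
let mut coding_shred = Shred::new_empty_from_header(
    common_header.clone(),
    DataShredHeader::default(),
    coding_header.clone(),
);
assert!(coding_shred.sanitize());
// A coding shred claiming zero coding shreds in its erasure set is invalid.
coding_shred.coding_header.num_coding_shreds = 0;
assert!(!coding_shred.sanitize());
```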
```rust
// Trying to insert shred with pos >= num_coding should fail
{
    let mut coding_shred = Shred::new_empty_from_header(
        shred.clone(),
        DataShredHeader::default(),
        coding.clone(),
    );
    let num_coding_shreds = coding_shred.index() - coding_shred.fec_set_index();
    coding_shred.coding_header.num_coding_shreds = num_coding_shreds as u16;
    assert!(!Blockstore::should_insert_coding_shred(
        &coding_shred,
        &last_root
    ));
}
```
tested in test_sanitize_coding_shred
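Mapped onto sanitize(), the moved case presumably reuses the same setup (sketch):

```rust
// position == index - fec_set_index, so setting num_coding_shreds to exactly
// that value makes position >= num_coding, which must fail sanitization.
let num_coding_shreds = coding_shred.index() - coding_shred.fec_set_index();
coding_shred.coding_header.num_coding_shreds = num_coding_shreds as u16;
assert!(!coding_shred.sanitize());
```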
```rust
{
    let mut coding_shred = Shred::new_empty_from_header(
        shred.clone(),
        DataShredHeader::default(),
        coding.clone(),
    );
    coding_shred.common_header.fec_set_index = std::u32::MAX - 1;
    coding_shred.coding_header.num_data_shreds = 2;
    coding_shred.coding_header.num_coding_shreds = 4;
    coding_shred.coding_header.position = 1;
    coding_shred.common_header.index = std::u32::MAX - 1;
    assert!(!Blockstore::should_insert_coding_shred(
        &coding_shred,
        &last_root
    ));

    coding_shred.coding_header.num_coding_shreds = 2000;
    assert!(!Blockstore::should_insert_coding_shred(
        &coding_shred,
        &last_root
    ));

    // Decreasing the number of num_coding_shreds will put it within the allowed limit
    coding_shred.coding_header.num_coding_shreds = 2;
    assert!(Blockstore::should_insert_coding_shred(
        &coding_shred,
        &last_root
    ));

    // Insertion should succeed
    blockstore
        .insert_shreds(vec![coding_shred], None, false)
        .unwrap();
}
```
tested in test_sanitize_coding_shred
```rust
#[test]
fn test_large_num_coding() {
    solana_logger::setup();
    let slot = 1;
    let (_data_shreds, mut coding_shreds, leader_schedule_cache) =
        setup_erasure_shreds(slot, 0, 100);

    let ledger_path = get_tmp_ledger_path_auto_delete!();
    let blockstore = Blockstore::open(ledger_path.path()).unwrap();

    coding_shreds[1].coding_header.num_coding_shreds = u16::MAX;
    blockstore
        .insert_shreds(
            vec![coding_shreds[1].clone()],
            Some(&leader_schedule_cache),
            false,
        )
        .unwrap();

    // Check no coding shreds are inserted
    let res = blockstore.get_coding_shreds_for_slot(slot, 0).unwrap();
    assert!(res.is_empty());
}
```
tested in test_sanitize_coding_shred
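In test_sanitize_coding_shred the overflow case presumably shrinks to (sketch, replacing the blockstore round-trip above):

```rust
// An erasure set claiming u16::MAX coding shreds must fail sanitization.
coding_shred.coding_header.num_coding_shreds = u16::MAX;
assert!(!coding_shred.sanitize());
```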
Force-pushed from 38d4518 to cc4d9af
```diff
     return false;
 }
-if shred.payload.len() > SHRED_PAYLOAD_SIZE {
+if !shred.sanitize() {
```
do we need to check the sanity of shreds that have already been inserted?
If I am not missing something, on current master the same sanity checks are already performed on shreds before inserting them into the blockstore:

- new_from_serialized_shred sanitizes shreds obtained from the payload:
  https://github.com/solana-labs/solana/blob/38bdb401a/ledger/src/shred.rs#L365
- should_retransmit_and_persist validates index and payload length:
  https://github.com/solana-labs/solana/blob/38bdb401a/core/src/window_service.rs#L193-L198
- should_insert_data_shred checks for size == 0 and payload length:
  https://github.com/solana-labs/solana/blob/38bdb401a/ledger/src/blockstore.rs#L1446-L1479

This commit brings all those checks together in one place, in Shred::sanitize.
my mistake, this looks good
```rust
// size of nonce: 4
// size of common shred header: 83
// size of coding shred header: 6
const VALID_SHRED_DATA_LEN: usize = PACKET_DATA_SIZE - 4 - 83 - 6;
```
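(For reference: with PACKET_DATA_SIZE = 1232 bytes, the value used in the mainline codebase though not stated in this thread, this works out to 1232 - 4 - 83 - 6 = 1139 bytes.)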
would it make sense to pull out constants into a separate file named something like "shred_private_consts.rs" to indicate that it isn't intended for general use, but could be used for benchmarks/tests?
I am expecting there will be different variants of shred struct where such constants are no longer applicable or the same across variants.
that's fine, we can see what's required for the new shred variant
```rust
self.index() < MAX_DATA_SHREDS_PER_SLOT as u32
    && self.parent().is_ok()
    && size <= self.payload.len()
    && DATA_SHRED_SIZE_RANGE.contains(&size)
```
Could we add a comment explaining this? I'm a bit confused.
DATA_SHRED_SIZE_RANGE is defined as SHRED_DATA_OFFSET..=SHRED_DATA_OFFSET + SIZE_OF_DATA_SHRED_PAYLOAD, so it's the offset at which the headers end and the data payload begins.
Why must self.data_header.size be in this range? Shouldn't it be in the range 0..SIZE_OF_DATA_SHRED_PAYLOAD?
Because self.data_header.size includes the size of the common and data headers as well:
https://github.com/solana-labs/solana/blob/38bdb401a/ledger/src/shred.rs#L291-L293
I have included comments for both DATA_SHRED_SIZE_RANGE and DataShredHeader.size.
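Putting that explanation next to the definition quoted above, the range presumably reads:

```rust
use std::ops::RangeInclusive;

// data_header.size counts the common and data headers plus the payload, so a
// valid value starts at the end of the headers (SHRED_DATA_OFFSET) and tops
// out at the headers plus the maximum data payload.
const DATA_SHRED_SIZE_RANGE: RangeInclusive<usize> =
    SHRED_DATA_OFFSET..=SHRED_DATA_OFFSET + SIZE_OF_DATA_SHRED_PAYLOAD;
```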
ah, got it, thanks!
```rust
// TODO: The test previously relied on corrupting shred payload
// size which we no longer want to expose. Current test no longer
// covers packet size check in repair_response_packet_from_bytes.
shreds.remove(0);
```
Can we add a test-only function in the shred to allow us to modify certain fields, so tests like this can still function?
Please see #24563 (comment)
ledger/src/blockstore.rs (outdated)

```diff
@@ -1239,15 +1242,15 @@ impl Blockstore {
             let maybe_shred = self.get_coding_shred(slot, coding_index);
             if let Ok(Some(shred_data)) = maybe_shred {
                 let potential_shred = Shred::new_from_serialized_shred(shred_data).unwrap();
-                if Self::erasure_mismatch(&potential_shred, shred) {
+                if shred.erasure_mismatch(&potential_shred).unwrap_or_default() {
                     return Some(potential_shred.payload);
```
might just be able to hard unwrap() here; we should never enter this code path with non-coding shreds
changed to unwrap
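The resulting line presumably reads (sketch of the described change):

```rust
// Hard unwrap per the review note: only coding shreds reach this code path,
// so erasure_mismatch cannot return Err here.
if shred.erasure_mismatch(&potential_shred).unwrap() {
    return Some(potential_shred.payload);
}
```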
Working towards embedding versioning into shreds binary, so that a new variant of shred struct can include merkle tree hashes of the erasure set.
Force-pushed from cc4d9af to bc29204
Problem

Working towards embedding versioning into shreds binary, so that a new variant of shred struct can include merkle tree hashes of the erasure set.

Summary of Changes

Removed implementation details of shred from its public interface.