hides implementation details of shred from its public interface #24563
Conversation
Force-pushed from 85b56ad to 38d4518
```rust
// with a bad header size
let mut bad_header_shred = shreds[0].clone();
bad_header_shred.data_header.size = (bad_header_shred.payload.len() + 1) as u16;
assert!(!should_retransmit_and_persist(
    &bad_header_shred,
    Some(bank.clone()),
    &cache,
    &me_id,
    0,
    0
));
```
moved to test_sanitize_data_shred
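A minimal sketch of how that case might read inside test_sanitize_data_shred (assumed shape; the test sits in ledger/src/shred.rs, where the header fields are visible, and shred construction is elided):

```rust
// Sketch only: a data_header.size larger than the actual payload must fail
// sanitization. Field names follow the snippet above.
let mut shred = shreds[0].clone();
assert!(shred.sanitize());
shred.data_header.size = (shred.payload.len() + 1) as u16;
assert!(!shred.sanitize());
```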
```rust
fn erasure_mismatch(shred1: &Shred, shred2: &Shred) -> bool {
    shred1.coding_header.num_coding_shreds != shred2.coding_header.num_coding_shreds
        || shred1.coding_header.num_data_shreds != shred2.coding_header.num_data_shreds
        || shred1.first_coding_index() != shred2.first_coding_index()
}
```
moved to Shred::erasure_mismatch
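As a method it plausibly keeps the same comparison while returning a Result; the Result wrapper is inferred from the erasure_mismatch(...).unwrap_or_default() call site later in this thread, and the error type is an assumption:

```rust
// Sketch: the same field comparisons as the free function above, now on Shred.
// Presumably returns Err when either shred is not a coding shred.
pub fn erasure_mismatch(&self, other: &Shred) -> Result<bool, ShredError> {
    Ok(self.coding_header.num_coding_shreds != other.coding_header.num_coding_shreds
        || self.coding_header.num_data_shreds != other.coding_header.num_data_shreds
        || self.first_coding_index() != other.first_coding_index())
}
```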
```rust
if shred.data_header.size == 0 {
    let leader_pubkey = leader_schedule
        .and_then(|leader_schedule| leader_schedule.slot_leader_at(slot, None));

    datapoint_error!(
        "blockstore_error",
        (
            "error",
            format!(
                "Leader {:?}, slot {}: received index {} is empty",
                leader_pubkey, slot, shred_index,
            ),
            String
        )
    );
    return false;
}
if shred.payload.len() > SHRED_PAYLOAD_SIZE {
```
Shred::sanitize will check for both these cases.
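For data shreds, the consolidated checks presumably look roughly like the conditions quoted later in this review (a sketch, not the exact final code; the method name and the split between data and coding arms are assumptions):

```rust
// Sketch of the data-shred arm of Shred::sanitize. A size of zero and an
// oversized payload both fall outside DATA_SHRED_SIZE_RANGE, which is what
// subsumes the two checks above.
fn sanitize_data(&self) -> bool {
    let size = self.data_header.size as usize;
    self.index() < MAX_DATA_SHREDS_PER_SLOT as u32
        && self.parent().is_ok()
        && size <= self.payload.len()
        && DATA_SHRED_SIZE_RANGE.contains(&size)
}
```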
```rust
// Corrupt shred by making it too large
let mut shred5 = shreds[5].clone();
shred5.payload.push(10);
shred5.data_header.size = shred5.payload.len() as u16;
assert!(!blockstore.should_insert_data_shred(
    &shred5,
    &slot_meta,
    &HashMap::new(),
    &last_root,
    None,
    ShredSource::Turbine
));
```
moved to test_sanitize_data_shred
I think these are still useful test cases in case somebody accidentally removes or moves the call to sanitize() in the should_insert_data_shred function.
Same as my comment here: #24563 (comment); I think if we add test-only modifiers to the shred, we could keep these test cases.
I actually think it is better if we move in the other direction; i.e.

1. Remove all these "test-only" shred methods which can invalidate a shred (set_slot, set_index, set_last_in_slot, unset_data_complete, ...):
   https://github.com/solana-labs/solana/blob/92ad76773/ledger/src/shred.rs#L577-L597
   https://github.com/solana-labs/solana/blob/92ad76773/ledger/src/shred.rs#L639-L663
2. Ensure that all methods which construct or modify a shred sanitize the shred at the end and can only return a sanitized shred (new_from_serialized_shred, new_from_data, ...); and add test coverage that each of these methods correctly returns Err if invoked with invalid arguments. new_from_serialized_shred already does sanitize before return:
   https://github.com/solana-labs/solana/blob/92ad76773/ledger/src/shred.rs#L357-L360
3. Remove the calls to shred.sanitize() from anywhere else outside of the ledger::shred module. Shred::sanitize should in fact be a private function to only Shred itself.

(1) and (2) guarantee that if you have a shred, it is already sanitized, and there is no need to worry about that anymore (see the sketch after this list). So any call to Shred::sanitize outside of shred.rs is superfluous.

Blockstore, window-service, etc. then only need to check shred properties which do not invalidate shreds by themselves; e.g. whether the shred is after the latest root, or whether it has the right shred-version.

The downside of allowing invalid shreds to fly around but then scattering shred.sanitize() all over the code-base is that:

- It is very easy to forget shred.sanitize() on some necessary call-path if invalid shreds are a possibility everywhere.
- We will end up calling shred.sanitize() (or repeating the same checks as in shred.sanitize()) on the same shred several times during its processing lifetime, which is not free; e.g.
  - once in the window service when shreds are de-serialized from the packets' payload,
  - then again in should_retransmit_and_persist:
    https://github.com/solana-labs/solana/blob/38bdb401a/core/src/window_service.rs#L193-L198
  - then again in should_insert_data_shred in blockstore:
    https://github.com/solana-labs/solana/blob/38bdb401a/ledger/src/blockstore.rs#L1446-L1479
  - ...
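A minimal sketch of the construct-then-sanitize pattern from point (2), assuming the signature of new_from_serialized_shred from the linked code; ShredError::InvalidPayload and deserialize_shred are illustrative names:

```rust
// Sketch: constructors are the only way to obtain a Shred, so sanitizing here
// guarantees every Shred in circulation is already valid.
pub fn new_from_serialized_shred(payload: Vec<u8>) -> Result<Shred, ShredError> {
    let shred = deserialize_shred(payload)?; // parse common + typed headers (elided)
    if !shred.sanitize() {
        return Err(ShredError::InvalidPayload);
    }
    Ok(shred)
}
```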
cool, that makes sense
```rust
empty_shred.data_header.size = 0;
assert!(!blockstore.should_insert_data_shred(
    &empty_shred,
    &slot_meta,
    &HashMap::new(),
    &last_root,
    None,
    ShredSource::Recovered,
));
```
tested in test_sanitize_data_shred
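That case presumably reduces to the following inside test_sanitize_data_shred (sketch; construction of empty_shred elided):

```rust
// A header claiming zero bytes must fail sanitize, since data_header.size
// always includes the headers themselves.
empty_shred.data_header.size = 0;
assert!(!empty_shred.sanitize());
```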
```rust
// Trying to insert shred with num_coding == 0 should fail
{
    let mut coding_shred = Shred::new_empty_from_header(
        shred.clone(),
        DataShredHeader::default(),
        coding.clone(),
    );
    coding_shred.coding_header.num_coding_shreds = 0;
    assert!(!Blockstore::should_insert_coding_shred(
        &coding_shred,
        &last_root
    ));
}
```
tested in test_sanitize_coding_shred
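A hedged sketch of the corresponding case in test_sanitize_coding_shred; the test lives next to Shred, so the private headers stay reachable, and the header values passed to the constructor are assumed:

```rust
// Sketch: the same scenario as above, exercised through sanitize() directly.
let mut coding_shred = Shred::new_empty_from_header(
    common_header.clone(),
    DataShredHeader::default(),
    coding_header.clone(),
);
assert!(coding_shred.sanitize());
// A coding shred claiming zero coding shreds in its erasure set is invalid.
coding_shred.coding_header.num_coding_shreds = 0;
assert!(!coding_shred.sanitize());
```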
```rust
// Trying to insert shred with pos >= num_coding should fail
{
    let mut coding_shred = Shred::new_empty_from_header(
        shred.clone(),
        DataShredHeader::default(),
        coding.clone(),
    );
    let num_coding_shreds = coding_shred.index() - coding_shred.fec_set_index();
    coding_shred.coding_header.num_coding_shreds = num_coding_shreds as u16;
    assert!(!Blockstore::should_insert_coding_shred(
        &coding_shred,
        &last_root
    ));
}
```
tested in test_sanitize_coding_shred
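Mapped onto sanitize(), the moved case presumably reuses the same setup (sketch):

```rust
// position == index - fec_set_index, so setting num_coding_shreds to exactly
// that value makes position >= num_coding, which must fail sanitization.
let num_coding_shreds = coding_shred.index() - coding_shred.fec_set_index();
coding_shred.coding_header.num_coding_shreds = num_coding_shreds as u16;
assert!(!coding_shred.sanitize());
```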
```rust
{
    let mut coding_shred = Shred::new_empty_from_header(
        shred.clone(),
        DataShredHeader::default(),
        coding.clone(),
    );
    coding_shred.common_header.fec_set_index = std::u32::MAX - 1;
    coding_shred.coding_header.num_data_shreds = 2;
    coding_shred.coding_header.num_coding_shreds = 4;
    coding_shred.coding_header.position = 1;
    coding_shred.common_header.index = std::u32::MAX - 1;
    assert!(!Blockstore::should_insert_coding_shred(
        &coding_shred,
        &last_root
    ));

    coding_shred.coding_header.num_coding_shreds = 2000;
    assert!(!Blockstore::should_insert_coding_shred(
        &coding_shred,
        &last_root
    ));

    // Decreasing the number of num_coding_shreds will put it within the allowed limit
    coding_shred.coding_header.num_coding_shreds = 2;
    assert!(Blockstore::should_insert_coding_shred(
        &coding_shred,
        &last_root
    ));

    // Insertion should succeed
    blockstore
        .insert_shreds(vec![coding_shred], None, false)
        .unwrap();
}
```
tested in test_sanitize_coding_shred
```rust
#[test]
fn test_large_num_coding() {
    solana_logger::setup();
    let slot = 1;
    let (_data_shreds, mut coding_shreds, leader_schedule_cache) =
        setup_erasure_shreds(slot, 0, 100);

    let ledger_path = get_tmp_ledger_path_auto_delete!();
    let blockstore = Blockstore::open(ledger_path.path()).unwrap();

    coding_shreds[1].coding_header.num_coding_shreds = u16::MAX;
    blockstore
        .insert_shreds(
            vec![coding_shreds[1].clone()],
            Some(&leader_schedule_cache),
            false,
        )
        .unwrap();

    // Check no coding shreds are inserted
    let res = blockstore.get_coding_shreds_for_slot(slot, 0).unwrap();
    assert!(res.is_empty());
}
```
tested in test_sanitize_coding_shred
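In test_sanitize_coding_shred the overflow case presumably shrinks to (sketch, replacing the blockstore round-trip above):

```rust
// An erasure set claiming u16::MAX coding shreds must fail sanitization.
coding_shred.coding_header.num_coding_shreds = u16::MAX;
assert!(!coding_shred.sanitize());
```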
Force-pushed from 38d4518 to cc4d9af
```diff
     return false;
 }
-if shred.payload.len() > SHRED_PAYLOAD_SIZE {
+if !shred.sanitize() {
```
do we need to check the sanity of shreds that have already been inserted?
If I am not missing something, on current master the same sanity checks are already performed on shreds before inserting them into the blockstore:

- new_from_serialized_shred sanitizes shreds obtained from the payload:
  https://github.com/solana-labs/solana/blob/38bdb401a/ledger/src/shred.rs#L365
- should_retransmit_and_persist validates index and payload length:
  https://github.com/solana-labs/solana/blob/38bdb401a/core/src/window_service.rs#L193-L198
- should_insert_data_shred checks for size == 0 and payload length:
  https://github.com/solana-labs/solana/blob/38bdb401a/ledger/src/blockstore.rs#L1446-L1479

This commit brings all those checks together in one place, in Shred::sanitize.
my mistake, this looks good
```rust
// size of nonce: 4
// size of common shred header: 83
// size of coding shred header: 6
const VALID_SHRED_DATA_LEN: usize = PACKET_DATA_SIZE - 4 - 83 - 6;
```
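(For reference: with PACKET_DATA_SIZE = 1232 bytes, the value used in the mainline codebase though not stated in this thread, this works out to 1232 - 4 - 83 - 6 = 1139 bytes.)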
would it make sense to pull out constants into a separate file named something like "shred_private_consts.rs" to indicate that it isn't intended for general use, but could be used for benchmarks/tests?
I am expecting there will be different variants of shred struct where such constants are no longer applicable or the same across variants.
that's fine, we can see what's required for the new shred variant
```rust
self.index() < MAX_DATA_SHREDS_PER_SLOT as u32
    && self.parent().is_ok()
    && size <= self.payload.len()
    && DATA_SHRED_SIZE_RANGE.contains(&size)
```
Could we add a comment explaining this? I'm a bit confused.
DATA_SHRED_SIZE_RANGE is defined as SHRED_DATA_OFFSET..=SHRED_DATA_OFFSET + SIZE_OF_DATA_SHRED_PAYLOAD, so it's the offset at which the headers end and the data payload begins.
Why must self.data_header.size be in this range? Shouldn't it be in the range 0..SIZE_OF_DATA_SHRED_PAYLOAD?
Because self.data_header.size includes the size of the common and data headers as well:
https://github.com/solana-labs/solana/blob/38bdb401a/ledger/src/shred.rs#L291-L293
I have included comments for both DATA_SHRED_SIZE_RANGE and DataShredHeader.size.
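Putting that explanation next to the definition quoted above, the range presumably reads:

```rust
use std::ops::RangeInclusive;

// data_header.size counts the common and data headers plus the payload, so a
// valid value starts at the end of the headers (SHRED_DATA_OFFSET) and tops
// out at the headers plus the maximum data payload.
const DATA_SHRED_SIZE_RANGE: RangeInclusive<usize> =
    SHRED_DATA_OFFSET..=SHRED_DATA_OFFSET + SIZE_OF_DATA_SHRED_PAYLOAD;
```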
ah, got it, thanks!
```rust
// TODO: The test previously relied on corrupting shred payload
// size which we no longer want to expose. Current test no longer
// covers packet size check in repair_response_packet_from_bytes.
shreds.remove(0);
```
Can we add a test-only function in the shred to allow us to modify certain fields, so tests like this can still function?
Please see #24563 (comment)
ledger/src/blockstore.rs (outdated)

```diff
@@ -1239,15 +1242,15 @@ impl Blockstore {
             let maybe_shred = self.get_coding_shred(slot, coding_index);
             if let Ok(Some(shred_data)) = maybe_shred {
                 let potential_shred = Shred::new_from_serialized_shred(shred_data).unwrap();
-                if Self::erasure_mismatch(&potential_shred, shred) {
+                if shred.erasure_mismatch(&potential_shred).unwrap_or_default() {
                     return Some(potential_shred.payload);
```
might just be able to hard unwrap() here; we should never enter this code path with non-coding shreds
changed to unwrap
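The resulting line presumably reads (sketch of the described change):

```rust
// Hard unwrap per the review note: only coding shreds reach this code path,
// so erasure_mismatch cannot return Err here.
if shred.erasure_mismatch(&potential_shred).unwrap() {
    return Some(potential_shred.payload);
}
```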
Working towards embedding versioning into shreds binary, so that a new variant of shred struct can include merkle tree hashes of the erasure set.
Force-pushed from cc4d9af to bc29204
Problem

Working towards embedding versioning into shreds binary, so that a new variant of shred struct can include merkle tree hashes of the erasure set.

Summary of Changes

Removed implementation details of shred from its public interface.