[fix] block_waiter serve blocks with batches of similar digest ids #746

akichidis · 2022-08-11T18:04:11Z

When the block_waiter was created, the assumption was that each produced batch would resolve to a unique digest id. Given the current design, this is not guaranteed: if someone posts the same transactions, we can end up producing batches with the same batch id. This became more apparent from the tests in reconfigure.rs while working on the task to swap the batch_loader for the block_waiter (#738). Those tests produce transactions with the exact same payload, which results in batches with the same batch ids. Consequently, the block_waiter failed to respond to concurrent requests for different certificates with common batch ids, leading to error messages like this:

Couldn't find pending batch with id ....

This error is produced when trying to send a batch reply to the aggregator that waits for a certificate's batches to be received. Since tx_pending_batch held only the latest request's oneshot sender, we could not notify multiple aggregators of the receipt of the batch.

This PR fixes that. It also adds a small optimisation to make sure that only one network request is made to fetch a batch from a worker.
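To illustrate the idea described above, here is a minimal sketch, not the code from this PR. It assumes hypothetical names (PendingBatches, register, resolve), integer ids instead of digests, and std::sync::mpsc channels as a stand-in for the oneshot senders. Each batch id maps to one sender per waiting block, and only the first waiter for a batch is told to issue the network fetch.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

// Hypothetical stand-ins for the real digest and payload types.
type BatchId = u64;
type BlockId = u64;
type BatchResult = Result<Vec<u8>, String>;

/// Assumed structure: batch id -> (block id -> sender waiting for that batch).
#[derive(Default)]
struct PendingBatches {
    waiters: HashMap<BatchId, HashMap<BlockId, Sender<BatchResult>>>,
}

impl PendingBatches {
    /// Registers a waiter for `batch_id` on behalf of `block_id`.
    /// Returns the receiver plus a flag that is true only for the first
    /// waiter, i.e. the only caller that should fetch from the worker.
    /// Registering the same block id twice replaces the earlier sender.
    fn register(&mut self, batch_id: BatchId, block_id: BlockId) -> (Receiver<BatchResult>, bool) {
        let (tx, rx) = channel();
        let waiters = self.waiters.entry(batch_id).or_default();
        let must_fetch = waiters.is_empty();
        waiters.insert(block_id, tx);
        (rx, must_fetch)
    }

    /// Sends the result to every waiter of `batch_id` and returns how many
    /// of them were notified successfully.
    fn resolve(&mut self, batch_id: BatchId, result: BatchResult) -> usize {
        let senders = match self.waiters.remove(&batch_id) {
            Some(senders) => senders,
            None => {
                eprintln!("Couldn't find pending batch with id {batch_id}");
                return 0;
            }
        };
        let mut notified = 0;
        for (block_id, s) in senders {
            if let Err(err) = s.send(result.clone()) {
                eprintln!("Couldn't send batch result {batch_id} to channel [{err:?}] for block_id {block_id}");
            } else {
                notified += 1;
            }
        }
        notified
    }
}
```

With this shape, two certificates that share a batch id each get their own receiver, and a single reply from the worker reaches both.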

Merging this PR is essential to unblock #738, as the reconfigure.rs tests will otherwise keep failing.

huitseeker

Great fix, thanks a lot for the explicit comments, context, and tests, too!

huitseeker · 2022-08-13T00:46:57Z

primary/src/block_waiter.rs

+                    if let Err(err) = s.send(result.clone()) {
+                        error!("Couldn't send batch result {} message to channel [{:?}] for block_id {}", batch_id, err, id);


Nit: here and in a few of the lines above, you're coupling a lot of if let Err(err) = xxx with logging.

Nothing wrong with that, but some may find it tedious to read. You might want to look at tap_err and possibly use captured identifiers ({id}) to make this a bit lighter. YMMV.

akichidis · 2022-08-15

In this specific case I am consuming the error only to print it. Using tap_err here still makes me (because of clippy) consume the result anyway with a let _ = ... . tap_err would probably be a much better fit if I wanted to do something further with the result itself. That said, I like the proposal of introducing tap, as I can definitely see cases where we can use it. Thanks for the recommendation! I've applied the recommendation.
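For readers unfamiliar with the pattern under discussion, here is a minimal sketch of tap_err combined with captured identifiers. It is not the code merged in this PR. To stay dependency-free it defines a local, hypothetical TapErrExt trait that mimics the behaviour of tap_err from the tap crate, and the notify function and its parameters are assumptions made for illustration.

```rust
use std::sync::mpsc::Sender;

/// Local stand-in (an assumption) for the `tap_err` combinator:
/// run a closure on the error by reference, then hand the result back unchanged.
trait TapErrExt<T, E> {
    fn tap_err(self, f: impl FnOnce(&E)) -> Self;
}

impl<T, E> TapErrExt<T, E> for Result<T, E> {
    fn tap_err(self, f: impl FnOnce(&E)) -> Self {
        if let Err(err) = &self {
            f(err);
        }
        self
    }
}

/// Hypothetical helper: sends `result` and logs on failure.
/// Returns true when the send succeeded.
fn notify(s: &Sender<String>, result: &str, batch_id: u64, id: u64) -> bool {
    s.send(result.to_owned())
        // Captured identifiers ({batch_id}, {err:?}, {id}) replace positional arguments.
        .tap_err(|err| {
            eprintln!("Couldn't send batch result {batch_id} message to channel [{err:?}] for block_id {id}")
        })
        .is_ok()
}
```

Because tap_err returns the Result unchanged, something still has to consume it. The sketch does so with is_ok(); when nothing further is done with the result, that requirement is what leads to the let _ = ... mentioned in the reply.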


akichidis requested review from huitseeker, asonnino and arun-koshy August 11, 2022 18:04

akichidis mentioned this pull request Aug 11, 2022

[refactor] use block_waiter instead of batch_loader #738

Merged

asonnino approved these changes Aug 11, 2022


huitseeker approved these changes Aug 13, 2022


akichidis added 6 commits August 15, 2022 14:46

add map as value in pending batches map

2c4b1e8

sender batch results in multiple channels

b35f893

extend test to include case with header with common block ids

da2efe7

fix format clippy

42f5c65

fix comment

0b9fea4

address review comments

de26792

akichidis force-pushed the fix-block-waiter-batches branch from 43120bb to de26792 August 15, 2022 13:51

update workspace

9dfb7a3

akichidis merged commit edaab03 into main Aug 15, 2022

akichidis deleted the fix-block-waiter-batches branch August 15, 2022 16:04

huitseeker pushed a commit to huitseeker/narwhal that referenced this pull request Aug 16, 2022

[fix] block_waiter serve blocks with batches of similar digest ids (MystenLabs#746)

3a2da22

huitseeker pushed a commit that referenced this pull request Aug 16, 2022

[fix] block_waiter serve blocks with batches of similar digest ids (#746)

930cd51

mwtian pushed a commit to mwtian/sui that referenced this pull request Sep 30, 2022

[fix] block_waiter serve blocks with batches of similar digest ids (MystenLabs/narwhal#746)

70e923b
