In MystenLabs/narwhal#738 we replaced the batch_loader with the block_waiter, so the block_waiter is now used both in its usual context (the validator API) and in consensus (in the executor).
There are reasons for that (the batch_loader it replaced was less reliable).
But the fundamental issue is that the block_waiter was built with a one-off, short-lived blocking request in mind (for interfacing with external consensus). Retries were assumed to be cheap there.
In practice, the executor uses the block_waiter to power reliable streaming of transactions from NW consensus to Sui. It retrieves those transactions from the primary's own worker(s) (and failing that, from the other workers in the network).
The unhappy path of this retrieval is different: reliability matters much more than low latency, and retries are more expensive.
Hence it should be possible to configure timeouts differently for NW + Sui than for NW + external consensus (right now those timeouts are hard-coded constants).
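As a rough illustration of what such a split could look like, here is a minimal sketch of a timeout config with one preset per use case. Everything in it is an assumption for discussion: the `BlockWaiterTimeouts` type, its field names, the preset constructors, and all the durations are hypothetical placeholders, not the existing constants or any current Narwhal API.

```rust
use std::time::Duration;

/// Hypothetical timeout settings for the block_waiter.
/// Names and values are illustrative, not the current hard-coded constants.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BlockWaiterTimeouts {
    /// How long to wait on a single batch request before giving up on it.
    pub request_timeout: Duration,
    /// Delay before the first retry.
    pub retry_base_delay: Duration,
    /// Upper bound on the delay between two retries.
    pub retry_max_delay: Duration,
    /// Maximum number of retries; `None` means keep retrying.
    pub max_retries: Option<u32>,
}

impl BlockWaiterTimeouts {
    /// Placeholder preset for NW + external consensus:
    /// short-lived requests, cheap retries, fail fast.
    pub fn external_consensus() -> Self {
        Self {
            request_timeout: Duration::from_millis(500),
            retry_base_delay: Duration::from_millis(100),
            retry_max_delay: Duration::from_secs(1),
            max_retries: Some(3),
        }
    }

    /// Placeholder preset for NW + Sui (executor streaming):
    /// reliability over latency, so wait longer and never give up.
    pub fn internal_consensus() -> Self {
        Self {
            request_timeout: Duration::from_secs(5),
            retry_base_delay: Duration::from_secs(1),
            retry_max_delay: Duration::from_secs(30),
            max_retries: None,
        }
    }

    /// Delay before retry number `attempt` (0-based): the base delay
    /// doubled on each attempt, capped at `retry_max_delay`.
    pub fn retry_delay(&self, attempt: u32) -> Duration {
        let factor = 2u32.saturating_pow(attempt);
        self.retry_base_delay
            .saturating_mul(factor)
            .min(self.retry_max_delay)
    }

    /// Whether another retry is allowed after `attempts_so_far` retries.
    pub fn should_retry(&self, attempts_so_far: u32) -> bool {
        self.max_retries.map_or(true, |max| attempts_so_far < max)
    }
}
```

With something along these lines, the validator API path could construct the block_waiter with `BlockWaiterTimeouts::external_consensus()` and the executor with `BlockWaiterTimeouts::internal_consensus()`, or both could be read from the node's parameters file instead of being baked in.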
See #5293 for a related issue and a list of current timeouts.