-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compensate for block processing time in DBFT timer #1959
Comments
Testing setupTo test proposed changes we've used the current master branch of Neo and the patch branch. For Golang node we've tested the master branch of neo-go against the similar patch branch. The following nodes setups were taken into consideration for both implementations:
In these setups, we're mostly interested in the delta time between accepted blocks (or milliseconds per block), which is measured as the difference between corresponding blocks timestamps. Specifically, we're looking at how close this value is to the one provided in nodes configuration. This configurable value is denoted as BenchmarksFor the single-node networks, both bench setups are showing the expected results. Proposed changes make delta time between blocks very close to 1000 ms for both C# and Golang nodes, which is about 10% closer to the target value for the C# node in both 30wrk and 1000rate setups: The results of the four-nodes network tests show the same tendency of reducing the delta time between blocks. It's worth to mention that these results are not so stable because we're restricted by the hardware. Still, we have several percent improvements for the four-nodes test with 5000 milliseconds per block target value: The picture is even better with the target value = 15000 milliseconds per block: It's also necessary to emphasize that proposed changes almost do not affect the average TPS values. |
Make block timing more accurate under load. BlockIndex is needed to distinguish between consensus-produced block and the one added via regular block addition mechanism. See neo-project/neo#1959 and neo-project/neo#1960.
Part of neo-project#1959. Port 899655a changes to master-2.x branch.
Introduction
One of the core Neo consensus characteristics is its target inter-block time distance (SecondsPerBlock/MillisecondsPerBlock). In general we want every block to be offset from the previous one by exactly the time span configured. But even if we're not considering view changes, usually it's not the case for a real network. There are various sources for additional delays, but we want to concentrate on one particular aspect here --- block processing.
How it works now
DBFT's
InitializeConsensus()
sets the timer using configured value adjusted byblock_received_time
.block_received_time
value is set inOnPersistCompleted
, so it happens after the block processing and not a lot of things happen betweenblock_receive_time
setting andChangeTimer()
invocation. I would even suggest that removing this adjustment won't affect the node in any noticeable way (backups are not adjusted, just to mention).Related work
#626 tried to use in-block timestamps as a reference, but as there are no guarantees over nodes clocks synchronization they're not really reliable, we can only use local clocks for any decisions.
#819 approached this problem very aggressively by moving
block_receive_time
right afterInitializeConsensus()
, but this created even more problems (any view change makes this timestamp waaaay off immediately, for example).Block processing and consensus
One thing we've noticed during our testing is that under load block processing time increases up to the level where it can exceed the target time between blocks. It's not very surprising, but what's more interesting is that current consensus logic doesn't respond to that in any way. If we're using 1 second as an interval and the block takes 3 seconds to persist, the timer is set for 1 second anyway, even though it would make more sense to start creating the next block immediately. Even under moderate load block processing time can be big enough to affect the time interval between blocks.
The proposal
We can move
block_received_time
setting to the earliest point in time where we definitely know that the new block exists, that is in theCheckCommits()
. Subsequent consensus initialization viaOnPersistCompleted()
will use that time as a reference the same way it currently does making an adjustment to configured timeout value.One additional check needs to be added though and that is the height check, because
OnPersistCompleted()
can also be called without the node ever going throughCheckCommits()
. So to avoid using wrong timestamp we also save the height of the block created withCheckCommits()
.I think this change is safe, backwards compatible and applicable to both Neo versions. It doesn't give us perfect block time, but it makes us a little closer to that by accounting for one particular source of delays. The more loaded the network, the more visible will be the effect of this patch.
It has some relation to TPS performance characteristics. For the case when block processing time doesn't exceed the target interval it's not noticeable, but it makes the node produce a little more of smaller blocks. For the case when block processing time exceeds the target interval it improves TPS a little by making the node work more instead of waiting for the timer.
Potential future improvements
Theoretically we can move
block_receive_time
assignment to PrepareRequest handler to try eliminating most of network delays associated with PrepareResponse and Commit collection, but that's more invasive and dangerous, thus requerying more thorough testing under different conditions (neo-resilience might probably help here).Neo Version
Where in the software does this update applies to?
The text was updated successfully, but these errors were encountered: