go/worker/common: Refresh node descriptors mid-epoch #2584
Codecov Report
@@            Coverage Diff            @@
##           master    #2584     +/-   ##
=========================================
- Coverage   63.47%   63.38%   -0.1%
=========================================
  Files         357      359      +2
  Lines       33682    33841    +159
=========================================
+ Hits        21381    21449     +68
- Misses       9693     9764     +71
- Partials     2608     2628     +20
7f3f147 to c607f08
f11ae03 to 08bfcc4
08bfcc4 to d52b3da
Previously, node descriptors were refreshed only on an epoch transition, which meant that any later updates were ignored. This caused stale RAKs to stay in effect when runtime restarts happened. Enabling mid-epoch refresh also makes it easier to have more ephemeral keys.
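A minimal sketch of the idea, not the code from this PR: a descriptor cache that is rebuilt on an epoch transition and also accepts updates for committee members in the middle of an epoch. The names `nodeDescriptor`, `descriptorCache`, `onEpochTransition` and `onNodeUpdate` are hypothetical, the RAK is kept as an opaque string, and locking is omitted for brevity.

```go
package main

import "fmt"

// nodeDescriptor is a hypothetical, trimmed-down node descriptor.
type nodeDescriptor struct {
	ID  string
	RAK string // RAK, kept as an opaque string in this sketch
}

// descriptorCache holds the descriptors of the current committee members.
type descriptorCache struct {
	epoch uint64
	nodes map[string]nodeDescriptor
}

// onEpochTransition replaces the whole descriptor set for a new epoch.
func (c *descriptorCache) onEpochTransition(epoch uint64, committee []nodeDescriptor) {
	c.epoch = epoch
	c.nodes = make(map[string]nodeDescriptor, len(committee))
	for _, n := range committee {
		c.nodes[n.ID] = n
	}
}

// onNodeUpdate applies a mid-epoch descriptor update. Updates for nodes that
// are not in the current committee are ignored. It reports whether the cached
// descriptor changed.
func (c *descriptorCache) onNodeUpdate(n nodeDescriptor) bool {
	old, ok := c.nodes[n.ID]
	if !ok || old == n {
		return false
	}
	c.nodes[n.ID] = n
	return true
}

func main() {
	c := &descriptorCache{}
	c.onEpochTransition(1, []nodeDescriptor{{ID: "node-a", RAK: "rak-1"}})

	// The runtime on node-a restarts mid-epoch and registers a new RAK.
	changed := c.onNodeUpdate(nodeDescriptor{ID: "node-a", RAK: "rak-2"})
	fmt.Println(changed, c.nodes["node-a"].RAK)
}
```

Without the `onNodeUpdate` path, the cache in this sketch would keep `rak-1` until the next epoch transition, which is the stale-RAK situation described above.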
Commit 2fdfd28 introduced the possibility of a deadlock in the storage round syncing process when an unfortunate series of events occurs in a specific order. Assume the following happens, in order:

1. Round X has just completed syncing and has been applied, so lastFullyAppliedRound has been updated to X and all metadata about round X has been removed from syncingRounds and hashCache. Since the round has not yet been finalized, cachedLastRound still points to X-1.
2. A new incoming block for round X+1 appears.
3. The code checks what needs to be synced and starts with cachedLastRound, which still points to round X-1 (because round X has not yet been finalized). Because the sync metadata of round X has been removed, it assumes that round X is not yet synced, so it starts syncing it by queuing sync requests and adding new metadata to the top of the syncingRounds heap.
4. Round X is finalized, so cachedLastRound is updated to X.

Since the top of the sync management loop checks whether lastFullyAppliedRound + 1 is at the top of the heap, this leads to a deadlock: the top item contains metadata for round X while the management loop is waiting for X+1.

This commit fixes the issue by using lastFullyAppliedRound in the incoming block handler instead of cachedLastRound.
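The scenario can be reproduced in a simplified model. The sketch below is an illustration under stated assumptions, not the worker's actual code: only the names lastFullyAppliedRound, cachedLastRound and syncingRounds come from the commit message, while `syncState`, `queueFrom` and `stalled` are hypothetical helpers, and the heap is modelled as a plain set whose lowest round plays the role of the heap top.

```go
package main

import "fmt"

// syncState is a hypothetical, simplified stand-in for the storage worker's
// bookkeeping; only the field names come from the commit message.
type syncState struct {
	lastFullyAppliedRound uint64          // last round applied to storage
	cachedLastRound       uint64          // last finalized round
	syncingRounds         map[uint64]bool // rounds that currently have sync metadata
}

// queueFrom queues every round in (from, incoming] that has no sync metadata
// yet and returns the rounds it queued, in ascending order.
func (s *syncState) queueFrom(from, incoming uint64) []uint64 {
	var queued []uint64
	for r := from + 1; r <= incoming; r++ {
		if !s.syncingRounds[r] {
			s.syncingRounds[r] = true
			queued = append(queued, r)
		}
	}
	return queued
}

// stalled reports whether the lowest queued round differs from the round the
// management loop waits for, lastFullyAppliedRound + 1.
func (s *syncState) stalled() bool {
	if len(s.syncingRounds) == 0 {
		return false
	}
	lowest := ^uint64(0)
	for r := range s.syncingRounds {
		if r < lowest {
			lowest = r
		}
	}
	return lowest != s.lastFullyAppliedRound+1
}

// newState models step 1 with X = 5: round 5 is applied but not finalized.
func newState() *syncState {
	return &syncState{
		lastFullyAppliedRound: 5,
		cachedLastRound:       4,
		syncingRounds:         map[uint64]bool{},
	}
}

func main() {
	// Old behaviour: the incoming block handler starts from cachedLastRound.
	old := newState()
	fmt.Println(old.queueFrom(old.cachedLastRound, 6), old.stalled())

	// Fixed behaviour: the handler starts from lastFullyAppliedRound.
	fixed := newState()
	fmt.Println(fixed.queueFrom(fixed.lastFullyAppliedRound, 6), fixed.stalled())
}
```

In this model, starting from cachedLastRound re-queues round 5, so the lowest queued round never matches lastFullyAppliedRound + 1, while starting from lastFullyAppliedRound queues only round 6.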
Also use new generalized interfaces for managing gRPC connections to committee members (with support for mid-epoch refresh).
d52b3da to cc98774
lgtm
Fixes #1794.
Fixes #2193
Previously, node descriptors were refreshed only on an epoch transition, which
meant that any later updates were ignored. This caused stale RAKs to stay in
effect when runtime restarts happened.
Enabling mid-epoch refresh also makes it easier to have more ephemeral keys.
TODO
NodeInfo from commitment pool.