-
Notifications
You must be signed in to change notification settings - Fork 670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unblock misconfigured subnets #2679
Conversation
@@ -171,7 +171,7 @@ func (t *Transitive) Gossip(ctx context.Context) error { | |||
// nodes with a large amount of stake weight. | |||
vdrID, ok := t.ConnectedValidators.SampleValidator() | |||
if !ok { | |||
t.Ctx.Log.Error("skipping block gossip", | |||
t.Ctx.Log.Warn("skipping block gossip", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are no longer Error
s as the node can recover.
@@ -932,7 +937,7 @@ func (t *Transitive) sendQuery( | |||
|
|||
vdrIDs, err := t.Validators.Sample(t.Ctx.SubnetID, t.Params.K) | |||
if err != nil { | |||
t.Ctx.Log.Error("dropped query for block", | |||
t.Ctx.Log.Warn("dropped query for block", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are no longer Error
s as the node can recover.
@@ -2921,3 +2921,108 @@ func TestEngineApplyAcceptedFrontierInQueryFailed(t *testing.T) { | |||
|
|||
require.Equal(choices.Accepted, blk.Status()) | |||
} | |||
|
|||
func TestEngineRepollsMisconfiguredSubnet(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(No action required) I was able to grok this test, but have some suggestions for increasing maintainability:
- separate out setup (either to helper function(s) or delineated with e.g. comments) from test
- use comments to highlight important parts of the test that might not be immediately obvious to the uninitiated e.g.
- create subnet that is misconfigured (missing validator?)
- check that a tx won't be committed
- fix misconfiguration
- check that new tx can be commited (what about previous tx?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added comments. I'm hoping to do a large refactor of both this code and the tests... When we do that hopefully all of this will be able to be cleaned up.
Why this should be merged
Resolves #2594. While this is an unsafe/undesirable configuration. This error has occurred enough that we should probably make it at least recoverable. Especially for users that have no control over the validator set (which can occur on permissioned subnets)
How this works
If there a blocks processing, we now trigger repolls during the periodic
Gossip
call. This gets the node out of the (generally unexpected) state where there are blocks processing but there are no outstanding polls.How this was tested