-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate peers on startup #231
Conversation
@@ -483,55 +465,6 @@ func (r *messageRelayer) createSignedMessageAppRequest(requestID uint32) (*avala | |||
return nil, errNotEnoughSignatures | |||
} | |||
|
|||
func (r *messageRelayer) getCurrentCanonicalValidatorSet() ([]*avalancheWarp.Validator, uint64, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signing subnet check moved to messageRelayer
constructor; the rest of this function moved to CanonicalValidatorClient.GetCurrentCanonicalValidatorSet
zap.Error(err), | ||
) | ||
return nil, err | ||
} | ||
|
||
// We make queries to node IDs, not unique validators as represented by a BLS pubkey, so we need this map to track |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic was moved to AppRequestNetwork.ConnectToCanonicalValidators
@@ -156,18 +175,6 @@ func (r *messageRelayer) createSignedMessage() (*avalancheWarp.Message, error) { | |||
) | |||
return nil, err | |||
} | |||
signingSubnetID := r.relayer.sourceSubnetID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was moved to the messageRelayer
constructor
// convenience fields to access the source subnet and chain IDs after initialization | ||
sourceSubnetIDs []ids.ID | ||
sourceBlockchainIDs []ids.ID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These were only used to pass to the app request network constructor. With this change, it makes more sense to pass the configuration and iterate over the SourceBlockchain
and DestinationBlockchain
slices directly, so the parsed IDs were moved there.
@@ -103,83 +102,99 @@ func NewNetwork( | |||
return nil, nil, err | |||
} | |||
|
|||
// We need to initially connect to some nodes in the network before peer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need to connect to these beacon nodes, since we explicitly connect to every node that we send AppRequest
messages to now.
@@ -417,6 +401,18 @@ func (s *SourceBlockchain) Validate(destinationBlockchainIDs *set.Set[string]) e | |||
} | |||
} | |||
|
|||
// Validate and store the subnet and blockchain IDs for future use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not for this PR, but we should consider splitting config.go
up into multiple files at some point soon. I think SourceBlockchain
and DestinationBlockchain
could each be put in their own file still in the config
package and could help improve navigation/readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed and some of the similarities between the source/destination can be abstracted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some of the similarities between the source/destination can be abstracted
Viper doesn't play nicely with this unfortunately. There may be a workaround, but I haven't looked to deeply into it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look too bad, but we can definitely defer to a separate ticket https://stackoverflow.com/questions/47185318/multiple-config-files-with-go-viper
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify, I think we should keep a single user-facing configuration file, but split it into multiple Go files for easier code readability.
I experimented with Matt's suggestion of composing SourceBlockchain
and DestinationBlockchain
using a common type, but Viper seemed to have trouble unmarshalling JSON into them. Definitely worth revisiting if we decide to go for those config improvements.
break | ||
// Manually connect to the validators of each of the source subnets. | ||
// We return an error if we are unable to connect to sufficient stake on any of the subnets. | ||
// Sufficient stake is determined by the Warp quora of the configured supported destinations, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL that "quora" is a word. 😅
// We return an error if we are unable to connect to sufficient stake on any of the subnets. | ||
// Sufficient stake is determined by the Warp quora of the configured supported destinations, | ||
// or if the subnet supports all destinations, by the quora of all configured destinations. | ||
for _, sourceBlockchain := range cfg.SourceBlockchains { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop body is pretty large. May want to consider moving parts out to helper functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would also be great to add unit tests for this since the logic is relatively complex. I'm not sure how difficult it is to mock the various clients and arNetwork
though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To add unit tests, I believe we'd have to add mocks for the info client, the P-Chain client, and the underlying Network. All doable, but it might be a large enough change to warrant its own ticket.
peers/app_request_network.go
Outdated
logger.Error( | ||
"Failed to get subnet ID", | ||
zap.Error(err), | ||
) | ||
return nil, nil, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In an effort to reduce to code complexity a little bit, I think if we don't require and additional info in the log (ex the blockchain ID), then it would be best skip logging here and to wrap the error:
return nil, nil, fmt.Errorf("failed to get subnet ID: %w", err)
This way, all the important info will get logged together at some point up the stack. As is, we'll get a bunch of disjointed errors that will be harder to correlate.
Doesn't have to be something we do in this PR, but I think it would be good practice to go through and implement at some point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed that we can have a more unified logging / error handling strategy. I also agree that we should go through and unify the repo once we decide on that strategy. In the meantime, however, I'd like to err on the side of over-verbosity and log where we can, even if some logs are redundant.
ConnectedWeight uint64 | ||
TotalValidatorWeight uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to have at least TotalValidatorWeight
not be in the struct here, but have a receiver function on ConnectedCanonicalValidators
that will iterate through and add up all the weights. This would avoid bugs down the road if anyone mutates in the Validator set.
If we did the same for ConnectedWeight
, it would be easier to get the updated weight of all our connections. If we need to use the connected weight many times and don't want to try connecting to all the nodes every time, maybe make the field MostRecentConnectedWeight
and then add a receiver method UpdateConnectedWeight
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think those ideas make sense, but I'm going to defer that for later. I think we'll likely want to add an async routine to update the validator set, rather than fetching it in every message relay, at which point we'll want to carefully consider how the global state is managed. For now though, this type is only used in the message relayer, so the risk of it being misused is fairly low.
// Helper struct to hold connected validator information | ||
// Warp Validators sharing the same BLS key may consist of multiple nodes, | ||
// so we need to track the node ID to validator index mapping | ||
type ConnectedCanonicalValidators struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should move this struct to a separate file, and ideally have mockign tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, I don't think it's worth all that special treatment since this type is POD. It's really just there to simplify calls to ConnectToCanonicalValidators
and keep associated data packaged together.
@@ -417,6 +401,18 @@ func (s *SourceBlockchain) Validate(destinationBlockchainIDs *set.Set[string]) e | |||
} | |||
} | |||
|
|||
// Validate and store the subnet and blockchain IDs for future use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look too bad, but we can definitely defer to a separate ticket https://stackoverflow.com/questions/47185318/multiple-config-files-with-go-viper
Why this should be merged
Fixes #193
How this works
Manually connects to subnet validators on startup, rather than at message relay time. Each subnet must be connected to enough stake that any of the configured destinations would be able to validate Warp messages sent by the subnet, otherwise the relayer exits. We still need to attempt to connect to peers at message relay time, since the validator set may have changed. This change is more intended to surface any issues earlier, rather than optimizing performance.
In addition to moving this check to startup, this change also refactors some of the peer connection and signature management logic into common helpers.
How this was tested
CI
How is this documented
N/A