Robustness of eth2 clients to the eth1 node #1759
Comments
Chiming in on the Prysm side of things since I believe you may have experienced the above issues in our testnet.
In order to determine the genesis state, the beacon node must have access to all of the deposits involved in creating this state. Another idea is that we hardcode the genesis state into the application post-launch.
I am not sure if this was Prysm or not, but the eth1 connection only affects block proposals. The beacon node should be able to continue without an eth1 connection. In Prysm, we have recently implemented a timeout of 2 seconds when requesting eth1 information during block proposals (prysmaticlabs/prysm#5583). This helps mitigate cases where an eth1 node is slow to respond and a block proposal must be created in a timely fashion to maximize the validator's reward. If the timeout is exceeded, a random vote is used. Thanks
I think this was with Lighthouse but I can't find the issue anymore. There was nothing to suggest that the eth1 node going down was linked to the beacon node losing peers, apart from the fact that one happened some time after the other. I've never observed this, nor can I figure out how it might happen, so I closed the issue.
pre-genesis issue
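The timeout-then-fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration in Python, not Prysm's actual implementation (Prysm is written in Go); the names `eth1_vote_with_timeout`, `random_vote`, and `fetch_vote` are hypothetical, and treating the vote as 32 random bytes is a simplifying assumption. Only the 2-second timeout and the random fallback come from the comment.

```python
import concurrent.futures
import secrets

# 2 seconds, as mentioned in the comment above.
ETH1_TIMEOUT_SECONDS = 2.0


def random_vote() -> bytes:
    """Hypothetical fallback: 32 random bytes standing in for an eth1 vote."""
    return secrets.token_bytes(32)


def eth1_vote_with_timeout(fetch_vote, timeout=ETH1_TIMEOUT_SECONDS,
                           fallback=random_vote):
    """Return fetch_vote() if it answers within `timeout` seconds.

    `fetch_vote` is a caller-supplied callable (hypothetical) that queries
    the eth1 node. If it is too slow, or raises, the fallback vote is used
    so that the block proposal is not delayed.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(fetch_vote)
    try:
        return future.result(timeout=timeout)
    except Exception:
        # Covers both a timeout and an error raised by the eth1 request.
        return fallback()
    finally:
        # Do not block the proposer waiting for the slow request to finish.
        executor.shutdown(wait=False)
```

A proposer would call `eth1_vote_with_timeout(fetch_vote)` once per proposal. In this sketch the slow request keeps running in its background thread after the deadline, and its result is simply discarded.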
I would like to suggest this reminder to (all) eth2 clients: be robust to whatever the eth1 node/infrastructure does.
Here are some recent observations without calling out clients directly:
In one testnet, new beacon nodes couldn't sync to the testnet because the provided goerli node was overloaded with all the clients it is serving.
On a different client, given a running beacon node, validator, and eth1 node, when the eth1 goerli node started having issues (such as losing peers), after some time the beacon node was not able to continue functioning, and so the validator also stopped working. It's my understanding that the validator should continue unaffected, if an eth1 node goes down. This was not the case as the goerli node took down the beacon node. [To this client's credit, the validator node never had to be restarted: when the goerli node and beacon nodes are restarted and function again, the validator node resumes nicely.]
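The expectation in the second observation, that an eth1 outage degrades the beacon node rather than stopping it, could look something like the sketch below. It is purely illustrative and not taken from any client: `Eth1Monitor`, `poll_eth1`, and `run_slot` are hypothetical names, and the "duties" are placeholders.

```python
class Eth1Monitor:
    """Tracks eth1 availability without letting failures propagate."""

    def __init__(self, poll_eth1):
        # poll_eth1: hypothetical callable returning the latest eth1 block
        # number, raising an exception if the eth1 node is unreachable.
        self._poll_eth1 = poll_eth1
        self.eth1_available = False
        self.last_known_block = None

    def refresh(self):
        """Poll eth1 once; on failure keep the cached block and mark degraded."""
        try:
            self.last_known_block = self._poll_eth1()
            self.eth1_available = True
        except Exception:
            self.eth1_available = False
        return self.eth1_available


def run_slot(monitor, slot):
    """Perform one slot's work; eth1 failures never abort the slot."""
    monitor.refresh()
    return {
        "slot": slot,
        # Duties that do not depend on eth1 always run.
        "duties": ["attest"],
        "eth1_source": "live" if monitor.eth1_available else "cached",
        "eth1_block": monitor.last_known_block,
    }
```

The point of the design is that the eth1 error is caught at the boundary and turned into state (`eth1_available`), so the rest of the loop never sees an exception and recovers on its own once the eth1 node is back.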
(I recall there was a pre-launch checklist of some sort [by @djrtwo] but I haven't been able to find it again. I suggest this testing be explicitly added.)