v1.3.0: Bad Request: Block not in forkChoice - Error producing SyncCommitteeContribution #5063

Closed
philknows opened this issue Jan 26, 2023 · 6 comments

@philknows
Member

philknows commented Jan 26, 2023

Describe the bug

During sync committee duties, blocks are sometimes not in forkChoice in time to produce SyncCommitteeContributions. There doesn't seem to be any correlation with time, and it does not appear to get better over time. It happens very sporadically.

Jan-26 18:31:43.150[]                error: Error on SyncCommitteeContribution slot=5661156, index=1 Bad Request: Block not in forkChoice - Error producing SyncCommitteeContribution
Error: Bad Request: Block not in forkChoice - Error producing SyncCommitteeContribution
    at HttpClient.requestWithBody (file:///usr/app/packages/api/src/utils/client/httpClient.ts:258:15)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)

Expected behavior

We should hit SyncCommitteeContribution every time. Blocks not in forkChoice should not be happening.

@philknows philknows added prio-high Resolve issues as soon as possible. scope-profitability Issues to directly improve validator performance and its profitability. labels Jan 26, 2023
@twoeths
Contributor

twoeths commented Feb 15, 2023

Could be the same root cause as #5098 (comment)

When there are multiple beacon node URLs, the SyncCommittee block root is produced and published to one node while the SyncCommitteeContribution is produced from another node, and that other node hasn't seen the block yet.
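
A minimal sketch of that multi-node scenario, assuming a hypothetical in-memory `MockBeaconNode` (this is not Lodestar's API, and the block root is a placeholder). It only shows how a contribution request can fail on a node that has not yet imported the block that another node already knows about:

```typescript
// Hypothetical stand-in for a beacon node; NOT Lodestar's real API.
class MockBeaconNode {
  name: string;
  private knownBlockRoots = new Set<string>();

  constructor(name: string) {
    this.name = name;
  }

  // Simulates the node importing a block into its fork choice.
  importBlock(root: string): void {
    this.knownBlockRoots.add(root);
  }

  // Rejects with the error text from the log above when the root is unknown.
  produceSyncCommitteeContribution(slot: number, index: number, beaconBlockRoot: string): string {
    if (!this.knownBlockRoots.has(beaconBlockRoot)) {
      throw new Error("Bad Request: Block not in forkChoice - Error producing SyncCommitteeContribution");
    }
    return `contribution(node=${this.name}, slot=${slot}, index=${index})`;
  }
}

const nodeA = new MockBeaconNode("A");
const nodeB = new MockBeaconNode("B");
const blockRoot = "0xabc"; // placeholder root, for illustration only

// Only node A has seen the block; node B is lagging behind.
nodeA.importBlock(blockRoot);

function tryContribution(node: MockBeaconNode): string {
  try {
    return node.produceSyncCommitteeContribution(5661156, 1, blockRoot);
  } catch (e) {
    return e instanceof Error ? e.message : String(e);
  }
}

const resultA = tryContribution(nodeA);
const resultB = tryContribution(nodeB);
console.log(resultA);
console.log(resultB);
```

Here the request against node A succeeds, while the same request against node B fails with "Block not in forkChoice" because B never imported the root.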

@philknows
Member Author

For greater context, restarting the VC fixed this problem, but I'm unsure what actually caused it. Could it have been using or receiving old duties? I don't have the exact steps to reproduce this, but there was a manual restart of the BN/EL server, and the VC kept trying to connect to it during that time. More on this in the private Discord thread here: https://discord.com/channels/593655374469660673/1068138020127125505/1068245669468454912

@twoeths
Contributor

twoeths commented Feb 16, 2023

To be clear, this is just a log on the validator for SyncCommitteeContribution and is not related to the missed SyncCommitteeSignature. This should rarely happen (unlike attestation) because the beacon block should most likely have arrived by 8s into the slot. It also happens randomly per slot, and restarting the validator does not help.
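
As a rough check of the 8s figure against the log in the issue description, here is a small sketch. It assumes mainnet parameters (genesis time 1606824023, 12-second slots), that contributions are due at 2/3 of the slot, and that the log timestamp is in UTC; none of these are stated in the thread, and the function names are illustrative only:

```typescript
// Assumed mainnet parameters; the thread does not state the network.
const GENESIS_TIME_SEC = 1606824023;
const SECONDS_PER_SLOT = 12;

// Wall-clock start of a slot, in milliseconds since the Unix epoch.
function slotStartMs(slot: number): number {
  return (GENESIS_TIME_SEC + slot * SECONDS_PER_SLOT) * 1000;
}

// Assumption: contributions are produced at 2/3 of the slot (8s of 12s).
function contributionDueMs(slot: number): number {
  return slotStartMs(slot) + (SECONDS_PER_SLOT * 1000 * 2) / 3;
}

const slot = 5661156;
// "Jan-26 18:31:43.150" from the log, ASSUMED to be UTC and in 2023.
const logTimeMs = Date.UTC(2023, 0, 26, 18, 31, 43, 150);

const dueOffsetMs = contributionDueMs(slot) - slotStartMs(slot);
const logOffsetMs = logTimeMs - slotStartMs(slot);

console.log(`contribution due ${dueOffsetMs} ms into the slot`);
console.log(`error logged ${logOffsetMs} ms into the slot`);
```

Under those assumptions the error was logged 8150 ms into slot 5661156, shortly after the 8000 ms mark, so the block was still missing from forkChoice at the point where the contribution was requested.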

@twoeths twoeths self-assigned this Feb 16, 2023
@twoeths twoeths added this to the v1.5.0 milestone Feb 16, 2023
@philknows
Member Author

The log did show up, and the contributions were confirmed as missed via beaconcha.in. So during this period of time, we missed all sync committee contributions until we restarted the VC.

@twoeths
Contributor

twoeths commented Feb 20, 2023

Just looked at the incident on Jan 25/Jan 26:

  • It was caused by poor performance of the beacon nodes.
  • The rescue/fallback node was also configured with subscribe_all_subnets=false at that time, so it lacked some subnet peers (we only set it to true after the incident). If submitPoolSyncCommitteeSignatures failed on the main bn, it could fail on the fallback/rescue node too because of the missing subnet peers. The metric below is from the rescue/fallback node.

[Screenshot: Screen Shot 2023-02-20 at 10 31 33]
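
The fallback failure described in the second bullet can be sketched as follows. This is a simplified model with hypothetical names (`MockNode`, `submitSignature`, `submitWithFallback`), not the validator client's real logic; it assumes that a submission fails outright when a node is unhealthy or has no peers on the relevant subnet:

```typescript
// Hypothetical model of a beacon node's state; not a real Lodestar type.
interface MockNode {
  name: string;
  healthy: boolean;
  hasSubnetPeers: boolean;
}

// Assumption: submission throws if the node is struggling or lacks subnet peers.
function submitSignature(node: MockNode): void {
  if (!node.healthy) throw new Error(`${node.name}: node is overloaded`);
  if (!node.hasSubnetPeers) throw new Error(`${node.name}: no peers on the sync committee subnet`);
}

// Tries each node in order and reports which one accepted the submission.
function submitWithFallback(nodes: MockNode[]): string {
  const errors: string[] = [];
  for (const node of nodes) {
    try {
      submitSignature(node);
      return `published via ${node.name}`;
    } catch (e) {
      errors.push(e instanceof Error ? e.message : String(e));
    }
  }
  return `all nodes failed: ${errors.join("; ")}`;
}

const mainNode: MockNode = {name: "main", healthy: false, hasSubnetPeers: true};
// Rescue node modelled as lacking subnet peers (subscribe_all_subnets=false).
const rescueBefore: MockNode = {name: "rescue", healthy: true, hasSubnetPeers: false};
// Rescue node modelled as having subnet peers (subscribe_all_subnets=true).
const rescueAfter: MockNode = {name: "rescue", healthy: true, hasSubnetPeers: true};

const before = submitWithFallback([mainNode, rescueBefore]);
const after = submitWithFallback([mainNode, rescueAfter]);
console.log(before);
console.log(after);
```

In this model the fallback only helps once the rescue node has subnet peers; before that, a failure on the main node leaves no working path.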

The log in this issue relates only to SyncCommitteeContribution (not SyncCommitteeMessage). Although it is not the root cause of the missed SyncCommitteeMessages, addressing it could improve the situation: the vc, as a validator, can aggregate SyncCommitteeMessages and publish to the network itself.

I'm not sure how restarting the vc fixed the issue, but the rescue node was switched to subscribe_all_subnets=true around that time too.

@twoeths
Contributor

twoeths commented Feb 21, 2023

"Block not in forkChoice" error should be fixed with #5157
