This repository has been archived by the owner on Jun 11, 2024. It is now read-only.

Failing node synchronization #8432

Closed
Tracked by #7226
has5aan opened this issue May 6, 2023 · 1 comment

has5aan commented May 6, 2023

Expected behavior

Nodes should synchronize.

Actual behavior

Nodes are unable to synchronize. Two causes were observed:

  1. One of the nodes throws:
    [err=Cannot read properties of undefined (reading 'moduleStore')] Failed to generate a block.
    Perhaps moduleStore is not initialized properly within ABIHandler.initStateMachine.
    This is addressed under Node generating blocks while synchronising #8460

  2. One of the nodes throws "New tip of the chain has no preference over the previous tip" before synchronizing. This happens because the synchronizing node does not receive peerInfo from the P2P library. To reproduce, run three nodes that generate blocks but are unable to discover each other, let them generate a few rounds, then relaunch them with fixed peers configured. Relaunching them at nearly the same time increases the chances of reproducing the issue.
    In this scenario a seemingly correct tip received from a peer can be rejected, because PeerInfo.options for the configured fixed peers, as returned by Network.getConnectedPeers, is incorrect. The peer concerned is perhaps not fully initialized at this stage, so a request over the WS connection fails. Delaying the launch of the nodes so that they can initialize all components resolves this.

    Peer tip does not have preference over current tip. This is expected behavior, as nodeInfo is received from the peers and not polled.
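The missing-peerInfo case can be illustrated with a small TypeScript sketch. It assumes that a peer whose node info has not arrived yet is listed with height and maxHeightPrevoted both at 0. The PeerSnapshot type, the hasReportedNodeInfo and syncCandidates helpers, and the peer addresses are hypothetical names chosen for illustration; they are not part of the Lisk SDK.

```typescript
// Hypothetical, simplified view of a connected peer (not the SDK's PeerInfo type).
interface PeerSnapshot {
  peerId: string;
  options: { height: number; maxHeightPrevoted: number };
}

// Assumption: a peer whose node info has not arrived yet still carries
// the default options, i.e. both values are 0.
function hasReportedNodeInfo(peer: PeerSnapshot): boolean {
  return peer.options.height > 0 || peer.options.maxHeightPrevoted > 0;
}

// Keep only peers whose reported tip can meaningfully be compared.
function syncCandidates(peers: PeerSnapshot[]): PeerSnapshot[] {
  return peers.filter(hasReportedNodeInfo);
}

// Illustrative values only: one peer without node info, one with it.
const connected: PeerSnapshot[] = [
  { peerId: '127.0.0.1:7668', options: { height: 0, maxHeightPrevoted: 0 } },
  { peerId: '127.0.0.1:7669', options: { height: 4120, maxHeightPrevoted: 4050 } },
];

console.log(syncCandidates(connected).map(peer => peer.peerId));
```

A check like this cannot tell a peer that has not reported yet from one that is genuinely at height 0, so it only illustrates the symptom and is not a proposed fix.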

Steps to reproduce

Run three nodes with 34, 34 and 33 validators split from dev-validators.json; let's call them pos-mainchain, sync and sync2. Configure them so that they are unable to discover each other as peers. Let the three nodes run independently for a few rounds (the issue was reproduced twice, with each node having generated 40 or 70 rounds). Then relaunch the three nodes with seed peers configured, enabling them to sync and land on a single chain.
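The 34/34/33 split could be scripted along these lines. This is only a sketch: splitIntoGroups is a hypothetical helper, and the placeholder strings stand in for the validator entries, assuming the list holds 101 of them (34 + 34 + 33). The actual structure of dev-validators.json is not reproduced here.

```typescript
// Split a list into consecutive groups of the given sizes.
function splitIntoGroups<T>(items: T[], sizes: number[]): T[][] {
  const total = sizes.reduce((sum, n) => sum + n, 0);
  if (total !== items.length) {
    throw new Error(`sizes add up to ${total}, expected ${items.length}`);
  }
  const groups: T[][] = [];
  let start = 0;
  for (const size of sizes) {
    groups.push(items.slice(start, start + size));
    start += size;
  }
  return groups;
}

// Placeholder entries standing in for the validators in dev-validators.json.
const validators = Array.from({ length: 101 }, (_, i) => `validator-${i}`);

// One group per node: pos-mainchain, sync and sync2.
const [posMainchain, sync, sync2] = splitIntoGroups(validators, [34, 34, 33]);
console.log(posMainchain.length, sync.length, sync2.length);
```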

Which version(s) does this affect? (Environment, OS, etc...)

Lisk SDK development branch.


has5aan commented May 22, 2023

Regarding the "Peer tip does not have preference over current tip." issue:

This is expected, as NodeInfo is received from the connecting node and not polled; if it is not received, the peer's options field is set to its default value. This results in the "Peer tip does not have preference over current tip." error: the selected peer's height and maxHeightPrevoted properties are set to 0, so the call to BlockSynchronizationMechanism.isDifferentChain causes the error.

This is also possible when NodeInfo has already been received for every connected peer: because the peer is selected at random, the selected peer may have a height and maxHeightPrevoted lower than those of the node syncing blocks, resulting in the same error.
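Both cases can be modelled with a simplified comparison. The sketch below is not the SDK's implementation of BlockSynchronizationMechanism.isDifferentChain; it assumes a rule in which a peer's tip is preferred when its maxHeightPrevoted is higher, or when maxHeightPrevoted is equal and its height is higher. The ChainTip type, the peerTipHasPreference function and all numeric values are hypothetical.

```typescript
// Hypothetical, simplified chain tip (not the SDK's block header type).
interface ChainTip {
  height: number;
  maxHeightPrevoted: number;
}

// Assumed rule: the peer's tip is preferred when its maxHeightPrevoted is
// higher, or when maxHeightPrevoted is equal and its height is higher.
function peerTipHasPreference(own: ChainTip, peer: ChainTip): boolean {
  if (peer.maxHeightPrevoted !== own.maxHeightPrevoted) {
    return peer.maxHeightPrevoted > own.maxHeightPrevoted;
  }
  return peer.height > own.height;
}

// Illustrative values only.
const ownTip: ChainTip = { height: 4120, maxHeightPrevoted: 4050 };
const noNodeInfoYet: ChainTip = { height: 0, maxHeightPrevoted: 0 }; // default options
const behindPeer: ChainTip = { height: 3000, maxHeightPrevoted: 2950 }; // random pick, lower tip
const aheadPeer: ChainTip = { height: 5000, maxHeightPrevoted: 4950 };

for (const [label, peer] of Object.entries({ noNodeInfoYet, behindPeer, aheadPeer })) {
  const verdict = peerTipHasPreference(ownTip, peer)
    ? 'peer tip has preference'
    : 'Peer tip does not have preference over current tip.';
  console.log(`${label}: ${verdict}`);
}
```

Under this assumed rule, a peer with default options and a randomly selected peer that is behind both fail the comparison, while a peer that is ahead passes it.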

@Madhulearn Madhulearn modified the milestones: Sprint 95, Sprint 96 May 22, 2023