Fix CPU usage in small devnets #14446
Conversation
beacon-chain/p2p/discovery.go
Outdated
continue
}

if searchInProgress {
Why the `searchInProgress` boolean? It seems like it's only used for logging. Do we really need to differentiate it?
`searchInProgress` is only used for logging, yes.

> Do we really need to differentiate it?

Differentiate what and what? `start` and `continue`? If yes, we do not strictly need to differentiate `start` and `continue`. However, we need to differentiate `start` and `stop`. This kind of boolean is needed for that. (Maybe you have a solution without?)
My specifications are:
- If we are starting / continuing / finishing a search for active peers, I want to know it in debug.
- If we are starting / continuing / finishing a search for peers for a given subnet, I want to know it in debug.
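For illustration, here is a minimal, hypothetical sketch of the logging pattern described in these specs: a single boolean separates the "starting" log from the "continuing" ones, and the loop exit marks "finished". The function and variable names are made up for the example and are not the actual Prysm code.

```go
package main

import (
	"log"
	"time"
)

// searchForActivePeers illustrates the start / continue / finish debug logs.
// searchInProgress only exists to tell the first iteration (start) apart
// from the following ones (continue); leaving the loop means finished.
func searchForActivePeers(peerCount func() int, threshold int) {
	searchInProgress := false

	for peerCount() < threshold {
		if searchInProgress {
			log.Println("DEBUG: Continuing to search for new active peers")
		} else {
			log.Println("DEBUG: Starting to search for new active peers")
			searchInProgress = true
		}

		// One discovery round would run here; we just simulate some work.
		time.Sleep(100 * time.Millisecond)
	}

	if searchInProgress {
		log.Println("DEBUG: Finished searching for new active peers")
	}
}

func main() {
	// Simulate a peer count that grows by one on every check until it reaches 3.
	count := 0
	searchForActivePeers(func() int { count++; return count }, 3)
}
```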
Just to clarify, `listenForNewNodes` isn't used for searching peers on our subnets. It is a background discovery routine that we use for general peer discovery. If you want to log for a subnet search, it needs to be in `FindPeersInSubnet`.

> If we are starting / continuing / finishing a search for active peers, I want to know it in debug.

Is it sufficient to just know when you are still 'searching' for new peers? You don't need the boolean for that. I don't see the value in tracking `starting` because it only happens once.
> Just to clarify, `listenForNewNodes` isn't used for searching peers on our subnets. It is a background discovery routine that we use for general peer discovery. If you want to log for a subnet search, it needs to be in `FindPeersInSubnet`.

Yes, there is different logging for that. For a dedicated subnet we have:

DEBUG p2p: Searching for new peers for a subnet - success currentPeerCount=1 targetPeerCount=1 topic=/eth2/edf8b306/data_column_sidecar_101/ssz_snappy

> Is it sufficient to just know when you are still 'searching' for new peers? You don't need the boolean for that. I don't see the value in tracking starting because it only happens once.

It only happens once (in the BN lifecycle) in the best case. If, during the life of the BN, the peer count goes below the threshold, then `start` happens again.
==> Fixed in fa4f693
beacon-chain/p2p/service.go
Outdated
@@ -43,6 +43,10 @@ var _ runtime.Service = (*Service)(nil)
// defined below.
var pollingPeriod = 6 * time.Second

// When looking for new nodes, if not enough nodes are found,
// we stop after this amount of iterations.
var batchSize = 40_000
40,000 seems like a pretty big amount; was 2,000 too small previously?
Actually, I moved from 2,000 to 40,000 to avoid too frequent logging.
Fixed in 7f70bec.
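For context, a hedged sketch of what an iteration cap like `batchSize` buys: one discovery pass consumes at most `batchSize` candidates and then returns, so the per-pass log line fires at most once per pass instead of the loop spinning forever on a tiny node set. The iterator interface and function names below are illustrative, not Prysm's actual discv5 wiring.

```go
package main

import "fmt"

// nodeIterator is a stand-in for a discv5-style iterator that, on a small
// devnet, may keep yielding the same few nodes over and over.
type nodeIterator interface {
	Next() bool
	Node() string
}

// When looking for new nodes, if not enough nodes are found,
// we stop after this amount of iterations.
var batchSize = 40_000

// findPeersOnce runs a single bounded discovery pass: it stops either when
// enough unique candidates are found or after batchSize iterations.
func findPeersOnce(it nodeIterator, want int) map[string]bool {
	seen := make(map[string]bool)
	for i := 0; i < batchSize && len(seen) < want && it.Next(); i++ {
		seen[it.Node()] = true
	}
	return seen
}

// fakeIterator mimics a small devnet: only 3 distinct nodes ever show up.
type fakeIterator struct{ i int }

func (f *fakeIterator) Next() bool   { f.i++; return true }
func (f *fakeIterator) Node() string { return fmt.Sprintf("node-%d", f.i%3) }

func main() {
	// Without the batchSize cap, this search for 10 unique peers would never end.
	peers := findPeersOnce(&fakeIterator{}, 10)
	fmt.Printf("one bounded pass found %d unique candidates\n", len(peers))
}
```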
* `CustodyCountFromRemotePeer`: Set happy path in the outer scope.
* `FindPeersWithSubnet`: Improve logging.
* `listenForNewNodes`: Avoid infinite loop in a small subnet.
* Address Nishant's comment.
* Fix Nishant's comment.
Please read commit by commit.
Avoid an infinite loop (pegging a whole CPU) when looking for peers in small subnets.
Reminder: This former PR stopped using mainnet bootnodes in devnets. When doing this, an issue appeared on small devnets: we entered an infinite loop when looking for peers in a given subnet. This former PR fixed that issue.
However, even with this former PR, we can still enter an infinite loop when looking for active peers (in general, not in a given subnet).
The current PR fixes this last issue.
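A hedged sketch of the overall shape this implies (illustrative names only, not the actual `listenForNewNodes` implementation): a background loop wakes on a polling ticker and, only while the peer count is below the target, runs one bounded search pass. Because each pass returns instead of looping until enough peers appear, a small devnet that can never reach the target no longer pins a CPU core.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Polling period between discovery passes (shortened here so the demo exits
// quickly; the value discussed in the diff above is 6 seconds).
var pollingPeriod = 100 * time.Millisecond

// backgroundDiscovery wakes on every tick and runs a bounded search pass
// only when more peers are needed. The CPU is idle between ticks.
func backgroundDiscovery(ctx context.Context, peerCount func() int, target int, onePass func()) {
	ticker := time.NewTicker(pollingPeriod)
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if peerCount() >= target {
				continue // enough peers; nothing to do this tick
			}
			onePass() // bounded pass, e.g. capped by batchSize as above
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 450*time.Millisecond)
	defer cancel()

	passes := 0
	backgroundDiscovery(ctx,
		func() int { return 1 }, // stuck below target, as on a tiny devnet
		8,
		func() { passes++ },
	)
	fmt.Printf("ran %d bounded passes instead of one infinite busy loop\n", passes)
}
```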
In terms of CPU usage, we switch from the first profile to the second:
[CPU usage graphs with associated CPU profiling, before and after the change - Prysm super node: red, Prysm full node: blue]
E2E local test: OK