Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

approval-distribution: batch assignment and approval vote send #6400

Closed
sandreim opened this issue Dec 6, 2022 · 6 comments · Fixed by #6401
Closed

approval-distribution: batch assignment and approval vote send #6400

sandreim opened this issue Dec 6, 2022 · 6 comments · Fixed by #6401
Labels
I3-bug Fails to follow expected behavior. U1-asap No need to stop dead in your tracks, however issue should be addressed as soon as possible.

Comments

@sandreim
Copy link
Contributor

sandreim commented Dec 6, 2022

Metrics revealed that it is possible to send messages larger than max_notification_size of the Validation protocol.

Screenshot 2022-11-30 at 14 38 11

In unify_with_peer, we send all the assignments we've seen for all candidates at a given block which just came into the view of a peer, and with aggresion mode active this is even a bigger problem.

A wire message looks like this:

pub enum ApprovalDistributionMessage {
		/// Assignments for candidates in recent, unfinalized blocks.
		///
		/// Actually checking the assignment may yield a different result.
		#[codec(index = 0)]
		Assignments(Vec<(IndirectAssignmentCert, CandidateIndex)>),
		/// Approvals for candidates in some recent, unfinalized block.
		#[codec(index = 1)]
		Approvals(Vec<IndirectSignedApprovalVote>),
	}

If a peer is behind with 1 block, we're gonna send all received assignments and approvals for all candidates up to 12th tranche.

In the case of assignments, which are larger, we can observe a wire size of 132 bytes plus encoding overhead:
32 bytes VRF output, 64 bytes VRF proof, 32 bytes block hash, 4 bytes validator index, 4 byte sample or core index
If we assume 50 paras and first tranches covering needed_approvals (30) , that is 198.000 bytes > 100KB max notification size
Adding in more validators given tranche distribution staying the same, we would get even larger messages.

We need to break these huge messages down in batches to something lower that what the protocol allows.

@sandreim sandreim added I3-bug Fails to follow expected behavior. F5-task U1-asap No need to stop dead in your tracks, however issue should be addressed as soon as possible. labels Dec 6, 2022
@ordian
Copy link
Member

ordian commented Dec 6, 2022

needed approvals tranches (30)

needed_approvals is measured in approvals, not tranches. Typically, we'd need the first couple of tranches to cover them (most in 0th tranche).

@sandreim
Copy link
Contributor Author

sandreim commented Dec 6, 2022

needed approvals tranches (30)

needed_approvals is measured in approvals, not tranches. Typically, we'd need the first couple of tranches to cover them (most in 0th tranche).

corrected, that was what I was aiming at, poor choice of words

@burdges
Copy link
Contributor

burdges commented Dec 22, 2022

This is forwarding batches from multiple authorities between nodes, right?

@ordian
Copy link
Member

ordian commented Dec 22, 2022

The title is slightly misleading. What is does is opposite of batching, namely, it splits the assignment packets into chunks so they always fit into substrate/libp2p max_notification_size limit.

@burdges
Copy link
Contributor

burdges commented Dec 22, 2022

Ahh okay, but an assignment packet should be like 200 bytes, no? public key (32) + VRF signature (96) + candidate (32) + a few bytes describing the sampling. Or you need to attach the candidate receipt to all of them?

@ordian
Copy link
Member

ordian commented Dec 22, 2022

No, we do no attach the receipt, but

In unify_with_peer, we send all the assignments we've seen for all candidates at a given block which just came into the view of a peer, and with aggresion mode active this is even a bigger problem.

So a single message can contain multiple assignments.

Actually, for peer view changes, maybe we shouldn't propagate all assignments if this is our grid neighbor, it seems quite wasteful. But this is getting off-topic.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
I3-bug Fails to follow expected behavior. U1-asap No need to stop dead in your tracks, however issue should be addressed as soon as possible.
Projects
No open projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants