-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Conversation
roadmap/implementers-guide/src/node/approval/approval-distribution.md
Outdated
Show resolved
Hide resolved
roadmap/implementers-guide/src/node/approval/approval-distribution.md
Outdated
Show resolved
Hide resolved
…tion.md Co-authored-by: Peter Goodspeed-Niklaus <[email protected]>
…tion.md Co-authored-by: Peter Goodspeed-Niklaus <[email protected]>
…h/polkadot into rh-approval-networking
roadmap/implementers-guide/src/node/approval/approval-distribution.md
Outdated
Show resolved
Hide resolved
roadmap/implementers-guide/src/node/approval/approval-distribution.md
Outdated
Show resolved
Hide resolved
roadmap/implementers-guide/src/node/approval/approval-distribution.md
Outdated
Show resolved
Hide resolved
1. Is a particular assignment relevant under a given `View`? | ||
2. Is a particular approval relevant to any assignment in a set? | ||
|
||
These two queries need not be perfect, but they must never yield false positives. For our own local view, they must not yield false negatives. When applied to our peers' views, it is acceptable for them to yield false negatives. The reason for that is that our peers' views may be beyond ours, and we are not capable of fully evaluating them. Once we have caught up, we can check again for false negatives to continue distributing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"it is acceptable for our conclusions to yield false negatives"
* If the message fingerprint appears under the `BlockEntry`'s `Knowledge`, give the peer a small positive reputation boost and return. Note that we must do this after checking for out-of-view to avoid being spammed. If we did this check earlier, a peer could provide data out-of-view repeatedly and be rewarded for it. | ||
* Dispatch `ApprovalVotingMessage::CheckAndImportAssignment(assignment)` and wait for the response. | ||
* If the result is `AssignmentCheckResult::Accepted(candidate_index)` or `AssignmentCheckResult::AcceptedDuplicate(candidate_index)` | ||
* If `candidate_index` does not match `claimed_candidate_index`, punish the peer's reputation, recompute the fingerprint, and re-do our knowledge checks. The goal here is to accept messages which peers send us that are labeled wrongly, but punish them for it as they've made us do extra work. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"labeled wrongly"? Are these indexes the vacated cores number? If so, we could always include the index inside the assignment, even when doing a modulo VRF criteria for tranche zero, so it could be read before proceeding, which makes the assignment outright invalid if the index is wrong. Is that helpful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I'm unsure where claimed_candidate_index
originates. I suppose the peer sends us data it believes important, but why does it need to tell us why the data matters? We either agree or not, yes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think including that data in the VRF would make this point redundant. At the moment, the vacated core index isn't part of the VRF, and we have peers send us the vacated core index alongside the assignment which makes it easier for us to do some politeness checks.
So the conundrum this is solving is what we should do if a peer sends us something labeled with the wrong core index but that is still a valid assignment for some other core. And my answer was to re-label it, punish the peer, and distribute.
Your general advice so far has been to keep as much data as possible out of the VRF, so it didn't cross my mind that was an option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our VRF signatures have an input that corresponds to the output, which yes we minimize, but the same VRF also signs an extra auxiliary message, that influences only it proof, not its output.
We can pack anything we like into this extra auxiliary message, like whatever claims simplify the code, or even the entire gossip message containing the VRF (IndirectAssignmentCert
?). In fact, including the whole gossip message saves a signature that costs at least 45ms and 64 bytes per assignment message, so I slanted the draft code I wrote in this direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. So it is possible to include the core index under the proof, even if that core index was unknown prior to the output being generated? That does indeed save us a ton of trouble. What's the schnorrkel
API for that?
* Dispatch a `ApprovalDistributionV1Message::Approval(approval)` to all peers in the `BlockEntry`'s `known_by` set, excluding the peer in the `source`, if `source` has kind `MessageSource::Peer`. Add the fingerprint of the assignment to the knowledge of each peer. Note that this obeys the politeness conditions: | ||
* We guarantee elsewhere that all peers within `known_by` are aware of all assignments relative to the block. | ||
* We've checked that this specific approval has a corresponding assignment within the `BlockEntry`. | ||
* Thus, all peers are aware of the assignment or have a message to them in-flight which will make them so. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All this sounds fine, but it's quadratic so another reason candidates should be rejected long before "escalation by DoS".
It breaks gossip assumptions if one compares fingerprints between peers using publicly keyed bloom filters, but if each pair of peers computes their own shared secret bloom filter key, then they could compare using bloom filters. Yet another optimization I guess.
* Broadcast on network with an `ApprovalNetworkingMessage::DistributeAssignment`. | ||
* Kick off approval work with `launch_approval` | ||
* Broadcast on network with an `ApprovalDistributionMessage::DistributeAssignment`. | ||
* Kick off approval work with `launch_approval`. Note that if the candidate appears in multiple current blocks, we will launch approval for each block it appears in. It may make sense to shortcut around this with caching either at this level or on the level of the other subsystems invoked by that function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, these approval checkers being different is an unfortunate consequence of using the relay VRF modulo criteria for tranche zero.
If the relay chain largely avoids forks then the relay VRF modulo criteria ensures all validators get the same number of tranche zero assignments, which improves tranche zero assignments in several respects:
- We avoid a choice between either overworking nodes so much or else giving them more arbitrary choices about what they check.
- We estimate the number of checkers with a binimial distribution, which gives a variance n p q which looks less than the variance obtained from the more complex distributions resulting from other strategies.
- We could support validators of different computing capacities if we force validators to take assignments with sample numbers 1,..,k where k cannot exceed by too much their previous average number of samples completed on the same inclusion batch. In this way, we pay more for validators with more CPU but permit others to participate. This might not be a good thing, but nice option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All that sounds fine, but the point here is actually about the fact that we literally re-execute the candidate even if we've just validated it and we should have some LRU cache somewhere so that being "re-assigned" to the same candidate across multiple forks is ultra cheap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed interpreted and read this in terms of across validators, thanks!
LGTM :) |
@@ -153,7 +153,7 @@ On receiving an `OverseerSignal::ActiveLeavesUpdate(update)`: | |||
* Ensure that the `CandidateEntry` contains a `block_assignments` entry for the block, with the correct backing group set. | |||
* If a validator in this session, compute and assign `our_assignment` for the `block_assignments` | |||
* Only if not a member of the backing group. | |||
* Run `RelayVRFModulo` and `RelayVRFDelay` according to the [the approvals protocol section](../../protocol-approval.md#assignment-criteria) | |||
* Run `RelayVRFModulo` and `RelayVRFDelay` according to the [the approvals protocol section](../../protocol-approval.md#assignment-criteria). Ensure that the assigned core derived from the output is covered by the auxiliary signature aggregated in the `VRFPRoof`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems fine. The core is always either and input or an output already. If you want it to always be an input then we can just include it in the message and check it from the output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we feasibly include the core as an input to the VRF, given that it's derived from the output? That seems cyclical, but maybe through the power of cryptography...
@@ -166,6 +166,7 @@ On receiving a `ApprovalVotingMessage::CheckAndImportAssignment` message, we che | |||
* Check the assignment cert | |||
* If the cert kind is `RelayVRFModulo`, then the certificate is valid as long as `sample < session_info.relay_vrf_samples` and the VRF is valid for the validator's key with the input `block_entry.relay_vrf_story ++ sample.encode()` as described with [the approvals protocol section](../../protocol-approval.md#assignment-criteria). We set `core_index = vrf.make_bytes().to_u32() % session_info.n_cores`. If the `BlockEntry` causes inclusion of a candidate at `core_index`, then this is a valid assignment for the candidate at `core_index` and has delay tranche 0. Otherwise, it can be ignored. | |||
* If the cert kind is `RelayVRFDelay`, then we check if the VRF is valid for the validator's key with the input `block_entry.relay_vrf_story ++ cert.core_index.encode()` as described in [the approvals protocol section](../../protocol-approval.md#assignment-criteria). The cert can be ignored if the block did not cause inclusion of a candidate on that core index. Otherwise, this is a valid assignment for the included candidate. The delay tranche for the assignment is determined by reducing `(vrf.make_bytes().to_u64() % (session_info.n_delay_tranches + session_info.zeroth_delay_tranche_width)).saturating_sub(session_info.zeroth_delay_tranche_width)`. | |||
* We also check that the core index derived by the output is covered by the `VRFProof` by means of an auxiliary signature. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The auxiliary message checked by the proof is just like a signature, so it gives us the ability to include stuff that is not in the VRF input. The core is either an input or output of the given criteria already, but we can include it in the auxiliary message and check that it is correct if that simplifies some code somewhere.
Closes #1603