This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

pvf-precheck: update implementers' guide (#4612)
This commit incorporates the changes made to the runtime in the
following PRs:

- #4408
- #4457
- #4540
- #4542
- #4581

Note that this PR does not include the description of the PVF
pre-checker subsystem. This should be addressed within
#4611

Co-authored-by: sandreim <[email protected]>
pepyakin and sandreim authored Dec 29, 2021
1 parent b25f8c5 commit 0f1a671
Showing 10 changed files with 257 additions and 29 deletions.
4 changes: 4 additions & 0 deletions roadmap/implementers-guide/src/SUMMARY.md
@@ -10,6 +10,7 @@
- [Chain Selection and Finalization](protocol-chain-selection.md)
- [Architecture Overview](architecture.md)
- [Messaging Overview](messaging.md)
- [PVF Pre-checking](pvf-prechecking.md)
- [Runtime Architecture](runtime/README.md)
- [`Initializer` Module](runtime/initializer.md)
- [`Configuration` Module](runtime/configuration.md)
@@ -34,6 +35,7 @@
- [Candidate Events](runtime-api/candidate-events.md)
- [Disputes Info](runtime-api/disputes-info.md)
- [Candidates Included](runtime-api/candidates-included.md)
- [PVF Pre-checking](runtime-api/pvf-prechecking.md)
- [Node Architecture](node/README.md)
- [Subsystems and Jobs](node/subsystems-and-jobs.md)
- [Overseer](node/overseer.md)
@@ -66,6 +68,7 @@
- [Runtime API Requests](node/utility/runtime-api.md)
- [Chain API Requests](node/utility/chain-api.md)
- [Chain Selection Request](node/utility/chain-selection.md)
- [PVF Pre-Checking](node/utility/pvf-prechecker.md)
- [Data Structures and Types](types/README.md)
- [Candidate](types/candidate.md)
- [Backing](types/backing.md)
@@ -77,6 +80,7 @@
- [Network](types/network.md)
- [Approvals](types/approval.md)
- [Disputes](types/disputes.md)
- [PVF Pre-checking](types/pvf-prechecking.md)

[Glossary](glossary.md)
[Further Reading](further-reading.md)
Original file line number Diff line number Diff line change
@@ -12,15 +12,19 @@ Output: Validation result via the provided response side-channel.

## Functionality

This subsystem answers two types of requests: one which draws out validation data from the state, and another which accepts all validation data exhaustively. The goal of both request types is to validate a candidate. There are three possible outputs of validation: either the candidate is valid, the candidate is invalid, or an internal error occurred. Whatever the end result is, it will be returned on the response channel to the requestor.
This subsystem groups the requests it handles into two categories: *candidate validation* and *PVF pre-checking*.

Parachain candidates are validated against their validation function: A piece of Wasm code that is describes the state-transition of the parachain. Validation function execution is not metered. This means that an execution which is an infinite loop or simply takes too long must be forcibly exited by some other means. For this reason, we recommend dispatching candidate validation to be done on subprocesses which can be killed if they time-out.
The first category can be further subdivided into two request types: one which draws out validation data from the state, and another which accepts all validation data exhaustively. Validation returns one of three possible outcomes on the response channel: the candidate is valid, the candidate is invalid, or an internal error occurred.

Parachain candidates are validated against their validation function: a piece of Wasm code that describes the state transition of the parachain. Validation function execution is not metered, so an execution that loops forever or simply takes too long must be forcibly terminated by some other means. For this reason, we recommend dispatching candidate validation to subprocesses that can be killed if they time out.

Upon receiving a validation request, the first thing the candidate validation subsystem should do is make sure it has all the necessary parameters to the validation function. These are:
* The Validation Function itself.
* The [`CandidateDescriptor`](../../types/candidate.md#candidatedescriptor).
* The [`ValidationData`](../../types/candidate.md#validationdata).
* The [`PoV`](../../types/availability.md#proofofvalidity).

The second category is for PVF pre-checking. This is primarily used by the [PVF pre-checker](pvf-prechecker.md) subsystem.

### Determining Parameters

17 changes: 17 additions & 0 deletions roadmap/implementers-guide/src/node/utility/pvf-prechecker.md
@@ -0,0 +1,17 @@
# PVF Pre-checker

The PVF pre-checker is a subsystem that is responsible for watching the relay chain for new PVFs that require pre-checking. See the [overview] for a description of the PVF pre-checking process.

## Protocol

There is no dedicated input mechanism for the PVF pre-checker. Instead, the PVF pre-checker watches the `ActiveLeavesUpdate` event stream for work.

This subsystem does not produce any output messages either. The subsystem will, however, send messages to the [Runtime API] subsystem to query for the pending PVFs and to submit votes. In addition, it will communicate with the [Candidate Validation] subsystem to request a PVF pre-check.

## Functionality

TODO: Write up the description of the functionality of the PVF pre-checker. https://github.com/paritytech/polkadot/issues/4611

[overview]: ../../pvf-prechecking.md
[Runtime API]: runtime-api.md
[Candidate Validation]: candidate-validation.md
50 changes: 50 additions & 0 deletions roadmap/implementers-guide/src/pvf-prechecking.md
@@ -0,0 +1,50 @@
# PVF Pre-checking Overview

> ⚠️ This discusses a mechanism that is currently under development. Follow the progress under [#3211].

## Motivation

The validation function of a parachain or parathread is described by a wasm module that we refer to as a PVF. Since it is a wasm module, the typical way of executing it is to compile it to machine code. An optimizing compiler typically consists of algorithms that can heavily optimize the resulting machine code. However, while those algorithms perform quite well on typical wasm code produced by standard toolchains (e.g. rustc/LLVM), they can be abused to consume a lot of resources. Moreover, since those algorithms are rather complex, there is a lot of room for bugs that can crash the compiler.

If compilation of a Parachain Validation Function (PVF) takes too long or uses too much memory, this can leave a node in limbo as to whether a candidate of that parachain is valid or not.

The amount of time that a PVF takes to compile is a subjective resource limit and as such PVFs may be maliciously crafted so that there is e.g. a 50/50 split of validators which can and cannot compile and execute the PVF.

This has the following implications:
- In backing, inclusion may be slow due to backing groups being unable to execute the block
- In approval checking, there may be many no-shows, leading to slow finality
- In disputes, neither side may reach supermajority. Nobody will get slashed and the chain will not be reverted or finalized.

As a result of this issue we need a fairly hard guarantee that the PVFs of registered parachains/threads can be compiled within a reasonable amount of time.

## Solution

The problem is solved by having a pre-checking process which is run when a new validation code is included in the chain. A new PVF can be added in two cases:

- A new parachain or parathread is registered.
- An existing parachain or parathread signalled an upgrade of its validation code.

Before any of those operations finishes, the PVF pre-checking vote is initiated. The PVF pre-checking vote is identified by the PVF code hash that is being voted on. If there is already a PVF pre-checking process running for that code hash, no new vote is started. Instead, the operation just subscribes to the existing vote.
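
The subscription behaviour described above can be sketched as follows. This is a minimal illustration and not the runtime's actual storage layout: `ActiveVotes`, `Cause`, and `kick_off_or_subscribe` are hypothetical names, and the code hash is modelled as a plain byte array.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a PVF code hash.
type CodeHash = [u8; 32];

/// Hypothetical: an operation waiting for the outcome of a pre-checking vote.
#[derive(Debug, Clone, PartialEq)]
enum Cause {
    Onboarding { para_id: u32 },
    Upgrade { para_id: u32 },
}

/// All active votes, keyed by the code hash being voted on.
#[derive(Default)]
struct ActiveVotes {
    votes: HashMap<CodeHash, Vec<Cause>>,
}

impl ActiveVotes {
    /// Registers `cause` for `code_hash`.
    ///
    /// Returns `true` if this started a new vote and `false` if the cause
    /// merely subscribed to a vote that was already running.
    fn kick_off_or_subscribe(&mut self, code_hash: CodeHash, cause: Cause) -> bool {
        let causes = self.votes.entry(code_hash).or_default();
        // A freshly created entry has no subscribers yet.
        let is_new = causes.is_empty();
        causes.push(cause);
        is_new
    }
}
```

Keying the map by code hash means that two parachains upgrading to the same code share a single vote, and both are notified when it concludes.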

The pre-checking vote can be concluded either by obtaining a supermajority or if it expires.

Each validator checks the list of PVFs available for voting. The vote is binary, i.e. accept or reject a given PVF. As soon as the supermajority of votes are collected for one of the sides of the vote, the voting is concluded in that direction and the effects of the voting are enacted.

Only validators from the active set can participate in the vote. The set of active validators can change each session, which is why the votes are reset each session. A vote that has lasted for a certain number of sessions without concluding will be rejected.
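
The voting rules above can be sketched as a small state machine. This is an illustration under stated assumptions, not the `paras` module's code: `VoteState`, `Outcome`, and the `ttl` parameter are hypothetical names, and the supermajority threshold is assumed to be the usual "strictly more than two thirds" rule.

```rust
/// Result of tallying a vote.
#[derive(Debug, PartialEq)]
enum Outcome {
    Accepted,
    Rejected,
    Pending,
}

/// Assumption: a supermajority is strictly more than 2/3 of `n` validators.
fn supermajority_threshold(n: usize) -> usize {
    n - n.saturating_sub(1) / 3
}

/// Hypothetical per-PVF vote state with one slot per active validator.
struct VoteState {
    accept: Vec<bool>,
    reject: Vec<bool>,
    /// Number of session changes this vote has observed.
    age: u32,
}

impl VoteState {
    fn new(n_validators: usize) -> Self {
        VoteState {
            accept: vec![false; n_validators],
            reject: vec![false; n_validators],
            age: 0,
        }
    }

    /// Records a vote; unknown validators and double votes are ignored.
    fn vote(&mut self, validator: usize, accept: bool) -> Outcome {
        let known = validator < self.accept.len();
        if known && !self.accept[validator] && !self.reject[validator] {
            if accept {
                self.accept[validator] = true;
            } else {
                self.reject[validator] = true;
            }
        }
        self.tally()
    }

    fn tally(&self) -> Outcome {
        let n = self.accept.len();
        if n == 0 {
            return Outcome::Pending;
        }
        let threshold = supermajority_threshold(n);
        let accepts = self.accept.iter().filter(|v| **v).count();
        let rejects = self.reject.iter().filter(|v| **v).count();
        if accepts >= threshold {
            Outcome::Accepted
        } else if rejects >= threshold {
            Outcome::Rejected
        } else {
            Outcome::Pending
        }
    }

    /// Resets all votes for the new validator set and rejects the PVF
    /// once the vote has observed `ttl` session changes.
    fn on_new_session(&mut self, n_validators: usize, ttl: u32) -> Outcome {
        self.accept = vec![false; n_validators];
        self.reject = vec![false; n_validators];
        self.age += 1;
        if self.age >= ttl {
            Outcome::Rejected
        } else {
            Outcome::Pending
        }
    }
}
```

With four validators the threshold is three, so a vote concludes as soon as three validators agree on one side.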

The effects of accepting a PVF depend on the operations that requested it:

1. All onboardings subscribed to the approved PVF pre-checking process will get scheduled and after passing 2 session boundaries they will be onboarded.
1. All upgrades subscribed to the approved PVF pre-checking process will get scheduled very similarly to the existing process. Upgrades with pre-checking are really the same process that is just delayed by the time required for pre-checking voting. In case of instant approval the mechanism is exactly the same.

If the PVF pre-checking process concludes with rejection, all operations subscribed to the rejected process are cancelled. That is, the onboarding or upgrade does not take place.

The logic described above is implemented by the [paras] module.

On the node side, there is a PVF pre-checking [subsystem][pvf-prechecker-subsystem] that scans the chain for new PVFs using [runtime APIs][pvf-runtime-api]. Upon finding a new PVF, the subsystem will initiate a PVF pre-checking request and wait for the result. Once the result is obtained, the subsystem will use the [runtime API][pvf-runtime-api] to submit a vote for the PVF. The vote is an unsigned transaction. The vote will be distributed via gossip similarly to a normal transaction. Eventually a block producer will include the vote in a block, where it will be handled by the [runtime][paras].
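
The node-side flow can be sketched roughly as below. The subsystem's real design is not described in this document, so everything here is an assumption made for illustration: `process_leaf` is a hypothetical function, the runtime API calls and the pre-check request are modelled as plain closures, and tracking already-voted hashes in a set is one possible way to avoid voting twice.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a PVF code hash.
type CodeHash = [u8; 32];

/// Handles one new leaf.
///
/// - `pending` stands in for the list of PVFs that require pre-checking.
/// - `voted` remembers the hashes this node has already voted on (assumption).
/// - `precheck` stands in for the pre-check request and returns the judgement.
/// - `submit_vote` stands in for submitting the vote through the runtime API.
///
/// Returns the number of votes submitted for this leaf.
fn process_leaf(
    pending: Vec<CodeHash>,
    voted: &mut HashSet<CodeHash>,
    precheck: impl Fn(&CodeHash) -> bool,
    mut submit_vote: impl FnMut(CodeHash, bool),
) -> usize {
    let mut submitted = 0;
    for code_hash in pending {
        // `insert` returns `false` if the hash was already present.
        if !voted.insert(code_hash) {
            continue;
        }
        let accept = precheck(&code_hash);
        submit_vote(code_hash, accept);
        submitted += 1;
    }
    submitted
}
```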

[#3211]: https://github.com/paritytech/polkadot/issues/3211
[paras]: runtime/paras.md
[pvf-runtime-api]: runtime-api/pvf-prechecking.md
[pvf-prechecker-subsystem]: node/utility/pvf-prechecker.md
22 changes: 22 additions & 0 deletions roadmap/implementers-guide/src/runtime-api/pvf-prechecking.md
@@ -0,0 +1,22 @@
# PVF Pre-checking

> ⚠️ This runtime API was added in v2.

There are two main runtime APIs to work with PVF pre-checking.

The first runtime API is designed to fetch all PVFs that require pre-checking voting. The PVFs are
identified by their code hashes. As soon as a PVF gains the required support, the runtime API will
no longer return it.

```rust
fn pvfs_require_precheck() -> Vec<ValidationCodeHash>;
```

The second runtime API is needed to submit the judgement for a PVF, whether it is approved or not.
The voting process uses unsigned transactions. The [`PvfCheckStatement`](../types/pvf-prechecking.md) is circulated through the network via gossip, similarly to a normal transaction. At some point a validator
will include the statement in a block, where it will be processed by the runtime. If that was the
last vote needed to reach the supermajority, this PVF will not be returned by `pvfs_require_precheck` anymore.

```rust
fn submit_pvf_check_statement(stmt: PvfCheckStatement, signature: ValidatorSignature);
```
42 changes: 32 additions & 10 deletions roadmap/implementers-guide/src/runtime/configuration.md
@@ -12,29 +12,51 @@ The configuration module is responsible for two main pieces of storage.
/// The current configuration to be used.
Configuration: HostConfiguration;
/// A pending configuration to be applied on session change.
PendingConfiguration: Option<HostConfiguration>;
PendingConfigs: Vec<(SessionIndex, HostConfiguration)>;
/// A flag that says if the consistency checks should be omitted.
BypassConsistencyCheck: bool;
```

## Session change

The session change routine for the Configuration module is simple. If the `PendingConfiguration` is `Some`, take its value and set `Configuration` to be equal to it. Reset `PendingConfiguration` to `None`.
The session change routine works as follows:

- If there are no pending configurations, then return early.
- Take all pending configurations whose session index is less than or equal to the current session index.
- Get the pending configuration with the highest session index and apply it to the current configuration. Discard the earlier ones if any.
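
The steps above can be sketched as follows. This is a simplified model and not the module's code: `ConfigStorage` is a hypothetical in-memory stand-in for the storage items, and `HostConfiguration` is reduced to a single illustrative field.

```rust
type SessionIndex = u32;

/// Reduced to one field for illustration.
#[derive(Debug, Clone, PartialEq)]
struct HostConfiguration {
    max_code_size: u32,
}

/// Hypothetical in-memory stand-in for the module's storage.
struct ConfigStorage {
    active: HostConfiguration,
    pending: Vec<(SessionIndex, HostConfiguration)>,
}

impl ConfigStorage {
    fn on_new_session(&mut self, session_index: SessionIndex) {
        // Nothing pending: return early.
        if self.pending.is_empty() {
            return;
        }
        // Split off every entry that is due at or before this session.
        let (due, future): (Vec<_>, Vec<_>) = std::mem::take(&mut self.pending)
            .into_iter()
            .partition(|(apply_at, _)| *apply_at <= session_index);
        // Apply the due entry with the highest session index; earlier ones are dropped.
        if let Some((_, config)) = due.into_iter().max_by_key(|(apply_at, _)| *apply_at) {
            self.active = config;
        }
        self.pending = future;
    }
}
```

Entries scheduled for a later session stay in `pending` untouched, so a configuration is never applied early.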

## Routines

```rust
enum InconsistentError {
// ...
}

impl HostConfiguration {
fn check_consistency(&self) -> Result<(), InconsistentError> { /* ... */ }
}

/// Get the host configuration.
pub fn configuration() -> HostConfiguration {
Configuration::get()
}

/// Updating the pending configuration to be applied later.
fn update_configuration(f: impl FnOnce(&mut HostConfiguration)) {
PendingConfiguration::mutate(|pending| {
let mut x = pending.unwrap_or_else(Self::configuration);
f(&mut x);
*pending = Some(x);
})
}
/// Schedules updating the host configuration. The update is given by the `updater` closure. The
/// closure takes the current version of the configuration and returns the new version.
/// Returns an `Err` if the closure returns a broken configuration. However, there are a couple of
/// exceptions:
///
/// - if the configuration that was passed in the closure is already broken, then it will pass the
/// update: you cannot break something that is already broken.
/// - If the `BypassConsistencyCheck` flag is set, then the checks will be skipped.
///
/// The changes made by this function will always be scheduled at session X, where X is the current session index + 2.
/// If there is already a pending update for X, then the closure will receive the already pending configuration for
/// session X.
///
/// If there is already a pending update for the current session index + 1, then it won't be touched. Otherwise,
/// that would violate the promise of this function that changes will be applied on the second session change (cur + 2).
fn schedule_config_update(updater: impl FnOnce(&mut HostConfiguration<T::BlockNumber>)) -> DispatchResult
```
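
The scheduling rules in the doc comment can be sketched as below. This is a simplified model under several assumptions: `ConfigStorage` is a hypothetical in-memory stand-in for storage, the consistency rule is invented purely for illustration, the error type is a plain string, and the closure is assumed to start from the most recent pending configuration when one exists (otherwise from the active one).

```rust
type SessionIndex = u32;

#[derive(Debug, Clone, PartialEq)]
struct HostConfiguration {
    max_code_size: u32,
    max_pov_size: u32,
}

impl HostConfiguration {
    /// Invented rule, for illustration only.
    fn check_consistency(&self) -> Result<(), &'static str> {
        if self.max_code_size == 0 {
            Err("max_code_size must be non-zero")
        } else {
            Ok(())
        }
    }
}

/// Hypothetical in-memory stand-in for the module's storage.
struct ConfigStorage {
    active: HostConfiguration,
    /// Kept sorted by session index.
    pending: Vec<(SessionIndex, HostConfiguration)>,
    bypass_consistency_check: bool,
    current_session: SessionIndex,
}

impl ConfigStorage {
    fn schedule_config_update(
        &mut self,
        updater: impl FnOnce(&mut HostConfiguration),
    ) -> Result<(), &'static str> {
        let scheduled_session = self.current_session + 2;
        // Assumption: start from the latest pending config, else the active one.
        let mut config = self
            .pending
            .last()
            .map(|(_, c)| c.clone())
            .unwrap_or_else(|| self.active.clone());
        // You cannot break something that is already broken.
        let base_was_consistent = config.check_consistency().is_ok();
        updater(&mut config);
        if !self.bypass_consistency_check && base_was_consistent {
            config.check_consistency()?;
        }
        // Overwrite an existing entry for `cur + 2`; never touch `cur + 1`.
        match self.pending.iter_mut().find(|e| e.0 == scheduled_session) {
            Some(entry) => entry.1 = config,
            None => self.pending.push((scheduled_session, config)),
        }
        Ok(())
    }
}
```

Two updates scheduled within the same session accumulate into a single pending entry for the current session index + 2, because the second closure starts from the result of the first.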

## Entry-points