-
Notifications
You must be signed in to change notification settings - Fork 1.6k
pvf-precheck: Integrate PVF pre-checking into paras module #4457
Conversation
Current dependencies on/for this PR:
This comment was auto-generated by Graphite. |
Oh sh~, TIL: "votings" is not a thing. |
c795d61
to
5ec1b45
Compare
a525c5e
to
5c65fa5
Compare
5ec1b45
to
8d5a539
Compare
5c65fa5
to
5d85e13
Compare
8d5a539
to
ddd68c4
Compare
5d85e13
to
a0d07b3
Compare
ddd68c4
to
cf3fb54
Compare
a0d07b3
to
02e1337
Compare
cf3fb54
to
ed68222
Compare
02e1337
to
d691fb6
Compare
d691fb6
to
13ed75f
Compare
@@ -73,6 +73,9 @@ pub struct HostConfiguration<BlockNumber> { | |||
/// This parameter affects the upper bound of size of `CandidateCommitments`. | |||
pub hrmp_max_message_num_per_candidate: u32, | |||
/// The minimum period, in blocks, between which parachains can update their validation code. | |||
/// | |||
/// If PVF pre-checking is enabled this should be greater than the maximum number of blocks | |||
/// PVF pre-checking can take. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean as in a timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant to be vague that the pre-checking should finish first and only then a new upgrade will be allowed. It just so happens that the pre-checking timeout is typically takes the most time. However, it is also possible that the last decisive vote comes in on the last block of the session and thus ends the PVF voting, so it is not always timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exhaustiveness is imho better than trying to avoid it. Helps reviewing the code and on-boarding new people.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good, I have to review a few sections more carefully:
- the "hack"
- ref counting consistency
All my comments are just nits, code looks pretty nice (minus the existence of "hack")
Error::<T>::PvfCheckInvalidSignature, | ||
); | ||
|
||
let mut active_vote = PvfActiveVoteMap::<T>::get(&stmt.subject) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Is there a particular reason we don't try_mutate
this in place? It feels a bit error prone having the vote tracking the vote at the very end in an else
case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah it was also bothering me, but I think in this case cure is the same or worse than the disease.
In the case of achieving the quorum, we should remove the vote from the list and the map. Probably we can switch the entire storage value to be OptionQuery
, but then we will have to deal with this everywhere as well. Then, I am trying to put mutations of the map closer to the list, but in this case the mutation will happen inside of try_mutate
but the list mutation will be left alone.
We could perhaps refactor the code so that we return an enum which then says what to do with the vote: update or remove, but that's additional code only to make sure we do not accidentally miss this update, so it feels to me not worth it
Uh, also: could you include the ascii graph in the comment :) |
ah, right, I can add it in the next PR in the stack: #4459 |
4bc08c4
to
63b82d7
Compare
* pvf-precheck: Integrate PVF pre-checking into paras module Closes #4009 This is the most of the runtime-side change needed for #3211. Here is how it works. The PVF pre-checking can be triggered either by an upgrade or by onboarding (i.e. calling `schedule_para_initialize`). The PVF pre-checking process is identified by the PVF code hash that is being voted on. If there is already PVF pre-checking process running, then no new PVF pre-checking process will be started. Instead, we just subscribe to the existing one. If there is no PVF pre-checking process running but the PVF code hash was already saved in the storage, that necessarily means (I invite the reviewers to double-check this invariant) that the PVF already passed pre-checking. This is equivalent to instant approving of the PVF. The pre-checking process can be concluded either by obtaining a supermajority or if it expires. Each validator checks the list of PVFs available for voting. The vote is binary, i.e. accept or reject a given PVF. As soon as the supermajority of votes are collected for one of the sides of the vote, the voting is concluded in that direction and the effects of the voting are enacted. Only validators from the active set can participate in the vote. The set of active validators can change each session. That's why we reset the votes each session. A voting that observed a certain number of sessions will be rejected. The effects of the PVF accepting depend on the operations requested it: 1. All onboardings subscribed to the approved PVF pre-checking process will get scheduled and after passing 2 session boundaries they will be onboarded. 2. All upgrades subscribed to the approved PVF pre-checking process will get scheduled very similarly to the existing process. Upgrades with pre-checking are really the same process that is just delayed by the time required for pre-checking voting. In case of instant approval the mechanism is exactly the same. This is important from parachains compatibility standpoint since following the delayed upgrade requires the parachain to implement paritytech/cumulus#517. In case, PVF pre-checking process was concluded with rejection, then all the requesting operations get cancelled. For onboarding it means it gets without movement: the lifecycle of such parachain is terminated on the `Onboarding` state and after rejection the lifecycle is none. That in turn means that the caller can attempt registering the parachain once more. For upgrading it means that the upgrade process is aborted: that flashes go-ahead signal with `Abort` flag. Rejection leads to removing the allegedly bad validation code from the chain storage. Among other things, this implies that the operation can be re-requested. That allows for retrying an operation in case there was some bug. At the same time it does not look as a DoS vector due to the caching performed by the nodes. PVF pre-checking can be enabled and disabled. Initially, according to the changes in #4420, this mechanism is disabled. Triggering the PVF pre-checking when it is disabled just means that we insta approve the requesting operation. This should lead to the behavior being unchanged. Follow-ups: - expose runtime APIs * cargo run --quiet --release --features=runtime-benchmarks -- benchmark --chain=polkadot-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/polkadot/src/weights/runtime_parachains_paras.rs * cargo run --quiet --release --features=runtime-benchmarks -- benchmark --chain=westend-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/westend/src/weights/runtime_parachains_paras.rs * cargo run --quiet --release --features=runtime-benchmarks -- benchmark --chain=kusama-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/kusama/src/weights/runtime_parachains_paras.rs * cargo run --quiet --release --features runtime-benchmarks -- benchmark --chain=rococo-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/rococo/src/weights/runtime_parachains_paras.rs * Review fixes Co-authored-by: Parity Bot <[email protected]>
This commit hooks up the API provided by #4457 to the runtime API subsystem. In a following PR this API will be consumed by the PVF pre-checking subsystem. Co-authored-by: Chris Sosnin <[email protected]>
This commit hooks up the API provided by #4457 to the runtime API subsystem. In a following PR this API will be consumed by the PVF pre-checking subsystem. Co-authored-by: Chris Sosnin <[email protected]>
This commit hooks up the API provided by #4457 to the runtime API subsystem. In a following PR this API will be consumed by the PVF pre-checking subsystem. Co-authored-by: Chris Sosnin <[email protected]> Co-authored-by: Chris Sosnin <[email protected]>
Closes #3971 Read the linked issue. Apart from that, this addresses the concern raised in this [comment] by just adding a test. I couldn't find a clean way to reconcile a block number delay with a PVF voting TTL, so I just resorted to rely on the test. Should be fine for now. [comment]: #4457 (comment)
Closes #3971 Read the linked issue. Apart from that, this addresses the concern raised in this [comment] by just adding a test. I couldn't find a clean way to reconcile a block number delay with a PVF voting TTL, so I just resorted to rely on the test. Should be fine for now. [comment]: #4457 (comment)
Closes #3971 Read the linked issue. Apart from that, this addresses the concern raised in this [comment] by just adding a test. I couldn't find a clean way to reconcile a block number delay with a PVF voting TTL, so I just resorted to rely on the test. Should be fine for now. [comment]: #4457 (comment)
Closes #3971 Read the linked issue. Apart from that, this addresses the concern raised in this [comment] by just adding a test. I couldn't find a clean way to reconcile a block number delay with a PVF voting TTL, so I just resorted to rely on the test. Should be fine for now. [comment]: #4457 (comment)
Closes #3971 Read the linked issue. Apart from that, this addresses the concern raised in this [comment] by just adding a test. I couldn't find a clean way to reconcile a block number delay with a PVF voting TTL, so I just resorted to rely on the test. Should be fine for now. [comment]: #4457 (comment)
Closes #3971 Read the linked issue. Apart from that, this addresses the concern raised in this [comment] by just adding a test. I couldn't find a clean way to reconcile a block number delay with a PVF voting TTL, so I just resorted to rely on the test. Should be fine for now. [comment]: #4457 (comment)
Closes #3971 Read the linked issue. Apart from that, this addresses the concern raised in this [comment] by just adding a test. I couldn't find a clean way to reconcile a block number delay with a PVF voting TTL, so I just resorted to rely on the test. Should be fine for now. [comment]: #4457 (comment)
* pvf-precheck: Integrate PVF pre-checking into paras module Closes #4009 This is the most of the runtime-side change needed for #3211. Here is how it works. The PVF pre-checking can be triggered either by an upgrade or by onboarding (i.e. calling `schedule_para_initialize`). The PVF pre-checking process is identified by the PVF code hash that is being voted on. If there is already PVF pre-checking process running, then no new PVF pre-checking process will be started. Instead, we just subscribe to the existing one. If there is no PVF pre-checking process running but the PVF code hash was already saved in the storage, that necessarily means (I invite the reviewers to double-check this invariant) that the PVF already passed pre-checking. This is equivalent to instant approving of the PVF. The pre-checking process can be concluded either by obtaining a supermajority or if it expires. Each validator checks the list of PVFs available for voting. The vote is binary, i.e. accept or reject a given PVF. As soon as the supermajority of votes are collected for one of the sides of the vote, the voting is concluded in that direction and the effects of the voting are enacted. Only validators from the active set can participate in the vote. The set of active validators can change each session. That's why we reset the votes each session. A voting that observed a certain number of sessions will be rejected. The effects of the PVF accepting depend on the operations requested it: 1. All onboardings subscribed to the approved PVF pre-checking process will get scheduled and after passing 2 session boundaries they will be onboarded. 2. All upgrades subscribed to the approved PVF pre-checking process will get scheduled very similarly to the existing process. Upgrades with pre-checking are really the same process that is just delayed by the time required for pre-checking voting. In case of instant approval the mechanism is exactly the same. This is important from parachains compatibility standpoint since following the delayed upgrade requires the parachain to implement paritytech/cumulus#517. In case, PVF pre-checking process was concluded with rejection, then all the requesting operations get cancelled. For onboarding it means it gets without movement: the lifecycle of such parachain is terminated on the `Onboarding` state and after rejection the lifecycle is none. That in turn means that the caller can attempt registering the parachain once more. For upgrading it means that the upgrade process is aborted: that flashes go-ahead signal with `Abort` flag. Rejection leads to removing the allegedly bad validation code from the chain storage. Among other things, this implies that the operation can be re-requested. That allows for retrying an operation in case there was some bug. At the same time it does not look as a DoS vector due to the caching performed by the nodes. PVF pre-checking can be enabled and disabled. Initially, according to the changes in #4420, this mechanism is disabled. Triggering the PVF pre-checking when it is disabled just means that we insta approve the requesting operation. This should lead to the behavior being unchanged. Follow-ups: - expose runtime APIs * cargo run --quiet --release --features=runtime-benchmarks -- benchmark --chain=polkadot-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/polkadot/src/weights/runtime_parachains_paras.rs * cargo run --quiet --release --features=runtime-benchmarks -- benchmark --chain=westend-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/westend/src/weights/runtime_parachains_paras.rs * cargo run --quiet --release --features=runtime-benchmarks -- benchmark --chain=kusama-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/kusama/src/weights/runtime_parachains_paras.rs * cargo run --quiet --release --features runtime-benchmarks -- benchmark --chain=rococo-dev --steps=50 --repeat=20 --pallet=runtime_parachains::paras --extrinsic=* --execution=wasm --wasm-execution=compiled --heap-pages=4096 --header=./file_header.txt --output=./runtime/rococo/src/weights/runtime_parachains_paras.rs * Review fixes Co-authored-by: Parity Bot <[email protected]>
This commit hooks up the API provided by #4457 to the runtime API subsystem. In a following PR this API will be consumed by the PVF pre-checking subsystem. Co-authored-by: Chris Sosnin <[email protected]> Co-authored-by: Chris Sosnin <[email protected]>
Closes #3971 Read the linked issue. Apart from that, this addresses the concern raised in this [comment] by just adding a test. I couldn't find a clean way to reconcile a block number delay with a PVF voting TTL, so I just resorted to rely on the test. Should be fine for now. [comment]: #4457 (comment)
This commit hooks up the API provided by paritytech#4457 to the runtime API subsystem. In a following PR this API will be consumed by the PVF pre-checking subsystem. Co-authored-by: Chris Sosnin <[email protected]> Co-authored-by: Chris Sosnin <[email protected]>
This commit incorporates the changes made to the runtime in the following PRs: - paritytech#4408 - paritytech#4457 - paritytech#4540 - paritytech#4542 - paritytech#4581 Note that this PR does not include the description of the PVF pre-checker subsystem. This should be addressed within paritytech#4611 Co-authored-by: sandreim <[email protected]>
Closes paritytech#3971 Read the linked issue. Apart from that, this addresses the concern raised in this [comment] by just adding a test. I couldn't find a clean way to reconcile a block number delay with a PVF voting TTL, so I just resorted to rely on the test. Should be fine for now. [comment]: paritytech#4457 (comment)
Closes #3971 Read the linked issue. Apart from that, this addresses the concern raised in this [comment] by just adding a test. I couldn't find a clean way to reconcile a block number delay with a PVF voting TTL, so I just resorted to rely on the test. Should be fine for now. [comment]: paritytech/polkadot#4457 (comment)
Closes #4009
This is the most of the runtime-side change needed for #3211.
Here is how it works.
The PVF pre-checking can be triggered either by an upgrade or by
onboarding (i.e. calling
schedule_para_initialize
). The PVFpre-checking process is identified by the PVF code hash that is being
voted on. If there is already PVF pre-checking process running, then no
new PVF pre-checking process will be started. Instead, we just subscribe
to the existing one.
If there is no PVF pre-checking process running but the PVF code hash
was already saved in the storage, that necessarily means (I invite the
reviewers to double-check this invariant) that the PVF already passed
pre-checking. This is equivalent to instant approving of the PVF.
The pre-checking process can be concluded either by obtaining a
supermajority or if it expires.
Each validator checks the list of PVFs available for voting. The vote is
binary, i.e. accept or reject a given PVF. As soon as the supermajority
of votes are collected for one of the sides of the vote, the voting is
concluded in that direction and the effects of the voting are enacted.
Only validators from the active set can participate in the vote. The set
of active validators can change each session. That's why we reset the
votes each session. A voting that observed a certain number of sessions
will be rejected.
The effects of the PVF accepting depend on the operations requested it:
get scheduled and after passing 2 session boundaries they will be onboarded.
get scheduled very similarly to the existing process. Upgrades with
pre-checking are really the same process that is just delayed by the
time required for pre-checking voting. In case of instant approval the
mechanism is exactly the same. This is important from parachains
compatibility standpoint since following the delayed upgrade requires
the parachain to implement
Look at the upgrade go-ahead and restriction signals cumulus#517.
In case, PVF pre-checking process was concluded with rejection, then all
the requesting operations get cancelled. For onboarding it means it gets
without movement: the lifecycle of such parachain is terminated on the
Onboarding
state and after rejection the lifecycle is none. That inturn means that the caller can attempt registering the parachain once
more. For upgrading it means that the upgrade process is aborted: that
flashes go-ahead signal with
Abort
flag.Rejection leads to removing the allegedly bad validation code from the
chain storage. Among other things, this implies that the operation can
be re-requested. That allows for retrying an operation in case there was
some bug. At the same time it does not look as a DoS vector due to the
caching performed by the nodes.
PVF pre-checking can be enabled and disabled. Initially, according to
the changes in #4420, this mechanism is disabled. Triggering the PVF
pre-checking when it is disabled just means that we insta approve the
requesting operation. This should lead to the behavior being unchanged.
TODO:
UpgradeGoAheadSignal
is set inschedule_code_upgrade
it will be immediately reset innote_new_head
Follow-ups:
force_*
functions)include_pvf_check_statement
skip check-dependent-cumulus