Elastic scaling: runtime dependency tracking and enactment #3479

alindima · 2024-02-26T10:44:51Z

Changes needed to implement the runtime part of elastic scaling: #3131, #3132, #3202

Also fixes #3675

TODOs:

- [x] storage migration
- [x] optimise process_candidates from O(N^2)
- [x] drop backable candidates which form cycles
- [x] fix unit tests
- [x] add more unit tests
- [x] check the runtime APIs which use the pending availability storage. We need to expose all of them, see Add new staging Runtime API: candidates_pending_availability #3576
- [x] optimise the candidate selection. we're currently picking randomly until we satisfy the weight limit. we need to be smart about not breaking candidate chains while being fair to all paras - ParaInherent create: update apply_weight_limit wrt elastic scaling #3573

Relies on the changes made in #3233 in terms of the inclusion policy and the candidate ordering
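
To illustrate the TODO items about dropping candidates which form cycles and making candidate processing linear, here is a minimal sketch. It is not the runtime's code: the `Candidate` struct, its `u64` head "hashes" and the `valid_chain_prefix` function are hypothetical simplifications. It assumes the candidates of one para arrive already sorted in chain dependency order, and it walks them once, stopping at the first candidate that does not build on the previous one or that would revisit an already-seen head.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-in for a backed candidate of a single para.
/// Head data hashes are modelled as plain `u64` values for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    parent_head: u64,
    output_head: u64,
}

/// Returns how many leading candidates form a valid, acyclic chain on top of
/// `latest_head`. Everything after that prefix would have to be dropped.
/// Runs in a single pass over the candidates.
fn valid_chain_prefix(latest_head: u64, candidates: &[Candidate]) -> usize {
    // Every head that is already part of the chain, including the starting point.
    let mut seen: HashSet<u64> = HashSet::new();
    seen.insert(latest_head);
    let mut expected_parent = latest_head;

    for (idx, candidate) in candidates.iter().enumerate() {
        // The candidate must build directly on the previous head.
        if candidate.parent_head != expected_parent {
            return idx;
        }
        // `insert` returns false if the head was seen before, which means a cycle.
        if !seen.insert(candidate.output_head) {
            return idx;
        }
        expected_parent = candidate.output_head;
    }
    candidates.len()
}
```

With heads `1 -> 2 -> 3 -> 4` on top of head `1`, all three candidates are kept. With `1 -> 2` followed by `2 -> 1`, only the first candidate is kept, because the second one would lead back to a head that is already in the chain.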


sandreim

Logic looks good at first pass, but readability can certainly be improved.

I think we should also see if we can adjust apply_weight_limit candidate selection to account for elastic scaling.


sandreim · 2024-03-05T11:03:51Z

candidate_pending_availability runtime API is needed by collators. At first glance we might need to return all of them. ~~or maybe just the last one.~~

ordian

Overall looking good, left a couple of questions. Happy to approve once this is tested/burned-in.

ordian · 2024-03-06T04:15:38Z

polkadot/runtime/parachains/src/paras_inherent/mod.rs

-		// In `Enter` context (invoked during execution) there should be no backing votes from
-		// disabled validators because they should have been filtered out during inherent data
-		// preparation (`ProvideInherent` context). Abort in such cases.
-		if context == ProcessInherentDataContext::Enter {
-			ensure!(!votes_from_disabled_were_dropped, Error::<T>::BackedByDisabled);
-		}


why was this error removed? is it because it was merged into a generic CandidatesFilteredDuringExecution error? i liked the specificity of the previous errors more

I believe the original intention was to trade the specific errors for simplicity by using a catch-all approach. I will look and see whether we can keep it simple and still have specific errors, or whether logging these errors instead of returning them might achieve the same.

that's right. I generally added debug logs to the filtering functions called in sanitize_backed_candidates whenever a candidate is filtered, along with the reason why it was dropped.

Indeed, checking that filtering filtered nothing at the outermost level is the most robust way to check.
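
The approach discussed in this thread (log the reason inside each filter, then check once at the outermost level that nothing was filtered during execution) can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the code from the PR: candidates are plain `u32` values, the `sanitize` and `process` functions and the validity predicate are hypothetical, and only the names `ProcessInherentDataContext` and `CandidatesFilteredDuringExecution` are taken from the discussion above.

```rust
/// Simplified stand-in for the context in which inherent data is processed.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ProcessInherentDataContext {
    ProvideInherent,
    Enter,
}

/// Single catch-all error, as opposed to one error per filtering reason.
#[derive(Debug, PartialEq)]
enum Error {
    CandidatesFilteredDuringExecution,
}

/// Hypothetical filter: drops invalid candidates and logs why each one was dropped.
fn sanitize(candidates: Vec<u32>, is_valid: impl Fn(&u32) -> bool) -> Vec<u32> {
    candidates
        .into_iter()
        .filter(|candidate| {
            let keep = is_valid(candidate);
            if !keep {
                // The specific reason lives in the log rather than in the error type.
                eprintln!("dropping candidate {}: failed validity check", candidate);
            }
            keep
        })
        .collect()
}

/// Filters the candidates and, in the `Enter` context, aborts if anything was dropped.
fn process(
    context: ProcessInherentDataContext,
    candidates: Vec<u32>,
) -> Result<Vec<u32>, Error> {
    let count_before = candidates.len();
    // Hypothetical validity rule, used only to have something to filter on.
    let kept = sanitize(candidates, |c| c % 2 == 0);

    // During execution nothing should be filtered, because the block author
    // already filtered the data when preparing the inherent.
    if context == ProcessInherentDataContext::Enter && kept.len() != count_before {
        return Err(Error::CandidatesFilteredDuringExecution);
    }
    Ok(kept)
}
```

The trade-off is the one raised in the thread: the returned error no longer says which filter fired, so the debug logs become the place to look for the specific reason.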


alindima · 2024-03-19T08:15:34Z

Fixed the runtime API panic caused by #64 and reran benchmarks for westend and rococo

polkadot/runtime/parachains/src/runtime_api_impl/v7.rs

eskimor

Rococo weights seem off. Westend look good.


alindima · 2024-03-19T16:19:46Z

yeah, the weights for rococo are way off when comparing to the previous values. I'm betting that's because they were last updated in 2021. The difference for westend shows that there isn't a significant change


eskimor

Great work @alindima ! I couldn't help it and still had a few nits, but it is good to go!

eskimor · 2024-03-20T14:15:02Z

polkadot/runtime/parachains/src/paras_inherent/mod.rs

+		let freed = freed_concluded
+			.into_iter()
+			.map(|(c, _hash)| (c, FreedReason::Concluded))
+			.chain(freed_disputed.into_iter().map(|core| (core, FreedReason::Concluded)))


Not introduced here, but a third enum variant Disputed would have done no harm 😶‍🌫️ (and also no need to fix it here)
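
For illustration only, this is roughly what the suggested third variant could look like. The `Disputed` variant is hypothetical (the comment above explicitly says it does not need to be added in this PR), the `collect_freed` helper is made up for the example, other variants of the real enum are omitted, and `CoreIndex` and `CandidateHash` are reduced to integer aliases.

```rust
/// Simplified sketch of the freed-core reason with a hypothetical extra variant.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FreedReason {
    /// The candidate on the core became available and was enacted.
    Concluded,
    /// Hypothetical variant: the core was freed because its candidate was disputed.
    Disputed,
}

// Integer aliases standing in for the runtime's real types.
type CoreIndex = u32;
type CandidateHash = u64;

/// Merges both sources of freed cores, tagging each with its own reason.
fn collect_freed(
    freed_concluded: Vec<(CoreIndex, CandidateHash)>,
    freed_disputed: Vec<CoreIndex>,
) -> Vec<(CoreIndex, FreedReason)> {
    freed_concluded
        .into_iter()
        .map(|(core, _hash)| (core, FreedReason::Concluded))
        .chain(freed_disputed.into_iter().map(|core| (core, FreedReason::Disputed)))
        .collect()
}
```

Whether a separate variant is worthwhile depends on whether any consumer of the freed cores needs to treat disputed cores differently from concluded ones; the sketch only shows that the call site would stay the same shape.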

eskimor · 2024-03-20T14:26:06Z

polkadot/runtime/parachains/src/paras_inherent/tests.rs

+		// Cores 1, 2 and 3 are being made available in this block. Propose 6 more candidates (one
+		// for each core) and check that the right ones are successfully backed and the old ones
+		// enacted.
+		let config = default_config();


Given that we are not even sharing initialization, why is this not a separate test case?

to avoid long and dubious test names :D that's arguably a bad reason but I didn't think too much about it


sandreim

Amazing work @alindima

…h#3479)

Changes needed to implement the runtime part of elastic scaling: paritytech#3131, paritytech#3132, paritytech#3202

Also fixes paritytech#3675

TODOs:

- [x] storage migration
- [x] optimise process_candidates from O(N^2)
- [x] drop backable candidates which form cycles
- [x] fix unit tests
- [x] add more unit tests
- [x] check the runtime APIs which use the pending availability storage. We need to expose all of them, see paritytech#3576
- [x] optimise the candidate selection. we're currently picking randomly until we satisfy the weight limit. we need to be smart about not breaking candidate chains while being fair to all paras - paritytech#3573

Relies on the changes made in paritytech#3233 in terms of the inclusion policy and the candidate ordering

---------

Signed-off-by: alindima <[email protected]>
Co-authored-by: command-bot <>
Co-authored-by: eskimor <[email protected]>

On top of #5082.

## Background

Previously, before #3479, we would [include](https://github.com/paritytech/polkadot-sdk/blame/75074952a859f90213ea25257b71ec2189dbcfc1/polkadot/runtime/parachains/src/builder.rs#L508C12-L508C44) the cost of enacting the candidate in the cost of processing a single bitfield. [Now](https://github.com/paritytech/polkadot-sdk/blame/dd48544a573dd02da2082cec1dda7ce735e2e719/polkadot/runtime/parachains/src/builder.rs#L529) it is different, although the benchmarks seem to be out of date. Including the cost of enacting a candidate in the cost of processing a single bitfield was incorrect, since we multiply that by the number of bitfields we have. Instead, we should separately calculate the cost of processing a single bitfield without enactment, and multiply the cost of enactment by the actual number of processed candidates (which is limited by the number of cores, not validators).

## Bench

Previously, the weight of `enact_candidate` was calculated manually (without a benchmark) and then neglected: https://github.com/paritytech/polkadot-sdk/blob/dd48544a573dd02da2082cec1dda7ce735e2e719/polkadot/runtime/parachains/src/inclusion/mod.rs#L584

In this PR, we have a benchmark for it and it's based on the number of ump and sent hrmp messages as well as whether the candidate has a runtime upgrade (new_validation_code). The differences from the previous attempt paritytech/polkadot#6929 are that
* we don't include the cost of enactment in the cost of processing a backed candidate. The reason is that enactment does not happen in the same block as backing (typically it happens in the next one), since we process bitfields before backing votes.
* we don't take into account the size of the runtime upgrade; the benchmark weight doesn't seem to depend much on it, but rather on whether there was one or not.

Similarly to the previous attempt, we don't account for dmp messages (fixed cost). Also, we don't account properly for received hrmp messages (hrmp_watermark) because their cost depends on the runtime state and can't be statically deduced in the benchmark (unless we pass the information about channels as benchmark u32 arguments).

The total weight cost of processing a parainherent now includes the cost of enactment of each candidate, but we don't do filtering based on that (because we enact after processing bitfields and making other changes to the storage).

## Numbers

```
Reads = 7 + (0 * u) + (3 * h) + (8 * c)
Writes = 10 + (1 * u) + (3 * h) + (7 * c)
```

In addition, there is a fixed cost of a few ms (!) per candidate.

This might result in a full block slightly overflowing its weight with 200 enacted candidates, which in turn could prevent non-mandatory transactions from being included in a block.

Given our modest limits on max ump and hrmp messages:

```
maxUpwardMessageNumPerCandidate: 16
hrmpMaxMessageNumPerCandidate: 10
```

and the fact that runtime upgrades can't happen very frequently (`validation_upgrade_cooldown`), we might only go over the limits in case of many disputes.

TODOs:
- [x] Fix the overweight test
- [x] Generate the weights for Westend and Rococo
- [x] PRDoc

---------

Co-authored-by: command-bot <>
Co-authored-by: Alin Dima <[email protected]>
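
As a worked example of the numbers in the #5270 description, the sketch below evaluates the Reads/Writes formulas and contrasts the two ways of accounting for enactment. Several things here are assumptions rather than facts from the PR: `u` is read as the number of ump messages, `h` as the number of sent hrmp messages, and `c` as 1 if the candidate carries a runtime upgrade and 0 otherwise; the function names and the abstract cost units are made up for the illustration.

```rust
/// Storage reads for enacting one candidate: 7 + (0 * u) + (3 * h) + (8 * c).
/// The coefficient of `u` is zero, so the argument is accepted but unused.
fn enact_reads(_u: u64, h: u64, c: u64) -> u64 {
    7 + 3 * h + 8 * c
}

/// Storage writes for enacting one candidate: 10 + (1 * u) + (3 * h) + (7 * c).
fn enact_writes(u: u64, h: u64, c: u64) -> u64 {
    10 + u + 3 * h + 7 * c
}

/// Old accounting: the enactment cost is folded into every bitfield, so it is
/// multiplied by the number of bitfields.
fn total_with_enactment_per_bitfield(bitfields: u64, bitfield_cost: u64, enact_cost: u64) -> u64 {
    bitfields * (bitfield_cost + enact_cost)
}

/// Accounting described above: bitfields and enacted candidates are charged
/// separately, each multiplied by its own count.
fn total_with_separate_enactment(
    bitfields: u64,
    bitfield_cost: u64,
    candidates: u64,
    enact_cost: u64,
) -> u64 {
    bitfields * bitfield_cost + candidates * enact_cost
}
```

Under these assumptions, a candidate at the quoted limits (`u = 16`, `h = 10`) with a runtime upgrade (`c = 1`) costs 45 reads and 63 writes; without an upgrade it costs 37 reads and 56 writes. With made-up unit costs of 1 per bitfield and 5 per enactment, 1000 bitfields and 200 enacted candidates come to 6000 units under the old accounting and 2000 units when enactment is charged per candidate.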

Initial draft changes

442185e

alindima marked this pull request as draft February 26, 2024 10:44

alindima added labels T8-polkadot (This PR/Issue is related to/affects the Polkadot network.) and I5-enhancement (An additional feature request.) Feb 26, 2024

alindima added 9 commits February 26, 2024 16:45

bugfixes

799fabe

Signed-off-by: alindima <[email protected]>

filter descendants of disputed candidates

9f3ba62

also no need to sort by core index any more

some simplifications

5b13eb7

assert that candidates of a para are sorted in chain dependency order. optimise process_candidates to be O(N)

ecc5088

deduplicate some of the logic for freeing cores

0fb7b8c

update some comments

0ed6fc3

unify dropped candidates errors

0a08fa0

add more logs

9e51e20

remove some todos

4e154e6

sandreim reviewed Feb 28, 2024


alindima added 12 commits February 29, 2024 10:07

review comments

ccef35b

some more nits

3f67898

Merge remote-tracking branch 'origin/master' into alindima/elastic-scaling-runtime

c4fedd7

add runtime migration to inclusion storage

58b4129

add migration tests

23a8d34

don't allow candidate cycles

e0d9dff

fix bug

0198fb3

make tests compile and make paras_inherent tests pass

267fa05

fix inclusion tests

7733b25

fix some more tests

c32f925

clippy

40e2933

Merge remote-tracking branch 'origin/master' into alindima/elastic-scaling-runtime

e8f5d2c

sandreim mentioned this pull request Mar 5, 2024

ParaInherent create: update apply_weight_limit wrt elastic scaling #3573

Merged

sandreim mentioned this pull request Mar 5, 2024

Add new staging Runtime API: candidates_pending_availability #3576

Closed

ordian reviewed Mar 6, 2024

Merge remote-tracking branch 'origin/master' into alindima/elastic-scaling-runtime

1b70b11

fix clippy

c3060bd

sandreim reviewed Mar 19, 2024

polkadot/runtime/parachains/src/runtime_api_impl/v7.rs

eskimor reviewed Mar 19, 2024

polkadot/runtime/parachains/src/paras_inherent/mod.rs (outdated)

alindima added 4 commits March 20, 2024 13:04

map_candidates_to_cores: check core index for single core if ElasticScalingMVP is enabled

9f52b8d

add a couple more test cases

c27de6a

review comment

e05693c

Merge branch 'master' into alindima/elastic-scaling-runtime

bb76778

eskimor approved these changes Mar 20, 2024


alindima added 2 commits March 21, 2024 10:40

improve test

91f705c

Merge remote-tracking branch 'origin/master' into alindima/elastic-scaling-runtime

0c8dc20

sandreim mentioned this pull request Mar 21, 2024

Add new variant of persisted_validation_data runtime API for elastic scaling #3776

Closed

sandreim approved these changes Mar 21, 2024


Merge branch 'master' into alindima/elastic-scaling-runtime

b8d2212

eskimor enabled auto-merge March 21, 2024 09:47

eskimor added this pull request to the merge queue Mar 21, 2024

Merged via the queue into master with commit 4842faf Mar 21, 2024
127 of 132 checks passed

eskimor deleted the alindima/elastic-scaling-runtime branch March 21, 2024 10:44

ordian mentioned this pull request Aug 7, 2024

inclusion: bench enact_candidate weight #5270

Merged

3 tasks

TDemeco mentioned this pull request Sep 27, 2024

feat: ⏫ upgrade to Polkadot SDK v1.10.0 Moonsong-Labs/storage-hub#210

Merged
