Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add randomness to pallet-collator-assignment #267

Merged
merged 64 commits into from
Oct 19, 2023

Conversation

tmpolaczyk
Copy link
Contributor

@tmpolaczyk tmpolaczyk commented Sep 13, 2023

Add config param to pallet_collator_assigment that triggers a full rotation of all the assigned collators.

Full rotation is only triggered every 24 sessions (configurable). For the other sessions we keep the old behavior, except that the collators and the container chain lists are shuffled, so container chains with a low para id no longer have priority.

Collators may still rotate on normal sessions in the same cases as before this PR: if another collator leaves, or if a container chain joins or leaves.

The random shuffle does not happen in session 0, and it doesn't happen in rust integration tests unless the random seed is explicitly set. Zombienet tests use real randomness from the relay and there is no way to mock that randomness, so collator assignment will change every time these tests are run. Most of the tests are unit tests because there it is easier to control the randomness.

The full_rotation_period is configurable in pallet_configuration, the default value is 24 sessions but it can be changed. A value of 0 disables collator rotation.

@girazoki
Copy link
Collaborator

key to use: CURRENT_BLOCK_RANDOMNESS

@github-actions
Copy link
Contributor

github-actions bot commented Sep 14, 2023

Master coverage: 70.58%
Coverage generated "Thu Oct 19 12:11:39 UTC 2023":
https://d3gkbsry1ehhqi.cloudfront.net/tanssi-coverage/pulls/267/index.html
Pull coverage: 70.79%

tmpolaczyk and others added 23 commits September 14, 2023 18:41
Breaks zombienet tests because the collator2002 is not assigned to
container chain 2002
Because a different collator may be assigned to container 2002
There was a bug because we tried to read an Option<Hash> as a Hash, and
this results in the first byte of the [u8; 32] being always 1, which is
the marker for "Some"
Rotating every session does not leave enough time for collators to
detect their assignment, and some tests result in timeout
/// Randomness from previous block. Used to shuffle collators on session change.
/// Should only be set on the last block of each session and should be killed on the on_initialize of the next block.
/// The default value of [0; 32] disables randomness in the pallet.
#[pallet::storage]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I like this, I am also missing a way to disable it. This probably cannot serve as a kill switch, as it is set on_finalize of the last block before the session, so we would need to set it 0 on that block.

I think we can have an additional storage that serves as s kill switch, any opposition for this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ignore my comment, I forgot that we can do this by putting a very large rotation period

fn migrate(&self, _available_weight: Weight) -> Weight {
log::info!(target: LOG_TARGET, "migrate");

Configuration::set_full_rotation_period(RuntimeOrigin::root(), 24u32)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dont think this is going to work. set full rotation period schedules the configuration change, but wont be applied until after 2 sessions. During those two sessions the configuration wont be readible in the system, as the metadata is not the same

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to read the existing values in currentConfig and pendingConfigs, and modify so that it accepts the new structure

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some examples in paritytech/polkadot-sdk@3bbb336#diff-39d3b12d624dbf61df8f5ea6fc18a9f12611347f4d730388d6a96deafbba5a71, although I am not sure they are super clear. I think its better to just read whatever value it exists there with the previous struct format, and you just add the new parameter

Copy link
Collaborator

@girazoki girazoki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the PR is mergable provided that we mofidy the migration, good job!

/// Run a standard pre-runtime test. This works the same way as in a normal runtime upgrade.
#[cfg(feature = "try-runtime")]
fn pre_upgrade(&self) -> Result<Vec<u8>, sp_runtime::DispatchError> {
let zero_period = Configuration::config().full_rotation_period;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should indeed panic with try-runtime no? the config does not still have full_rotation_period

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am assuming it defaults to 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case what happens is that Configuration::config() returns the default value, I am assuming because it is a ValueQuery. The default value has full_rotation_period = 24, so this test was wrong.

@tmpolaczyk tmpolaczyk merged commit 850faaf into master Oct 19, 2023
23 checks passed
@tmpolaczyk tmpolaczyk deleted the tomasz-shuffle-collator-assignement branch October 19, 2023 12:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants