Vision: Ergonomic multi-block operations #306
Comments
What we really want is async operations. We already have a proper abstraction for it, backed by years of experience, and I want to make sure we don't repeat the same mistakes and introduce more race-condition bugs. |
What kind of operations? For example iterators on
multi_block! {
    // migration code that will be split into multiple blocks.
    StorageMap::translate(t);
}
I think this is more complicated than having some kind of block number counter. |
The main reason we want to split some operations across multiple blocks is the weight limit, not just a business requirement (which I think is out of scope for this issue). Before a language natively supports async operations, we use callbacks to implement them. So it will be something like this:
fn do_work(i: u32) {
    // do something with i
}
fn get_work() -> Option<u32> {
    // return next work
}
fn get_and_do_work() -> Weight {
    if let Some(work) = get_work() {
        do_work(work);
        // queue this call again so the next item is picked up, now or in a later block
        multi_block_helper::execute_or_queue(Call::get_and_do_work);
        return weight_info::get_and_do_work(work);
    }
    // nothing left to do: only the lookup was paid for
    weight_info::get_work()
}
fn migration() {
    multi_block_helper::execute_or_queue(Call::get_and_do_work);
}
Later, if we somehow get async support and some weight integration, we can make it:
#[weight = weight_info::do_work(i)]
fn do_work(i: u32) {
    // do something with i
}
#[weight = weight_info::get_work()]
fn get_work() -> Option<u32> {
    // return next work
}
async fn migration() {
    while let Some(work) = get_work() {
        // check if remaining weight is enough, then execute, or defer it to next block
        do_work(work).await;
    }
    // code here will be executed after all the work is completed
} |
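The callback pattern above can be simulated outside a runtime. The following is a minimal sketch, assuming a plain `u64` weight and an in-memory queue; `MultiBlockTask`, `step` and `run_to_completion` are hypothetical names for illustration and are not FRAME APIs.

```rust
use std::collections::VecDeque;

/// Hypothetical weight unit (assumption); a real runtime uses a richer type.
type Weight = u64;

/// Pending work items plus a fixed per-item cost, standing in for
/// `get_work` / `do_work` from the comment above.
pub struct MultiBlockTask {
    pending: VecDeque<u32>,
    done: Vec<u32>,
    cost_per_item: Weight,
}

impl MultiBlockTask {
    pub fn new(items: impl IntoIterator<Item = u32>, cost_per_item: Weight) -> Self {
        assert!(cost_per_item > 0, "each item must cost some weight");
        Self { pending: items.into_iter().collect(), done: Vec::new(), cost_per_item }
    }

    /// Process as many items as fit into `limit` and return the weight consumed.
    /// Items that do not fit stay queued for the next block.
    pub fn step(&mut self, limit: Weight) -> Weight {
        let mut used = 0;
        while used + self.cost_per_item <= limit {
            match self.pending.pop_front() {
                Some(item) => {
                    self.done.push(item); // stand-in for `do_work(item)`
                    used += self.cost_per_item;
                }
                None => break,
            }
        }
        used
    }

    pub fn is_finished(&self) -> bool {
        self.pending.is_empty()
    }

    pub fn completed(&self) -> &[u32] {
        &self.done
    }
}

/// Drive the task one block at a time; returns the number of blocks it took.
pub fn run_to_completion(task: &mut MultiBlockTask, limit_per_block: Weight) -> u32 {
    let mut blocks = 0;
    while !task.is_finished() {
        let used = task.step(limit_per_block);
        assert!(used > 0, "per-block limit is too small to make progress");
        blocks += 1;
    }
    blocks
}
```

The point of the sketch is that the task itself carries the cursor (here the queue), so each block only needs to call `step` with whatever weight is left over.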
Okay, so there are a few other cases where multi-block or multi-call (a.k.a. async) operations would be useful: So it is not limited to migrations where this problem arises… |
It seems like this issue belongs on the FRAME board. I'd be very happy to take this on once we reach consensus on how to do it. While I was working on this here, it dawned on me how painful it is to stub out all the calls touching the storage. It would be really great to be able to lock particular storage items for writing and/or reading while the migration is running, at least as an intermediate solution. In case of the example above - it would be great to be able to lock It would be great if we could store the |
I don't think locking storage regions can work. The API would be really ugly without infallible storage operations. Maybe we can write the changes of a migration into a new storage transaction (aka changeset) and commit that once it's done. That would split the reading part (PoV-intensive) across multiple blocks and put the writing part into the last block. Ideally it just swaps two sub-tree hashes and does not copy anything. |
This would not work. How should that work for Parachains? The validation is always stateless. The changeset would also not be tracked as part of the state. |
I don't think it is possible to create a unified solution that solves multi-block operations for all use cases. Take migration as an example: there is no reason we have to write the code in such a way that the migration must complete in N blocks. Just make it a lazy migration. We have already implemented this lazy migration pattern in Substrate. For some other use cases, Acala has implemented an idle scheduler, and we currently use it to free up storage from removed EVM contracts. Similarly, this can be used for other time-insensitive multi-block operations. |
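One common reading of "lazy migration" is that each entry is upgraded the first time it is touched, so no single block has to pay for the whole map. A minimal sketch under that assumption, using an in-memory `HashMap` in place of runtime storage; `Stored`, `LazyMap` and the two layouts are hypothetical and not taken from Substrate or Acala.

```rust
use std::collections::HashMap;

/// Hypothetical old and new layouts of one storage value (assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum Stored {
    /// Old layout: a 32-bit balance only.
    V1(u32),
    /// New layout: a wider balance plus a nonce.
    V2 { balance: u64, nonce: u32 },
}

/// In-memory stand-in for a storage map that still holds old-format entries.
pub struct LazyMap {
    inner: HashMap<u32, Stored>,
}

impl LazyMap {
    /// Build a map in which every entry still uses the old layout.
    pub fn from_v1(entries: impl IntoIterator<Item = (u32, u32)>) -> Self {
        Self { inner: entries.into_iter().map(|(k, v)| (k, Stored::V1(v))).collect() }
    }

    /// Read an entry, upgrading it in place if it is still in the old layout.
    /// The caller only ever sees the new layout.
    pub fn get(&mut self, key: u32) -> Option<(u64, u32)> {
        let entry = self.inner.get_mut(&key)?;
        if let Stored::V1(balance) = *entry {
            *entry = Stored::V2 { balance: u64::from(balance), nonce: 0 };
        }
        match *entry {
            Stored::V2 { balance, nonce } => Some((balance, nonce)),
            Stored::V1(_) => unreachable!("entry was upgraded above"),
        }
    }

    /// Number of entries that have not been touched since the upgrade.
    pub fn unmigrated(&self) -> usize {
        self.inner.values().filter(|v| matches!(v, Stored::V1(_))).count()
    }
}
```

The trade-off is that every read path has to understand both layouts until `unmigrated()` reaches zero, which is where a background job such as an idle scheduler could finish the remainder.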
The original idea of this issue is being overlooked a bit: it is tailored towards providing an easy-to-use syntax for writing multi-block operations, much like OpenMP's directives or Rust's Rayon macros. This certainly won't be generalizable. If you want a generalizable solution, writing it from scratch using on-idle or on-initialize (as done by Acala's scheduler) is not super hard. Note: the origin of this issue is in the NPoS project because I hoped to implement #465 using this. |
FWIW, async operations would also be a game changer for XCM/XCMP. Right now the best we can do is have a dispatchable be called when an XCM reply arrives. Far, far better would be if we could do a multi-block |
I think it's realistic to be able to mostly suspend the chain while a multi-block migration completes. Not for every pallet (session/system/staking/babe/grandpa come to mind as non-suspendable), but almost all of the others should be fine to pause without risking bricking the chain. |
Yes. I have also come to a similar conclusion. A proper multi-block migration should involve migrating consensus-related storage items in the first block that uses the new runtime, and then all the other migrations can happen while the chain is paused. |
Sounds good. In the example that I mentioned before the situation with people being able to write to that particular storage absolutely does not work, and that's considering everything is happening on the relay chain only. |
We would block all extrinsics besides the inherents. Migrations will not run for that long, so it is fine to not include extrinsics for a short period of time. |
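The "block everything except inherents" rule can be sketched as a filter applied while a migration is still in progress. This is a simplified model, not the actual runtime mechanism; `Extrinsic`, `Chain` and `build_block` are hypothetical names, and one migration step per block is an assumption.

```rust
/// Hypothetical, simplified extrinsic type (assumption).
#[derive(Debug, Clone, PartialEq)]
pub enum Extrinsic {
    /// Block-author-provided data such as a timestamp; must always be included.
    Inherent(&'static str),
    /// A user transaction.
    Signed { who: u32, call: &'static str },
}

/// Minimal chain state: how many migration steps are still outstanding.
pub struct Chain {
    pub migration_steps_left: u32,
}

impl Chain {
    pub fn is_paused(&self) -> bool {
        self.migration_steps_left > 0
    }

    /// Build one block from `pool`. While paused, only inherents are included
    /// and one migration step is performed; everything else is handed back
    /// as deferred so it can be retried in a later block.
    pub fn build_block(&mut self, pool: Vec<Extrinsic>) -> (Vec<Extrinsic>, Vec<Extrinsic>) {
        let paused = self.is_paused();
        let (included, deferred): (Vec<_>, Vec<_>) = pool
            .into_iter()
            .partition(|xt| !paused || matches!(xt, Extrinsic::Inherent(_)));
        if paused {
            self.migration_steps_left -= 1; // stand-in for one migration step
        }
        (included, deferred)
    }
}
```

Deferred transactions are not lost in this model; they are simply returned to the caller, which matches the expectation that the pause only lasts for a short period.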
Would be great to see this going forward. Happy to lend a hand where I can, sounds like an interesting problem to tackle with a lot to gain. I'm assuming there isn't a good way to "pause" pallets now. As far as I understand that would require some sort of flag that would switch things off under the hood until a certain cached storage item gets set/unset. |
paritytech/substrate#12092 could be used I think. Or maybe we need some more fundamental pausing of all extrinsics, not sure. |
No, this should be exactly what we need/want. |
Looks sufficient to me too. |
I have started writing multi-block code on a few occasions already. There are multiple ways to abstract it, but this is the ultimate one:
This is merely an idea at this stage and there are a lot of caveats, which should be discussed in depth before any attempt at implementation. Once I have more concrete ideas about this, I will move it to the FRAME vision project.