Enforce migrations to be standalone scripts #296
Would this make migrations look more or less akin to database migrations for a server backend? I think that should ideally be what we're aiming for here.
Totally.
I really like the idea of migrations not depending on the pallet code. However, I'd like to chip in with my 2 cents on the subject before we go down this particular rabbit hole.

First of all, this will increase the bloat, because we'll need to redefine the structs used for the storage. In the case of one or two structs that's kind of okay; when this becomes more complex, with some nesting levels, it's a different game.

Secondly, it's going to become harder for newcomers to write migrations efficiently, especially when we start talking about more complex scenarios, which I'm going to provide as examples below.

Last, but not least, it's kind of nice to have compile-time guarantees when writing migrations with more or less complicated structs involved: when we reuse the storage structs from pallets, we know for sure the struct resembles the storage. It's not that big of a deal, as each migration needs to be tested anyway, but it still adds to the overall migration testing/implementation burden.

Some examples of migrations that no longer work because the code has changed:
```rust
pub mod v8 {
    use super::*;
    use crate::{Config, Nominators, Pallet, Weight};
    use frame_election_provider_support::SortedListProvider;
    use frame_support::traits::Get;

    #[cfg(feature = "try-runtime")]
    pub fn pre_migrate<T: Config>() -> Result<(), &'static str> {
        frame_support::ensure!(
            StorageVersion::<T>::get() == ObsoleteReleases::V7_0_0,
            "must upgrade linearly"
        );
        crate::log!(info, "👜 staking bags-list migration passes PRE migrate checks ✅",);
        Ok(())
    }

    /// Migration to sorted `VoterList`.
    pub fn migrate<T: Config>() -> Weight {
        if StorageVersion::<T>::get() == ObsoleteReleases::V7_0_0 {
            crate::log!(info, "migrating staking to ObsoleteReleases::V8_0_0");
            // We will no longer have the bags-list pallet instance here, so we'd have to
            // reimplement this functionality, which relies on several methods and storage
            // structs that are called under the hood.
            let migrated = T::VoterList::unsafe_regenerate(
                // replicate the `Nominators` storage struct
                Nominators::<T>::iter().map(|(id, _)| id),
                Pallet::<T>::weight_of_fn(),
            );
            // still need to call this here or in `post_migrate`
            debug_assert_eq!(T::VoterList::try_state(), Ok(()));
            StorageVersion::<T>::put(ObsoleteReleases::V8_0_0);
            crate::log!(
                info,
                "👜 completed staking migration to ObsoleteReleases::V8_0_0 with {} voters migrated",
                migrated,
            );
            T::BlockWeights::get().max_block
        } else {
            T::DbWeight::get().reads(1)
        }
    }

    #[cfg(feature = "try-runtime")]
    pub fn post_migrate<T: Config>() -> Result<(), &'static str> {
        T::VoterList::try_state().map_err(|_| "VoterList is not in a sane state.")?;
        crate::log!(info, "👜 staking bags-list migration passes POST migrate checks ✅",);
        Ok(())
    }
}
```
```rust
pub mod v9 {
    use super::*;
    #[cfg(feature = "try-runtime")]
    use frame_support::codec::{Decode, Encode};
    #[cfg(feature = "try-runtime")]
    use sp_std::vec::Vec;

    /// Migration implementation that injects all validators into sorted list.
    ///
    /// This is only useful for chains that started their `VoterList` just based on nominators.
    pub struct InjectValidatorsIntoVoterList<T>(sp_std::marker::PhantomData<T>);

    impl<T: Config> OnRuntimeUpgrade for InjectValidatorsIntoVoterList<T> {
        fn on_runtime_upgrade() -> Weight {
            if StorageVersion::<T>::get() == ObsoleteReleases::V8_0_0 {
                // need to replicate the bags-list storage and some of the
                // `SortedListProvider` methods
                let prev_count = T::VoterList::count();
                let weight_of_cached = Pallet::<T>::weight_of_fn();
                // replicating the `Validators` storage struct
                for (v, _) in Validators::<T>::iter() {
                    let weight = weight_of_cached(&v);
                    // a bunch more `VoterList` action that needs to be taken into account
                    let _ = T::VoterList::on_insert(v.clone(), weight).map_err(|err| {
                        log!(warn, "failed to insert {:?} into VoterList: {:?}", v, err)
                    });
                }
                log!(
                    info,
                    "injected a total of {} new voters, prev count: {} next count: {}, updating to version 9",
                    Validators::<T>::count(),
                    prev_count,
                    T::VoterList::count(),
                );
                StorageVersion::<T>::put(ObsoleteReleases::V9_0_0);
                T::BlockWeights::get().max_block
            } else {
                log!(
                    warn,
                    "InjectValidatorsIntoVoterList being executed on the wrong storage \
                    version, expected ObsoleteReleases::V8_0_0"
                );
                T::DbWeight::get().reads(1)
            }
        }

        #[cfg(feature = "try-runtime")]
        fn pre_upgrade() -> Result<Vec<u8>, &'static str> {
            frame_support::ensure!(
                StorageVersion::<T>::get() == ObsoleteReleases::V8_0_0,
                "must upgrade linearly"
            );
            let prev_count = T::VoterList::count();
            Ok(prev_count.encode())
        }

        #[cfg(feature = "try-runtime")]
        fn post_upgrade(prev_count: Vec<u8>) -> Result<(), &'static str> {
            // `try_state` call
            let prev_count: u32 = Decode::decode(&mut prev_count.as_slice()).expect(
                "the state parameter should be something that was generated by pre_upgrade",
            );
            let post_count = T::VoterList::count();
            let validators = Validators::<T>::count();
            assert!(post_count == prev_count + validators);
            frame_support::ensure!(
                StorageVersion::<T>::get() == ObsoleteReleases::V9_0_0,
                "must upgrade"
            );
            Ok(())
        }
    }
}
```

In the case of replicating, say, the bags-list pallet's storage and methods, we'd have to replicate several pretty intricate methods along with their corresponding structs:
And this is just the tip of the iceberg. We could also take a look at a simple migration, like that one, which is already quite bloated, and it's going to be double the bloat. Long story short, I'm not sure the added complexity and time spent implementing migrations is worth being pallet-agnostic. Why? Breaking older migration code is a minor concern: it can be commented out and/or kept as a code example for posterity. Running prior migrations is not a problem either, just use release branches / WebAssembly code from older versions. It's a bit of a pain, but I don't see this happening often either. Happy to be proven wrong.
You are forgetting that Polkadot isn't the only user of Substrate. For staking this may currently be true, but most of the other pallets are being used by a lot of parachains etc. Your complaints are nevertheless correct, and we should work on this. We need to identify common patterns and then provide abstractions/macros/tools to help with that.
paritytech/substrate#13107 tries to abstract away the redundant version tracking and logging.
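The idea behind that abstraction (roughly the role `frame_support`'s `VersionedMigration` wrapper plays nowadays) is to factor out the version check, bump, and logging that both migrations above repeat by hand. A minimal standalone sketch of the pattern, with all Substrate types replaced by plain placeholders (`Storage`, `UncheckedMigration`, `Versioned` are invented names, and weights are bare `u64`s):

```rust
// Simplified stand-in for on-chain state: just the pallet's storage version.
pub struct Storage {
    pub version: u32,
}

// Stand-in for a migration's core logic, stripped of version bookkeeping.
pub trait UncheckedMigration {
    const FROM: u32; // version this migration expects on chain
    const TO: u32;   // version it leaves behind
    fn migrate(storage: &mut Storage) -> u64; // returns consumed weight
}

// The wrapper: run the inner migration only when the on-chain version
// matches `FROM`, then bump it to `TO`. This is the boilerplate every
// migration in the thread re-implements by hand.
pub struct Versioned<M>(core::marker::PhantomData<M>);

impl<M: UncheckedMigration> Versioned<M> {
    pub fn on_runtime_upgrade(storage: &mut Storage) -> u64 {
        if storage.version == M::FROM {
            let weight = M::migrate(storage);
            storage.version = M::TO;
            weight + 1 // +1 for the version write
        } else {
            1 // wrong version: only the version read was consumed
        }
    }
}
```

With this in place, a migration only implements `migrate`, and running it twice is automatically a no-op.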
Uhf… yeah, that is annoying. Having mutation functions on stored structs makes this much more complicated. I don't see how we could circumvent that besides not using that pattern in the first place, which is sub-optimal, since it is in itself a good pattern.
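To make that coupling concrete: if a migration calls a mutation method on the stored struct, any later change to that method silently changes what the old migration does when replayed. A toy illustration (all names invented, not from any real pallet):

```rust
#[derive(Debug, PartialEq)]
pub struct Pool {
    pub points: u64,
}

impl Pool {
    // Business-logic mutation the pallet uses at runtime. If a later
    // release changes this formula, every old migration that called it
    // now migrates state differently than it did when it shipped.
    pub fn slash(&mut self, amount: u64) {
        self.points = self.points.saturating_sub(amount);
    }
}

// A migration that leans on the live method: convenient, but coupled
// to whatever `slash` means at compile time.
pub fn migrate_coupled(pool: &mut Pool) {
    pool.slash(10);
}

// The standalone alternative: inline a frozen copy of the logic as it
// existed when the migration was written, at the cost of duplication.
pub fn migrate_standalone(pool: &mut Pool) {
    pool.points = pool.points.saturating_sub(10);
}
```

Today the two behave identically; the point is that only the second one is guaranteed to keep behaving this way after the pallet evolves.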
For now, if anybody wants to migrate from very old storage, they can use old release branches to gradually run whatever migrations they need; git blame will help them find the right release. I know it's not ideal, but it's definitely doable.
If I'm being honest, I don't see why the actual library code should suffer for migration convenience. If anything, it should be the other way around. Migrations are simply one-off helpers that serve their purpose once. It's a bit annoying having to potentially use several different release branches to upgrade gradually, but it can be done and it's not that hard. Maybe we should provide some sort of convenience tooling for that instead?
Now that I re-read the original issue, it's a bit clearer to me what the purpose of this is: migration automation. Still, I can't see how this could be easily achieved: decoupling migrations from pallets is tricky enough, but we also still need to specify the correct storage version of each pallet should anyone start from the current codebase. And if this is wrong and somehow slips into master, that'll potentially be a huge pain, as things would fail while trying to roll those migrations out automatically.

It seems like manual control over migrations, similar to what we have now, having to add them to individual runtimes, might be a better option. It's a bit more work, but it gives more fine-grained control over what's happening, since we need to compile a new runtime anyway in order to introduce any new code.

I've had similar experiences with SQL migrations on various projects. If the migration is complex enough, it might be a multi-step one that needs to be run and monitored manually. Even if it's not that complex, it's almost always a dangerous pattern to let it run automatically or via CI. In our case the automation would come from simply compiling a new runtime with the new migrations in place.

All of the above concerns might be good to consider in order to see how we could mitigate some, if not all, of them while still providing a reasonable level of automation.
I think we should not settle this yet and, as Basti said, try to find patterns that are common and help migration scripts live longer. One of them is this: Which means:
Using, your PR which sparked this discussion can be solved fairly easily (paritytech/substrate@2c6275e), as the only breaking change is the introduction of

That being said, this is still not perfect because, i.e., if the definition of

TL;DR: I don't have a plan here, but I am not at a spot where I can say screw the old migrations and let's move on, either. If Substrate wants to be the ultimate, user-friendly framework that it wants to be, it must solve this issue, and I am down to put more effort into it.

@ruseinov as for your comments in #296, these are more about automation, which is the end goal, but not the immediate issue. We first have to make migration scripts reusable, then think about automation, so I will not comment further on that for the time being.
That's not a bad idea nonetheless. But again, if you look at the fixes needed in paritytech/substrate@2c6275e, it is actually not that much overhead in most cases to keep the migrations sane. The idea that I am now eager to implement and show to all of you as a PoC is this: I will migrate all of the migrations of the nomination-pools pallet to be 100% independent of the runtime in some PR, and we can then see what the outcome looks like. If someone else has the time, please go ahead.
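The struct-duplication half of "100% independent of the runtime" would look roughly like this standalone sketch (the pool types and field names are illustrative, not from the real nomination-pools pallet): the migration module carries its own frozen copy of the old struct and translates it into the new shape, so later refactors of the pallet's types cannot break it.

```rust
// The pallet's *current* type, free to evolve after this migration ships.
#[derive(Debug, PartialEq)]
pub struct BondedPool {
    pub points: u64,
    pub member_count: u32,
}

pub mod v2 {
    use super::BondedPool;

    // Frozen copy of the struct as it existed at the old storage version.
    // Duplicated here on purpose: the migration must keep decoding the old
    // layout even if the pallet's own type changes again later.
    pub struct OldBondedPool {
        pub points: u64,
        pub members: Vec<u64>, // v2 stored the full member list
    }

    // Translate the old layout into the new one. In a real migration this
    // would be a `translate` over the storage map rather than an
    // in-memory call.
    pub fn migrate(old: OldBondedPool) -> BondedPool {
        BondedPool {
            points: old.points,
            member_count: old.members.len() as u32,
        }
    }
}
```

The cost is exactly the bloat discussed earlier in the thread: every migration re-declares the types it touches.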
Let us take a step back and think about what would be the most convenient and easiest way to deliver migrations to our users. I think it is the old way of bundling them with the pallet: the pallet knows its current version and the version in storage, and it runs all necessary migrations. The runtime just calls into all pallets for migrations on upgrade. Everything just works. But there are reasons why we are no longer doing it this way. Let me see if I got all of them:
I think we should find ways to address both issues instead of dumping this hugely complicated topic onto our users.
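The "old way" described above can be sketched as a simple loop: the pallet compares its compiled-in current version with the on-chain one and applies each step in order. Everything below is illustrative plain Rust, not FRAME code; the version numbers and step descriptions are made up.

```rust
pub const CURRENT_VERSION: u32 = 3;

pub struct Chain {
    pub on_chain_version: u32,
    pub log: Vec<String>,
}

// Apply every migration between the stored version and the current one,
// in order, so a chain that is several releases behind still catches up.
pub fn on_runtime_upgrade(chain: &mut Chain) {
    while chain.on_chain_version < CURRENT_VERSION {
        let from = chain.on_chain_version;
        match from {
            0 => chain.log.push("v0 -> v1: rename storage item".into()),
            1 => chain.log.push("v1 -> v2: re-encode struct".into()),
            2 => chain.log.push("v2 -> v3: inject validators".into()),
            _ => unreachable!("loop bound keeps `from` < CURRENT_VERSION"),
        }
        chain.on_chain_version = from + 1;
    }
}
```

The convenience is obvious: the runtime calls one entry point and the pallet does the rest, including being a no-op once up to date. The downsides are the ones listed above: all old migration code stays bundled with the pallet forever.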
This is not documented in the issue, but I think @kianenigma is no longer that big of a fan of this idea, as discussed during the FRAME steering meet. Or was it the Staking meet? I don't remember. Anyhow, since we don't have a unified way/language (SQL-like) to deal with this, and a lot of migrations really need to mimic the business logic of the pallets whose storage they touch, it seems like the best approach is to do what we've done before.
It seems like a good idea to clean up old migrations and let users deal with this by gradually upgrading, if they need to. But yeah, the UX of this is not amazing.
Follow-up on the discussion here: paritytech/substrate#10233 (comment). Currently, we keep some migration code in the codebase, and even the code that we do keep is pretty much useless. Given this, our support for historical migrations is basically non-existent.
I propose we enforce all migrations to be self-contained henceforth. This means that they should not depend on anything from the pallet.
If they need a struct, they must duplicate that version of it in their own local scope. If they need a particular storage item, they can mimic it using `generate_storage_alias`.

Next to this, we are moving toward banning `on_runtime_upgrade` inside the pallet (paritytech/substrate#8713). This is aligned with this issue as well. To begin with, migrations should be controlled explicitly by the top-level runtime.

Once we have all of this, I believe we can make migrations fully automated. Each migration will have a u32 id. Each pallet will have an on-chain u32 identifier, akin to the current storage version. Upon each runtime upgrade, any migrations that have been added to the repository will be executed. These migrations need to be added to some static registry.
Regardless of whether we realize the last statement or not, in order for the migration code that we keep in Substrate to have any utility, we had better start by making it independent of the pallet. Some examples of this already exist (`pallet-elections-phragmen`).
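The automation described in the last paragraphs could look roughly like this standalone sketch: a static registry of `(id, migration)` pairs per pallet, where on upgrade every migration with an id above the pallet's on-chain identifier runs and the identifier advances. All names are hypothetical, and storage is modeled as a plain `Vec<u32>`:

```rust
// A registered migration: a monotonically increasing id plus the code to run.
pub struct RegisteredMigration {
    pub id: u32,
    pub run: fn(&mut Vec<u32>),
}

// Toy migration bodies; a real one would transform pallet storage.
fn m1(state: &mut Vec<u32>) { state.push(1); }
fn m2(state: &mut Vec<u32>) { state.push(2); }
fn m3(state: &mut Vec<u32>) { state.push(3); }

// Static registry for one pallet; new migrations are appended over time
// and old entries are never edited.
pub static REGISTRY: &[RegisteredMigration] = &[
    RegisteredMigration { id: 1, run: m1 },
    RegisteredMigration { id: 2, run: m2 },
    RegisteredMigration { id: 3, run: m3 },
];

// Run everything newer than the pallet's on-chain id, advancing the id as
// each migration completes. Returns how many migrations were executed.
pub fn run_pending(on_chain_id: &mut u32, state: &mut Vec<u32>) -> u32 {
    let start = *on_chain_id;
    let mut executed = 0;
    for m in REGISTRY.iter().filter(|m| m.id > start) {
        (m.run)(state);
        *on_chain_id = m.id;
        executed += 1;
    }
    executed
}
```

This also shows the failure mode raised earlier in the thread: if the on-chain identifier is ever wrong, the wrong subset of migrations runs automatically, which is why some commenters prefer keeping the selection manual per runtime.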