Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Assert Storage Key & Value Invariance #13037

Closed
mustermeiszer opened this issue Dec 30, 2022 · 5 comments
Closed

Assert Storage Key & Value Invariance #13037

mustermeiszer opened this issue Dec 30, 2022 · 5 comments

Comments

@mustermeiszer
Copy link
Contributor

mustermeiszer commented Dec 30, 2022

The reason for this issue is that silent changes in the scale-coding of types -- through changes in the types themselves, or unlikely, changes in the codec -- used for storage keys and values worry me more and more. Even more, since the latest migrations in Substrate mostly were triggered via OnRuntimeUpgrade instead of hooks.

Proposal

My proposal for a solution would be the following. I have not coded this already, just checked that the traits work and the static assertion fails, but I am not sure if an auto-derive is trivial and so forth.

  • Codec provides 2 additional traits + auto-derive for StaticEncode

    #[const_trait]
    trait StaticEncode<const BYTES: usize> {
        fn zeroed_and_uninit() -> [u8; BYTES];
    }
    
    #[const_trait]
    trait Lock<T: ~const StaticEncode<BYTES>, const BYTES: usize> {
        const LOCK: [u8; BYTES];
        
        fn check_lock() {
            let current = T::zeroed_and_uninit();
            let lock = Self::LOCK;
            
            let mut i = 0;
            while i < BYTES {
                assert!(current[i] == lock[i], "Static encoding does not match lock!");
                i += 1;
            }
        }
    }
  • Storage items of pallets automatically implement the following impl block
    Users can opt-out of this by adding pallet::without_storage_lock

    // Assuming `StorageMap<T::Key, T::Value>`
    impl<T: Config>  Pallet<T> 
    where 
       T: Lock<<T as Config>::Key, BYTES>,
       T: Lock<<T as Config>::Value, BYTES>,
    {
       const _: () =  { <T as Lock<T as Config>::Key, BYTES>::check_lock(); };
       const _: () =  { <T as Lock<T as Config>::Value, BYTES>::check_lock(); };
    } 
  • Runtimes must implement Lock<T> for every T that is a key or a value

Advantages

  • Compilation ensures that codec of storage relevant types has not changed
  • Dependencies are also checked automatically → no silent errors here

Problems

  • Getting the BYTES for associated types in pallets needs to be constant but runtime specific
  • Overhead for runtimes
  • Is there a "static" representation for every codec object that can be storage key or value?
    For enums it would be needed to actually encode all variants in the "static" representation in order to ensure no variant has changed its encoding.
    • Other relevant things to pay attention to?

Edits:

  • Typos
  • Changed T (i.e. the runtime) to implement the locks
@bkchr
Copy link
Member

bkchr commented Dec 31, 2022

I don't really get how this should work.

The following idea goes into the direction you want to achieve here: paritytech/polkadot-sdk#241

@mustermeiszer
Copy link
Contributor Author

What don't you understand? ^^

paritytech/polkadot-sdk#241 looks really promising, but it requires either fetching the whole state of a chain -- at least one key for each entry -- or having a snapshot available.
This approach would rather be a static, compiler-based assurance, that the codec did not change. TBH, the approach you are taking ensures the same, but how feasible is it to iterate over the complete storage of a chain in the long run?

@bkchr
Copy link
Member

bkchr commented Jan 1, 2023

TBH, the approach you are taking ensures the same, but how feasible is it to iterate over the complete storage of a chain in the long run?

Why shouldn't this be feasible? I mean you would do this as part of your testing setup, nothing that runs on chain. I don't see any reason this shouldn't work?

What don't you understand? ^^

Where this lock value is coming from and the other zeroed_and_uninit() is coming from. That also sounds like a lot of mostly useless values "floating" around. From where will pallet users get these values? How do they update these values?

@ggwpez
Copy link
Member

ggwpez commented Jan 2, 2023

I dont think you can statically assert that the encoding of a type did not change. The compiler would have to prove to you that the code does the same as before - which is impossible. What you propose IIUC is some fuzzy testing.

Compilation ensures that codec of storage relevant types has not changed

This would only check that the one instance of the type does not change its encoding, or?
All other instances are still free to change.

Is there a "static" representation for every codec object that can be storage key or value?

You would need to generate all instances of that type, so to check that the encoding of an u32 did not change you would need to run your test 2^32 times. I dont think this is feasible.

paritytech/polkadot-sdk#241 looks really promising, but it requires either fetching the whole state of a chain -- at least one key for each entry -- or having a snapshot available.

I dont think that is a problem. Fetching the whole state currently takes about 14 minutes, which is done to check runtime-migrations in the CI eg here.
We can easily add the mentioned DecodeAll check into the list of migrations there.
That would not be a static or compiler-based approach, but should still give a good idea if something broke. Work is ongoing here #13013

If you really want to do that, you could think about some TestCodec trait, which is then used in the runtime integrity tests and contains hard-coded values. Then you can optionally implement that for key/value types and rely on the runtime to run the tests.

@mustermeiszer
Copy link
Contributor Author

mustermeiszer commented Jan 5, 2023

You would need to generate all instances of that type, so to check that the encoding of an u32 did not change you would need to run your test 2^32 times. I dont think this is feasible.

This argument holds also for the DecodeAll check, just that you are checking a bigger subset of the overall set.

But 14 minutes are actually great.

If you really want to do that, you could think about some TestCodec trait, which is then used in the runtime integrity tests and contains hard-coded values. Then you can optionally implement that for key/value types and rely on the runtime to run the tests.

This is the idea here. I would argue that for every primitive type in rust, that is also serializable for substrate-storage, there exist a valid zero-value-encoding. The only exception would be enums, where the zero-encoding would need to encode each variant after the other as a valid encoding to check against.

But as you said, this would only indicate that a migration is needed, while DecodeAll verifies that migrations are done correctly. So closing this.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants