Runtime Storage #38

danforbes · 2020-04-09T20:35:52Z

Closes #5

danforbes · 2020-04-15T09:42:38Z

@thiolliere can you please review this?

gui1117

I feel like explaining what each methods do in the trait is a bit redundant with the trait documentation.
I'm not sure of the scope of this doc but I feel like part of it is redundant with decl_storage and traits doc.

Also in my opinion storage doc should be more about explaining the trie and then just saying:

value is an abstraction which stores one value at a certain path in the trie
map stores value at some_path++hasher(key)
double map stores avalue at some_path++ hasher1(key)++hasher2(key)

I think then it is more understandable what are the cost of methods.

But I'm not very aware about how is organized developer-hub so this is just humble suggestions.

by the way I think Shawn deep dive was very good explanation of trie https://www.shawntabrizi.com/substrate/substrate-storage-deep-dive/

current/runtime/storage.md

gui1117 · 2020-04-15T10:03:03Z

current/runtime/storage.md

+
+### Storage Maps
+
+Storage Maps are implemented as hash maps, which is a pattern that should be familiar to most developers. In order to


Storage Maps are implemented as hash maps

maybe it is only me but for me hash map is a data structure which handle collision. at least if presented this way it should be clear that collision are not handled.

But actually instead of using "hash map" term I would rather say that it is a map where key are hashed.

Thank you for this clarification. Please let me know if you think my solution was sufficient.

hmm I see, I'm not sure actually what I proposed was better, like

Storage Maps are implemented as maps with hashed keys, which is a pattern that should be familiar to most developers.

wouldn't have make sense to me I think if I didn't know it already, not sure how to do better. (still for me the clearest is that a storage map insert at the path hash(encode(key)) the value encoded.)

Tried to clarify this in the docs 👍

gui1117 · 2020-04-15T10:10:20Z

current/runtime/storage.md

+Storage Maps are implemented as hash maps, which is a pattern that should be familiar to most developers. In order to
+give blockchain engineers increased control over the way in which these data structures are stored, Substrate allows
+developers to select the hashing algorithm that is used to generate map keys. Map data structures are ideal for managing
+sets of items whose elements will be accessed randomly, as opposed to iterating over them sequentially in their entirety.


I think it would be better to show how it is implemented in the underlying trie to make it more understandable about advantage/disadvantage.

Basically the value is put in the trie at the path==some_storage_prefix++hashed_key.
So each value is stored in an individual node, so reading/writing a value only decode/encode one raw_value which is stored in one trie-node.

also the performance of the map will depends on how big the map is because read/writing one value will recompute the path==some_prefix++hashed_key. and the hashed_key part can be more or less costy to compute. https://www.shawntabrizi.com/substrate/transparent-keys-in-substrate/

There is a separate document that goes over how storage is implemented. This document is meant to explain Substrate's storage interfaces and help blockchain runtime developers understand when and how to use them.

this document will not explain how frame_support::storage abstraction are implemented in storage do they ? so I would explain their implementation here actually.

I believe the purpose of the advance document is to cover implementation and the purpose of this document (the runtime document) is to cover the interface. @joepetrowski can you provide your input?

gui1117 · 2020-04-15T10:13:08Z

current/runtime/storage.md

+
+[Storage Maps expose an API](https://substrate.dev/rustdocs/master/frame_support/storage/trait.StorageMap.html#required-methods)
+that is similar to that of Storage Values. The selected methods referenced below all use a single key; use two keys for
+[Storage Double Maps](https://substrate.dev/rustdocs/master/frame_support/storage/trait.StorageDoubleMap.html#required-methods):


Here also I think we should say that using this two keys result in this in the trie: key1, key2, value is stored at
value is stored at some_prefix++hashe1(key1)++hasher2(key2).
and so you can iterate on one key1.

This is information I also think would belong in the advanced document.

gui1117 · 2020-04-15T10:16:37Z

current/runtime/storage.md

+The Substrate storage API provides iterable map implementations. Because maps are often used to track unbounded sets of
+data (account balances, for example) it is especially likely to exceed block production time by iterating over maps in
+their entirety within the runtime. Furthermore, because maps are comprised of more layers of indirection than native
+lists, they are significantly more costly than lists to iterate over with respect to time. Depending on


I think we can just explain what is the "more layers of indirection than native lists", basically iterating over a map is just iterating over all keys after the prefix storage_prefix so this trie iteration is more costy than native list because it is a trie.

I felt like that was a step too far for this document and that topics like that belong in the advanced storage documentation.

The above document just explain how the trie works not how frame_support storage abstractions are implemented using the trie, no ?
As a user in order to understand more precisely the cost of those abstraction, explaining their implementation on the trie seems useful (maybe it belong somewhere else in the doc but still I feel like saying that a storage map just store all its value after a prefix and iterating upon means iterating the trie after a prefix is quite straightforward and quickly understandable)

Because maps are often used to track unbounded sets of
data it is especially likely to exceed block production time by iterating over maps in
their entirety within the runtime

This is just a trade off, some storage map in the runtime are quite bounded actually. But we use storage map because they are more often accessed by individual value than as a whole. Thus using a storage map is more efficient because accessing one value is cheaper, but it is sometime iterated upon still because it is not too big.

Tried to clarify this in the docs 👍

current/runtime/storage.md

gui1117 · 2020-04-15T10:19:15Z

current/runtime/storage.md

+#### Methods
+
+[Iterable Storage Maps expose the following methods](https://substrate.dev/rustdocs/master/frame_support/storage/trait.IterableStorageMap.html#required-methods)
+in addition to the other map methods:


their method are slightly different for map and doublemap

Tried to clarify this in the docs 👍

gui1117 · 2020-04-15T10:20:26Z

current/runtime/storage.md

+##### `translate(fn)`
+
+Use the provided function to translate all elements of the map, in no particular order. To remove an element from the
+map, return `None` from the translation function.


they also differ between map and double_map (as a note I tried to unify interface here paritytech/substrate#5335)

Tried to clarify this in the docs 👍

danforbes · 2020-04-19T15:29:58Z

Thank you @thiolliere! I have tried to address your comments while balancing the goal of this document with respect to the goal of the advanced storage document.

joepetrowski

Looks good. I think a practical section on best practices would be useful.

current/runtime/storage.md

shawntabrizi · 2020-04-22T16:14:08Z

Is the goal of this PR to actually have all this content on a single page? Or are you just collecting information here?

Jumping between conceptual items around storage, to macro syntax about declaration, to RPC call stuff seems like a lot for one page, and what I would consider to be quite different topics when it comes to storage.

I might tune your focus into two sections with somewhat different goals:

Conceptually what is Runtime Storage in Substrate?
- Goal: Someone will understand from a theoretical perspective how substrate storage works, and the design decisions.
Using storage in Substrate runtime code
- Goal: How do work with the macro, the generated types, etc... what are all the nuiances of the syntax, things you can do, cant do etc...

Someone who does not really want to make the best decisions but just wants code to work should read the second doc. Someone who just wants to learn conceptually whats going on should read the first. A "real dev" would read both and be able to make the right decisions using the combined information.

If you break down Substrate Storage Conceptually, then a topic like the RPC queries become a lot more natural "We are just querying a database key, no other abstractions of storage leak through the RPC".

It also informs the user the advantages of a single value storage item versus a storage map, something that is probably the most important topic to take away for the average dev. They should be thinking at this level.

Then when it comes to the ergonomics of the language, some users just want to know what to type to make the thing compile, and that would go into all the syntax level details I see here.

As it is written, I think it is hard to ingest and get a clear message imo.

danforbes · 2020-04-28T14:15:21Z

@thiolliere and @shawntabrizi - I have made some structural changes that I hope will address your comments. @joepetrowski - per our conversation yesterday, I think this is now ready for a final review.

current/runtime/storage.md

current/advanced/storage.md

Closes #5

danforbes · 2020-04-30T14:29:07Z

@joepetrowski - any suggestions for enhancing the section on genesis configuration?

current/runtime/storage.md

joepetrowski · 2020-04-30T14:55:24Z

current/runtime/storage.md

+### Genesis Config

-TODO
+You can define
+[an optional `GenesisConfig`](https://substrate.dev/rustdocs/master/frame_support/macro.decl_storage.html#genesisconfig)
+struct in order to initialize Storage Items in the genesis block of your blockchain.

-## Storage Cache
+// TODO


@thiolliere what info do you think should go here?

GenesisConfig is generated by decl_storage macro, it does initialize storage items for the genesis block of the blockchain.

Its logic is defined by:

config and build attribute when declaring an individual storage.

add_extra_genesis informations. which allows to add fields in the GenesisConfig struct and a build function to build storages using those fields.

(I can be more precise actually we execute first for each storage the build closure if it exist, or if no build attribute but a config attribute then we add into storage the value found in config. Then we execute the add_extra_genesis::build function.)

kianenigma · 2020-05-02T11:12:49Z

current/runtime/storage.md

+Depending on [the hashing algorithm](#Transparent-Hashing-Algorithms) that you select to generate a
+map's keys, you may be able to iterate across its keys and values. Because maps are often used to
+track unbounded sets of data (account balances, for example) it is especially likely to exceed block
+production time by iterating over maps in their entirety within the runtime. Furthermore, because


Furthermore, because
accessing the elements of a map requires more pointer dereferencing than accessing the elements of a
native list, maps are significantly more costly than lists to iterate over with respect to time.

Can you explain this a bit more? I am not sure where it comes from and until now my understanding of the state was that the most important factors encoding + db access. In-memory dereferencing seems negligible to me.

(I agree that iterating maps is expensive, I am not sure if your reason is correct.)

So, this line used to read as follows:

Furthermore, because maps are comprised of more layers of indirection than native lists...

I agree with you that the change didn't necessarily make things more clear. I guess what I meant above was that you don't need to dereference memory pointers, per se, but you do have to dereference the pointer to the element in the map...I think I'm trying to describe the "db access" part that you mention above. I will think about alternate ways to phrase this but would definitely appreciate any suggestions 👍

Okay, this now reads:

...accessing the elements of a map requires more database reads than accessing the elements of a native list...

I hope you agree the simplest approach was the best one in this case 😅

kianenigma · 2020-05-02T11:14:26Z

current/runtime/storage.md

+wrong. Being efficient within the runtime of a blockchain is an important first principle of
+Substrate and this information is designed to help you understand _all_ of Substrate's storage
+capabilities and use them in a way that respects the important first principles around which they
+were designed.


would be nice to mention here when and which maps are iterable at the end of the day.

This is a great point and I'm starting to wonder if we should even document non-transparent hashers since they are now deprecated. Should we just behave as if all hashers are transparent and all maps are iterable?

Asked @shawntabrizi about this offline and he agrees that we should take the simpler approach: only document that which is not deprecated 👍

Tried to clarify this in the docs 👍

current/runtime/storage.md

kianenigma

I only had a skim and the text reads ver well to me. As Shawn mentioned I think this is a lot of info and should be organised well so the user can follow it easily. Great work 👍

gui1117 · 2020-05-04T08:48:27Z

Sorry I didn't realized earlier but it seems non-transparent hasher are deprecated so:

/// Supported hashers (ordered from least to best security):
///
/// * identity - Just the unrefined key material. Use only when it is known to be a secure hash
/// already. The most efficient and iterable over keys.
/// * twox_64_concat - TwoX with 64bit + key concatenated. Use only when an untrusted source
/// cannot select and insert key values. Very efficient and iterable over keys.
/// * blake2_128_concat - Blake2 with 128bit + key concatenated. Slower but safe to use in all
/// circumstances. Iterable over keys.
///
/// Deprecated hashers, which do not support iteration over keys include:
/// * twox_128 - TwoX with 128bit.
/// * twox_256 - TwoX with with 256bit.
/// * blake2_128 - Blake2 with 128bit.
/// * blake2_256 - Blake2 with 256bit.
///

Thus maybe better to write that storage map are iterable and double map are also iterable, and maybe add a note that this is not the case if you use deprecated hasher opaque_twox_128, opaque_... ...

danforbes · 2020-05-05T17:19:33Z

@shawntabrizi, @thiolliere, @kianenigma - I have no more changes to make at this time based on your comments. Let me know if I addressed your concerns properly or if there are additional concerns you have.

gui1117

looks good to me

gui1117 · 2020-05-07T12:22:14Z

current/runtime/storage.md

+| [TwoX 64 Concat](https://crates.parity.io/frame_support/struct.Twox64Concat.html)          |               | X           |
+| [Identity](https://crates.parity.io/frame_support/struct.Identity.html)                    |               |             |
+| [Blake2 128](https://crates.parity.io/frame_support/struct.Blake2_128.html) **DEPRECATED** | X             |             |
+| [TwoX 128](https://crates.parity.io/frame_support/struct.Twox128.html) **DEPRECATED**      |               |             |


there is actually 4 depracated hasher but I don't mind if we don't mention them:

twox_128 - TwoX with 128bit.
twox_256 - TwoX with with 256bit.
blake2_128 - Blake2 with 128bit.
blake2_256 - Blake2 with 256bit.

current/runtime/storage.md

gui1117 · 2020-05-07T12:30:07Z

current/runtime/storage.md

+[the `sc_chain_spec::ChainSpec` trait](https://crates.parity.io/sc_chain_spec/trait.ChainSpec.html).
+For a complete and concrete example of using Substrate's genesis storage configuration capabilities,
+refer to the `decl_storage` macro in
+[the Society pallet](https://github.com/paritytech/substrate/blob/master/frame/society/src/lib.rs)


should it be better to point to example pallet ?

I don't believe that the example pallet does any genesis configuration.

gui1117 · 2020-05-07T12:35:30Z

current/runtime/storage.md

+
+  // all storage writes go here; no throwing code below this line
+
+  // all event emissions go here


events can be interleaved with storage writes

gui1117 · 2020-05-07T12:42:10Z

current/runtime/storage.md

+initializing storage, all of which have entry points in the `decl_storage` macro. These mechanisms
+all result in the creation of a `GenesisConfig` data type that implements
+[the `sp_runtime::BuildModuleGenesisStorage` trait](https://crates.parity.io/sp_runtime/trait.BuildModuleGenesisStorage.html)
+and will be added to the storage item's module (e.g.


I'm not sure I read this correctly, what does the runtime GenesisConfig being added to the storage item's module means ?

Maybe we could make it more clear that there is a GenesisConfig overarching type for the runtime which is made by construct_runtime by joining the specified pallet genesis config (specified with Config)

Co-authored-by: thiolliere <[email protected]>

joepetrowski

🚀

danforbes force-pushed the runtime-storage branch 3 times, most recently from 5860324 to be692c3 Compare April 14, 2020 08:34

joepetrowski added the A3 - In Progress Not ready for review yet. label Apr 14, 2020

danforbes force-pushed the runtime-storage branch 2 times, most recently from 1aa1627 to e2fba2d Compare April 15, 2020 06:49

gui1117 reviewed Apr 15, 2020

View reviewed changes

danforbes force-pushed the runtime-storage branch from 0d455a6 to cfe58eb Compare April 19, 2020 15:21

danforbes mentioned this pull request Apr 20, 2020

Document SCALE Optimizations for Integer Types and Vectors Thereof #43

Open

danforbes force-pushed the runtime-storage branch from 9808d9f to 5301a8a Compare April 21, 2020 10:52

joepetrowski reviewed Apr 22, 2020

View reviewed changes

danforbes requested a review from shawntabrizi April 22, 2020 15:50

danforbes force-pushed the runtime-storage branch 3 times, most recently from df43692 to f818588 Compare April 28, 2020 13:26

danforbes force-pushed the runtime-storage branch from c03bf3c to 1fb7660 Compare April 28, 2020 20:53

danforbes changed the title ~~[WIP] Runtime storage~~ Runtime Storage Apr 29, 2020

danforbes added A0 - Please Review The PR is ready for review. K1 - Runtime Info about the runtime. K6 - Advanced/Other Doesn't fit other K-labels. T3 - Enhancement A current page needs more info. and removed A3 - In Progress Not ready for review yet. labels Apr 29, 2020

joepetrowski reviewed Apr 30, 2020

View reviewed changes

danforbes added 3 commits April 30, 2020 02:52

[WIP] Runtime storage

d014e25

Closes #5

Add section on calculating storage keys and using RPC to query storage

ab8aed1

Use 120-character line length

556b6db

Add section on best practices and address suggestions from @joepetrowski

a54712a

danforbes force-pushed the runtime-storage branch from b940ea5 to a54712a Compare April 30, 2020 14:26

joepetrowski reviewed Apr 30, 2020

View reviewed changes

Small clarifications

c95681c

kianenigma reviewed May 2, 2020

View reviewed changes

current/runtime/storage.md Outdated Show resolved Hide resolved

kianenigma reviewed May 2, 2020

View reviewed changes

Storage item is not a proper noun

89a31fa

danforbes force-pushed the runtime-storage branch from 15efcb8 to 5ae9197 Compare May 4, 2020 23:34

Use crates.parity.io for Rustdoc links

3b46952

danforbes force-pushed the runtime-storage branch 2 times, most recently from 2718bd1 to 5411956 Compare May 5, 2020 01:06

Genesis configuration

c14a42a

danforbes force-pushed the runtime-storage branch from 5411956 to c14a42a Compare May 5, 2020 01:52

danforbes added 2 commits May 5, 2020 09:08

Small enhancements per @kianenigma

05bc513

Non-transparent hashers are deprecated

ddef06d

gui1117 reviewed May 7, 2020

View reviewed changes

Dan Forbes and others added 3 commits May 7, 2020 19:07

Clarify generic storage types

4b4b065

Co-authored-by: thiolliere <[email protected]>

Clarifications per review by @thiolliere

ebdea8e

final nits

bb20b77

joepetrowski approved these changes May 8, 2020

View reviewed changes

joepetrowski added A9 - Buy That Person a Beer! Author deserves a beer and removed A0 - Please Review The PR is ready for review. labels May 8, 2020

joepetrowski merged commit a214b8b into master May 8, 2020

joepetrowski deleted the runtime-storage branch May 8, 2020 09:37


		### Storage Maps

		Storage Maps are implemented as hash maps, which is a pattern that should be familiar to most developers. In order to


		// all storage writes go here; no throwing code below this line

		// all event emissions go here

Runtime Storage #38

Runtime Storage #38

Conversation

danforbes commented Apr 9, 2020

danforbes commented Apr 15, 2020

gui1117 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

danforbes commented Apr 19, 2020

joepetrowski left a comment

Choose a reason for hiding this comment

shawntabrizi commented Apr 22, 2020

danforbes commented Apr 28, 2020

danforbes commented Apr 30, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kianenigma left a comment

Choose a reason for hiding this comment

gui1117 commented May 4, 2020 • edited Loading

danforbes commented May 5, 2020

gui1117 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

danforbes May 8, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

joepetrowski left a comment

Choose a reason for hiding this comment

gui1117 commented May 4, 2020 •

edited

Loading

danforbes May 8, 2020 •

edited

Loading