From a214b8babec563e2746bc3dc0b2435cffdccc3cd Mon Sep 17 00:00:00 2001 From: Dan Forbes Date: Fri, 8 May 2020 02:37:45 -0700 Subject: [PATCH] Runtime Storage (#38) * [WIP] Runtime storage Closes #5 * Add section on calculating storage keys and using RPC to query storage * Use 120-character line length * Document methods for Storage Items * Small changes to runtime storage document * Address @thiolliere PR comments * Deemphasize redudancy with Rust docs by collapsing sections about Storage Item APIs * Distinguish APIs for StorageMap and StorageDoubleMap * Clarify StorageMap implementation * Address @thiolliere PR comments * Clarify Storage Map implementation * Clarify iterable storage map principals * Formatting * Clarify the `get` extension to `decl_storage` * Clarifications and improvements from @joepetrowski * Remove iterable storage maps from top-level listing of storage items Per suggestion of @thiolliere and @shawntabrizi * Rework Iterable Storage Map documentation * Structural changes per @shawntabrizi * Structural changes per @shawntabrizi (advanced storage) * Small changes for readability and elaboration * Prettier * Add section on best practices and address suggestions from @joepetrowski * Small clarifications * Storage item is not a proper noun * Use crates.parity.io for Rustdoc links * Genesis configuration * Small enhancements per @kianenigma * Non-transparent hashers are deprecated * Clarify generic storage types Co-authored-by: thiolliere * Clarifications per review by @thiolliere * final nits Co-authored-by: thiolliere Co-authored-by: joepetrowski --- current/advanced/storage.md | 143 +++++++++- current/runtime/storage.md | 506 ++++++++++++++++++++++++++++++++---- 2 files changed, 591 insertions(+), 58 deletions(-) diff --git a/current/advanced/storage.md b/current/advanced/storage.md index 7ddfe4f..9479245 100644 --- a/current/advanced/storage.md +++ b/current/advanced/storage.md @@ -5,13 +5,16 @@ title: Storage --- Substrate uses a simple key-value 
data store implemented as a database-backed, modified Merkle tree. +All of Substrate's [higher-level storage abstractions](../runtime/storage) are built on top of this +simple key-value store. ## Key-Value Database Substrate implements its storage database with [RocksDB](https://rocksdb.org/), a persistent -key-value store for fast storage environments. +key-value store for fast storage environments. It also supports an experimental +[Parity DB](https://github.com/paritytech/parity-db). -This is used for all the components of Substrate that require persistent storage, such as: +The DB is used for all the components of Substrate that require persistent storage, such as: - Substrate clients - Substrate light-clients @@ -34,14 +37,13 @@ simply comparing their trie roots. Accessing trie data is costly. Each read operation takes O(log N) time, where N is the number of elements stored in the trie. To mitigate this, we use a key-value cache. -All trie nodes are stored in RocksDB and part of the trie state can get pruned, i.e. a key-value -pair can be deleted from the storage when it is out of pruning range for non-archive nodes. We do -not use [reference counting](http://en.wikipedia.org/wiki/Reference_counting) for performance -reasons. +All trie nodes are stored in the DB and part of the trie state can get pruned, i.e. a key-value pair +can be deleted from storage when it is out of pruning range for non-archive nodes. We do not use +[reference counting](http://en.wikipedia.org/wiki/Reference_counting) for performance reasons. ### State Trie -Substrate based chains have a single main trie, called the state trie, whose root hash is placed in +Substrate-based chains have a single main trie, called the state trie, whose root hash is placed in each block header. This is used to easily verify the state of the blockchain and provide a basis for light clients to verify proofs. @@ -64,11 +66,132 @@ can use to verify the specific content in that trie. 
Subsections of a trie do not have a root-hash-like representation that satisfies these needs automatically; thus a child trie is used instead. +## Querying Storage + +Blockchains that are built with Substrate expose a remote procedure call (RPC) server that can be +used to query runtime storage. When you use the Substrate RPC to access a storage item, you only +need to provide [the key](#Key-Value-Database) associated with that item. +[Substrate's runtime storage APIs](../runtime/storage) expose a number of storage item types; keep +reading to learn how to calculate storage keys for the different types of storage items. + +### Storage Value Keys + +To calculate the key for a simple [Storage Value](../runtime/storage#Storage-Value), take the +[TwoX 128 hash](https://github.com/Cyan4973/xxHash) of the name of the module that contains the +Storage Value and append to it the TwoX 128 hash of the name of the Storage Value itself. For +example, the [Sudo](https://substrate.dev/rustdocs/master/pallet_sudo/index.html) pallet exposes a +Storage Value item named +[`Key`](https://substrate.dev/rustdocs/master/pallet_sudo/struct.Module.html#method.key): + +``` +twox_128("Sudo") = "0x5c0d1176a568c1f92944340dbfed9e9c" +twox_128("Key") = "0x530ebca703c85910e7164cb7d1c9e47b" +twox_128("Sudo") + twox_128("Key") = "0x5c0d1176a568c1f92944340dbfed9e9c530ebca703c85910e7164cb7d1c9e47b" +``` + +If the familiar `Alice` account is the sudo user, an RPC request and response to read the Sudo +module's `Key` Storage Value could be represented as: + +``` +state_getStorage("0x5c0d1176a568c1f92944340dbfed9e9c530ebca703c85910e7164cb7d1c9e47b") = "0xd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d" +``` + +In this case, the value that is returned +(`"0xd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d"`) is Alice's +[SCALE](./codec)-encoded account ID (`5GrwvaEF5zXb26Fz9rcQpDWS57CtERHpNehXCPcNoHGKutQY`). 
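The key derivation described above can be sketched in plain Rust. This is a minimal illustration, not Substrate's own code: it assumes that TwoX 128 is the concatenation of two little-endian XXH64 digests computed with seeds 0 and 1, and the helper names (`xxh64`, `twox_128_hex`, `storage_value_key`) are hypothetical. In a real runtime you would rely on the hashers that Substrate already provides.

```rust
// Sketch only: a from-scratch XXH64, used to illustrate how a Storage Value key is assembled.
const P1: u64 = 0x9E37_79B1_85EB_CA87;
const P2: u64 = 0xC2B2_AE3D_27D4_EB4F;
const P3: u64 = 0x1656_67B1_9E37_79F9;
const P4: u64 = 0x85EB_CA77_C2B2_AE63;
const P5: u64 = 0x27D4_EB2F_1656_67C5;

// Read the first 8 bytes of `b` as a little-endian u64.
fn read_u64(b: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&b[..8]);
    u64::from_le_bytes(buf)
}

// Read the first 4 bytes of `b` as a little-endian u32, widened to u64.
fn read_u32(b: &[u8]) -> u64 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(&b[..4]);
    u32::from_le_bytes(buf) as u64
}

fn round(acc: u64, input: u64) -> u64 {
    acc.wrapping_add(input.wrapping_mul(P2)).rotate_left(31).wrapping_mul(P1)
}

fn merge_round(acc: u64, val: u64) -> u64 {
    (acc ^ round(0, val)).wrapping_mul(P1).wrapping_add(P4)
}

/// XXH64 of `data` with the given seed.
fn xxh64(data: &[u8], seed: u64) -> u64 {
    let mut rest = data;
    let mut h = if data.len() >= 32 {
        // Long inputs are consumed in 32-byte stripes across four accumulators.
        let mut v1 = seed.wrapping_add(P1).wrapping_add(P2);
        let mut v2 = seed.wrapping_add(P2);
        let mut v3 = seed;
        let mut v4 = seed.wrapping_sub(P1);
        while rest.len() >= 32 {
            v1 = round(v1, read_u64(&rest[0..]));
            v2 = round(v2, read_u64(&rest[8..]));
            v3 = round(v3, read_u64(&rest[16..]));
            v4 = round(v4, read_u64(&rest[24..]));
            rest = &rest[32..];
        }
        let mut acc = v1.rotate_left(1)
            .wrapping_add(v2.rotate_left(7))
            .wrapping_add(v3.rotate_left(12))
            .wrapping_add(v4.rotate_left(18));
        acc = merge_round(acc, v1);
        acc = merge_round(acc, v2);
        acc = merge_round(acc, v3);
        merge_round(acc, v4)
    } else {
        seed.wrapping_add(P5)
    };
    h = h.wrapping_add(data.len() as u64);
    // Consume the tail: 8-byte words, then one 4-byte word, then single bytes.
    while rest.len() >= 8 {
        h ^= round(0, read_u64(rest));
        h = h.rotate_left(27).wrapping_mul(P1).wrapping_add(P4);
        rest = &rest[8..];
    }
    if rest.len() >= 4 {
        h ^= read_u32(rest).wrapping_mul(P1);
        h = h.rotate_left(23).wrapping_mul(P2).wrapping_add(P3);
        rest = &rest[4..];
    }
    for &byte in rest {
        h ^= (byte as u64).wrapping_mul(P5);
        h = h.rotate_left(11).wrapping_mul(P1);
    }
    // Final avalanche.
    h ^= h >> 33;
    h = h.wrapping_mul(P2);
    h ^= h >> 29;
    h = h.wrapping_mul(P3);
    h ^= h >> 32;
    h
}

/// Assumed TwoX 128 layout: XXH64(seed 0) followed by XXH64(seed 1), both little-endian.
fn twox_128_hex(name: &str) -> String {
    let mut bytes = [0u8; 16];
    bytes[..8].copy_from_slice(&xxh64(name.as_bytes(), 0).to_le_bytes());
    bytes[8..].copy_from_slice(&xxh64(name.as_bytes(), 1).to_le_bytes());
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Key of a Storage Value: twox_128(module name) followed by twox_128(item name).
fn storage_value_key(module: &str, item: &str) -> String {
    format!("0x{}{}", twox_128_hex(module), twox_128_hex(item))
}
```

Under these assumptions, calling `storage_value_key("Sudo", "Key")` should reproduce the concatenated key from the example above, which can then be passed to `state_getStorage`.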
+ +You may have noticed that the +[non-cryptographic](../runtime/storage#Cryptographic-Hashing-Algorithms) TwoX 128 hash algorithm is +used to generate Storage Value keys. This is because it is not necessary to pay the performance +costs associated with a cryptographic hash function since the input to the hash function (the names +of the module and storage item) is determined by the runtime developer and not by potentially +malicious users of your blockchain. + +### Storage Map Keys + +Like Storage Values, the keys for [Storage Maps](../runtime/storage#Storage-Maps) are equal to the +TwoX 128 hash of the name of the module that contains the map prepended to the TwoX 128 hash of the +name of the Storage Map itself. To retrieve an element from a map, simply append the hash of the +desired map key to the storage key of the Storage Map. For maps with two keys (Storage Double Maps), +append the hash of the first map key followed by the hash of the second map key to the Storage +Double Map's storage key. As with Storage Values, Substrate uses the TwoX 128 hashing algorithm for +the module and Storage Map names, but you will need to make sure to use the correct +[hashing algorithm](../runtime/storage#Hashing-Algorithms) (the one that was declared in +[the `decl_storage` macro](../runtime/storage#Declaring-Storage-Items)) when determining the hashed +keys for the elements in a map. + +Here is an example that illustrates querying a Storage Map named `FreeBalance` from a module named +"Balances" for the balance of the familiar `Alice` account. 
In this example, the `FreeBalance` map +is using +[the transparent Blake2 128 Concat hashing algorithm](../runtime/storage#Transparent-Hashing-Algorithms): + +``` +twox_128("Balances") = "0xc2261276cc9d1f8598ea4b6a74b15c2f" +twox_128("FreeBalance") = "0x6482b9ade7bc6657aaca787ba1add3b4" +scale_encode("5GrwvaEF5zXb26Fz9rcQpDWS57CtERHpNehXCPcNoHGKutQY") = "0xd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d" + +blake2_128_concat("0xd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d") = "0xde1e86a9a8c739864cf3cc5ec2bea59fd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d" + +state_getStorage("0xc2261276cc9d1f8598ea4b6a74b15c2f6482b9ade7bc6657aaca787ba1add3b4de1e86a9a8c739864cf3cc5ec2bea59fd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d") = "0x0000a0dec5adc9353600000000000000" +``` + +The value that is returned from the storage query (`"0x0000a0dec5adc9353600000000000000"` in the +example above) is the [SCALE](./codec)-encoded value of Alice's account balance +(`"1000000000000000000000"` in this example). Notice that before hashing Alice's account ID it has +to be SCALE-encoded. Also notice that the output of the `blake2_128_concat` function consists of 32 +hexadecimal characters followed by the function's input. This is because Blake2 128 Concat is +[a transparent hashing algorithm](../runtime/storage#Transparent-Hashing-Algorithms). Although the +above example may make this characteristic seem superfluous, its utility becomes more apparent when +the goal is to iterate over the keys in a map (as opposed to retrieving the value associated with a +single key). The ability to iterate over the keys in a map is a common requirement in order to allow +_people_ to use the map in a way that seems natural (such as in UIs): first, a user is presented with a +list of elements in the map, then, that user can select the element that they are interested in and +query the map for more details about that particular element. 
Here is another example that uses the +same Storage Map (a map named `FreeBalance` that uses the Blake2 128 Concat hashing algorithm +in a module named "Balances") to demonstrate using the Substrate RPC to query a Storage Map +for its list of keys via the `state_getKeys` RPC endpoint: + +``` +twox_128("Balances") = "0xc2261276cc9d1f8598ea4b6a74b15c2f" +twox_128("FreeBalance") = "0x6482b9ade7bc6657aaca787ba1add3b4" + +state_getKeys("0xc2261276cc9d1f8598ea4b6a74b15c2f6482b9ade7bc6657aaca787ba1add3b4") = [ + "0xc2261276cc9d1f8598ea4b6a74b15c2f6482b9ade7bc6657aaca787ba1add3b4de1e86a9a8c739864cf3cc5ec2bea59fd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d", + "0xc2261276cc9d1f8598ea4b6a74b15c2f6482b9ade7bc6657aaca787ba1add3b432a5935f6edc617ae178fef9eb1e211fbe5ddb1579b72e84524fc29e78609e3caf42e85aa118ebfe0b0ad404b5bdd25f", + ... +] +``` + +Each element in the list that is returned by the Substrate RPC's `state_getKeys` endpoint can be +directly used as input for the RPC's `state_getStorage` endpoint. In fact, the first element in the +example list above is equal to the input used for the `state_getStorage` query in the previous +example (the one used to find the balance for `Alice`). Because the map that these keys belong to +uses a transparent hashing algorithm to generate its keys, it is possible to determine the account +associated with the second element in the list. Notice that each element in the list is a +hexadecimal value that begins with the same 64 characters; this is because each list element +represents a key in the same map, and that map is identified by concatenating two TwoX 128 hashes, +each of which is 128 bits, or 32 hexadecimal characters. After discarding this portion of the second +element in the list, you are left with +`0x32a5935f6edc617ae178fef9eb1e211fbe5ddb1579b72e84524fc29e78609e3caf42e85aa118ebfe0b0ad404b5bdd25f`. 
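The splitting described here can be sketched with plain string slicing. This is only an illustration under stated assumptions: the function name `split_map_key` is hypothetical, and the layout it assumes (a 64-character map prefix, then a 32-character Blake2 128 hash, then the SCALE-encoded map key) applies only to maps that use the Blake2 128 Concat hasher.

```rust
/// Split a full storage key of a Blake2 128 Concat map into
/// (map prefix, Blake2 128 hash, SCALE-encoded map key), all as hex without "0x".
/// Returns `None` if the input is not "0x"-prefixed ASCII hex of sufficient length.
fn split_map_key(full_key: &str) -> Option<(&str, &str, &str)> {
    let hex = full_key.strip_prefix("0x")?;
    // 64 hex chars for the two TwoX 128 hashes plus 32 for the Blake2 128 hash.
    if !hex.is_ascii() || hex.len() < 96 {
        return None;
    }
    let (prefix, rest) = hex.split_at(64);
    let (hash, raw_key) = rest.split_at(32);
    Some((prefix, hash, raw_key))
}
```

For the second key in the list above, the last two parts taken together are the 96-character value that remains once the shared map prefix is discarded.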
+ +You saw in the previous example that this value represents the Blake2 128 Concat hash of some +[SCALE](./codec)-encoded account ID. The Blake2 128 Concat hashing algorithm consists of appending +(concatenating) the hashing algorithm's input to its Blake2 128 hash. This means that the first 128 +bits (or 32 hexadecimal characters) of a Blake2 128 Concat hash represent a Blake2 128 hash, and +the remainder represents the value that was passed to the Blake2 128 hashing algorithm. In this +example, after you remove the first 32 hexadecimal characters that represent the Blake2 128 hash +(i.e. `0x32a5935f6edc617ae178fef9eb1e211f`) what is left is the hexadecimal value +`0xbe5ddb1579b72e84524fc29e78609e3caf42e85aa118ebfe0b0ad404b5bdd25f`, which is a +[SCALE](./codec)-encoded account ID. Decoding this value yields the result +`5GNJqTPyNqANBkUVMN1LPPrxXnFouWXoe2wNSmmEoLctxiZY`, which is the account ID for the familiar +`Alice_Stash` account. + ## Runtime Storage API -The Substrate's [Support module](https://substrate.dev/rustdocs/master/frame_support/index.html) -provides utilities to generate unique, deterministic keys for your runtime module storage items. -These storage items are placed in the state trie and are accessible by querying the trie by key. +Substrate's [FRAME Support crate](https://substrate.dev/rustdocs/master/frame_support/index.html) +provides utilities for generating unique, deterministic keys for your runtime's storage items. These +storage items are placed in the [state trie](#Trie-Abstraction) and are accessible by +[querying the trie by key](#Querying-Storage). ## Next Steps diff --git a/current/runtime/storage.md b/current/runtime/storage.md index 89e87c3..6f4e71f 100644 --- a/current/runtime/storage.md +++ b/current/runtime/storage.md @@ -4,90 +4,500 @@ lang: en title: Runtime Storage --- -Runtime storage allows you to store data in your blockchain which can be accessed from your runtime -logic and persists between blocks. 
+Runtime storage allows you to store data in your blockchain that is persisted between blocks and can +be accessed from within your runtime logic. Storage should be one of the most critical concerns of a +blockchain runtime developer. This statement is somewhat self-evident, since one of the primary +objectives of a blockchain is to provide decentralized consensus about the state of the underlying +storage. Furthermore, well designed storage systems reduce the load on nodes in the network, which +will lower the overhead for participants in your blockchain. Substrate exposes a set of layered, +modular storage APIs that allow runtime developers to make the storage decisions that suit them +best. However, the fundamental principle of blockchain runtime storage is to minimize its use. This +document is intended to provide information and best practices about Substrate's runtime storage +interfaces. Please refer to [the advanced storage documentation](../advanced/storage) for more +information about how these interfaces are implemented. ## Storage Items -Your runtime module has access to Substrate storage APIs which allows you to easily store common -storage items: +The `storage` module in [FRAME Support](https://crates.parity.io/frame_support/storage/index.html) +gives runtime developers access to Substrate's flexible storage APIs. Any value that can be encoded +by the [Parity SCALE codec](../advanced/codec) is supported by these storage APIs: -- [Storage Value](https://substrate.dev/rustdocs/master/frame_support/storage/trait.StorageValue.html) - - A single value. -- [Storage Map](https://substrate.dev/rustdocs/master/frame_support/storage/trait.StorageMap.html) - - A key-value hash map. -- [Storage Linked Map](https://substrate.dev/rustdocs/master/frame_support/storage/trait.StorageLinkedMap.html) - - Similar to a storage map, but allows enumeration of the stored elements. 
-- [Storage Double Map](https://substrate.dev/rustdocs/master/frame_support/storage/trait.StorageDoubleMap.html) - - An implementation of a map with two keys. +- [Storage Value](https://crates.parity.io/frame_support/storage/trait.StorageValue.html) - A single + value +- [Storage Map](https://crates.parity.io/frame_support/storage/trait.StorageMap.html) - A key-value + hash map +- [Storage Double Map](https://crates.parity.io/frame_support/storage/trait.StorageDoubleMap.html) - + An implementation of a map with two keys that provides the important ability to efficiently remove + all entries that have a common first key -Any value which can be encoded by the [Parity SCALE codec](../advanced/codec) is supported by these -storage APIs. +The type of storage item you select should depend on the logical way in which the value will be used +by your runtime. -### Storage Declaration +### Storage Value -You can use the `decl_storage!` macro to easily create new runtime storage items. Here is an example -of what it looks like to declare each type of storage item: +This type of storage item should be used for values that are viewed as a single unit by the runtime, +whether that is a single primitive value, a single `struct`, or a single collection of related +items. Although wrapping related items in a shared `struct` is an excellent way to reduce the number +of storage reads (an important consideration), at some point the size of the object will begin to +incur costs that may outweigh the optimization in storage reads. Storage values can be used to store +lists of items, but runtime developers should take care with respect to the size of these lists. +Large lists incur storage costs just like large `structs`. Furthermore, iterating over a large list +in your runtime may result in exceeding the block production time - if this occurs your blockchain +will stop producing blocks, which means that it will stop functioning. 
+ +#### Methods + +Refer to the Storage Value documentation for +[a comprehensive list of the methods that Storage Values expose](https://crates.parity.io/frame_support/storage/trait.StorageValue.html#required-methods). +Some of the most important methods are summarized here: + +- [`get()`](https://crates.parity.io/frame_support/storage/trait.StorageValue.html#tymethod.get) - + Load the value from storage. +- [`put(val)`](https://crates.parity.io/frame_support/storage/trait.StorageValue.html#tymethod.put) - + Store the provided value. +- [`mutate(fn)`](https://crates.parity.io/frame_support/storage/trait.StorageValue.html#tymethod.mutate) - + Mutate the value with the provided function. +- [`take()`](https://crates.parity.io/frame_support/storage/trait.StorageValue.html#tymethod.take) - + Load the value and remove it from storage. + +### Storage Maps + +Map data structures are ideal for managing sets of items whose elements will be accessed randomly, +as opposed to iterating over them sequentially in their entirety. Storage Maps in Substrate are +implemented as key-value hash maps, which is a pattern that should be familiar to most developers. +In order to give blockchain engineers increased control, Substrate allows developers to select +[the hashing algorithm](#Hashing-Algorithms) that is used to generate a map's keys. Refer to +[the advanced storage documentation](../advanced/storage) to learn more about how Substrate's +Storage Maps are implemented. + +#### Methods + +[Storage Maps expose an API](https://crates.parity.io/frame_support/storage/trait.StorageMap.html#required-methods) +that is similar to that of Storage Values. + +- `get` - Load the value associated with the provided key from storage. 
Docs: + [`StorageMap#get(key)`](https://crates.parity.io/frame_support/storage/trait.StorageMap.html#tymethod.get), + [`StorageDoubleMap#get(key1, key2)`](https://crates.parity.io/frame_support/storage/trait.StorageDoubleMap.html#tymethod.get) +- `insert` - Store the provided value by associating it with the given key. Docs: + [`StorageMap#insert(key, val)`](https://crates.parity.io/frame_support/storage/trait.StorageMap.html#tymethod.insert), + [`StorageDoubleMap#insert(key1, key2, val)`](https://crates.parity.io/frame_support/storage/trait.StorageDoubleMap.html#tymethod.insert) +- `mutate` - Use the provided function to mutate the value associated with the given key. Docs: + [`StorageMap#mutate(key, fn)`](https://crates.parity.io/frame_support/storage/trait.StorageMap.html#tymethod.mutate), + [`StorageDoubleMap#mutate(key1, key2, fn)`](https://crates.parity.io/frame_support/storage/trait.StorageDoubleMap.html#tymethod.mutate) +- `take` - Load the value associated with the given key and remove it from storage. Docs: + [`StorageMap#take(key)`](https://crates.parity.io/frame_support/storage/trait.StorageMap.html#tymethod.take), + [`StorageDoubleMap#take(key1, key2)`](https://crates.parity.io/frame_support/storage/trait.StorageDoubleMap.html#tymethod.take) + +#### Iterable Storage Maps + +Substrate Storage Maps are iterable with respect to their keys and values. Because maps are often +used to track unbounded sets of data (account balances, for example) it is especially likely to +exceed block production time by iterating over maps in their entirety within the runtime. +Furthermore, because accessing the elements of a map requires more database reads than accessing the +elements of a native list, maps are significantly _more_ costly than lists to iterate over with +respect to time. 
This is not to say that it is "wrong" to iterate over maps in your runtime; in +general Substrate focuses on "[first principles](#Best-Practices)" as opposed to hard and fast rules +of right and wrong. Being efficient within the runtime of a blockchain is an important first +principle of Substrate and this information is designed to help you understand _all_ of Substrate's +storage capabilities and use them in a way that respects the important first principles around which +they were designed. + +##### Iterable Storage Map Methods + +Substrate's Iterable Storage Map interfaces define the following methods. Note that for Iterable +Storage Double Maps, the `iter` and `drain` methods require a parameter, i.e. the first key: + +- `iter` - Enumerate all elements in the map in no particular order. If you alter the map while + doing this, you'll get undefined results. Docs: + [`IterableStorageMap#iter()`](https://crates.parity.io/frame_support/storage/trait.IterableStorageMap.html#tymethod.iter), + [`IterableStorageDoubleMap#iter(key1)`](https://crates.parity.io/frame_support/storage/trait.IterableStorageDoubleMap.html#tymethod.iter) +- `drain` - Remove all elements from the map and iterate through them in no particular order. If you + add elements to the map while doing this, you'll get undefined results. Docs: + [`IterableStorageMap#drain()`](https://crates.parity.io/frame_support/storage/trait.IterableStorageMap.html#tymethod.drain), + [`IterableStorageDoubleMap#drain(key1)`](https://crates.parity.io/frame_support/storage/trait.IterableStorageDoubleMap.html#tymethod.drain) +- `translate` - Use the provided function to translate all elements of the map, in no particular + order. To remove an element from the map, return `None` from the translation function. 
Docs: + [`IterableStorageMap#translate(fn)`](https://crates.parity.io/frame_support/storage/trait.IterableStorageMap.html#tymethod.translate), + [`IterableStorageDoubleMap#translate(fn)`](https://crates.parity.io/frame_support/storage/trait.IterableStorageDoubleMap.html#tymethod.translate) + +#### Hashing Algorithms + +As mentioned above, a novel feature of Substrate Storage Maps is that they allow developers to +specify the hashing algorithm that will be used to generate a map's keys. A Rust object that is used +to encapsulate hashing logic is referred to as a "hasher". Broadly speaking, the hashers that are +available to Substrate developers can be described in two ways: whether or not they are +cryptographic and whether or not they produce output that is transparent. For the sake of +completeness, the characteristics of non-transparent hashing algorithms are described below, but +keep in mind that any hasher that does not produce transparent output has been deprecated for use +within FRAME-based blockchains. + +##### Cryptographic Hashing Algorithms + +Cryptographic hashing algorithms are those that use cryptography to make it challenging to use the +input to the hashing algorithm to influence its output. For example, a cryptographic hashing +algorithm would produce a wide distribution of outputs even if the inputs were the numbers 1 +through 10. It is critical to use cryptographic hashing algorithms when users are able to influence +the keys of a Storage Map. Failure to do so creates an attack vector that makes it easy for +malicious actors to degrade the performance of your blockchain network. An example of a map that +should use a cryptographic hash algorithm to generate its keys is a map used to track account +balances. 
In this case, it is important to use a cryptographic hashing algorithm so that an attacker +cannot bombard your system with many small transfers to sequential account numbers; without a +cryptographic hash algorithm this would create an imbalanced storage structure that would suffer in +performance. Cryptographic hashing algorithms are more complex and resource-intensive than their +non-cryptographic counterparts, which is why Substrate allows developers to select when they are +used. + +##### Transparent Hashing Algorithms + +A transparent hashing algorithm is one that makes it easy to discover and verify the input that was +used to generate a given output. In Substrate, hashing algorithms are made transparent by +concatenating the algorithm's input to its output. This makes it trivial for users to retrieve a +key's original unhashed value and verify it if they'd like (by re-hashing it). The creators of +Substrate have **deprecated the use of non-transparent hashers** within FRAME-based runtimes, so +this information is provided primarily for completeness. In fact, it is _necessary_ to use a +transparent hashing algorithm if you would like to access [iterable map](#Iterable-Storage-Maps) +capabilities. Refer to [the advanced storage documentation](../advanced/storage#Storage-Map-Keys) to +learn more about the important capabilities that transparent hashing algorithms expose. 
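The idea of a transparent hasher can be sketched as follows. This is a minimal illustration, not Substrate's implementation: the standard library's `DefaultHasher` is used purely as a stand-in for Blake2 or TwoX (its output differs from both and is not guaranteed to be stable across Rust releases), and the function names are hypothetical. Only the layout, a fixed-size digest followed by the unhashed input, mirrors what the text describes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Stand-in digest (assumption): 8 bytes produced by the std hasher, NOT Blake2 or TwoX.
fn stand_in_hash(input: &[u8]) -> [u8; 8] {
    let mut hasher = DefaultHasher::new();
    hasher.write(input);
    hasher.finish().to_le_bytes()
}

/// "Concat" style hashing: the digest followed by the original input.
fn concat_hash(input: &[u8]) -> Vec<u8> {
    let mut out = stand_in_hash(input).to_vec();
    out.extend_from_slice(input);
    out
}

/// Recover the original input from a concat-style hash and verify it by re-hashing.
fn recover_and_verify(hashed: &[u8]) -> Option<&[u8]> {
    if hashed.len() < 8 {
        return None;
    }
    let (digest, original) = hashed.split_at(8);
    let expected = stand_in_hash(original);
    if digest == &expected[..] {
        Some(original)
    } else {
        None
    }
}
```

Because the input travels with the digest, anyone holding the hashed key can read the original key back and check it, which is the property that iterating over a map's keys relies on.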
+ +##### Common Substrate Hashers + +This table lists some common hashers used in Substrate and denotes those that are cryptographic and +those that are transparent: + +| Hasher | Cryptographic | Transparent | +| ------------------------------------------------------------------------------------------ | ------------- | ----------- | +| [Blake2 128 Concat](https://crates.parity.io/frame_support/struct.Blake2_128Concat.html) | X | X | +| [TwoX 64 Concat](https://crates.parity.io/frame_support/struct.Twox64Concat.html) | | X | +| [Identity](https://crates.parity.io/frame_support/struct.Identity.html) | | | +| [Blake2 128](https://crates.parity.io/frame_support/struct.Blake2_128.html) **DEPRECATED** | X | | +| [TwoX 128](https://crates.parity.io/frame_support/struct.Twox128.html) **DEPRECATED** | | | + +The Identity hasher encapsulates a hashing algorithm that has an output equal to its input (the +identity function). This type of hasher should only be used when the starting key is already a +cryptographic hash. + +## Declaring Storage Items + +You can use +[the `decl_storage` macro](https://crates.parity.io/frame_support/macro.decl_storage.html) to easily +create new runtime storage items. Here is an example of what it looks like to declare each type of +storage item: + +```rust +decl_storage! { + trait Store for Module<T: Trait> as Example { + SomePrivateValue: u32; + pub SomePrimitiveValue get(fn some_primitive_value): u32; + // types can make use of the generic `T: Trait` + pub SomeComplexValue: T::AccountId; + pub SomeMap get(fn some_map): map hasher(blake2_128_concat) T::AccountId => u32; + pub SomeDoubleMap: double_map hasher(blake2_128_concat) u32, hasher(blake2_128_concat) T::AccountId => u32; + } +} +``` + +Notice that the map storage items specify [the hashing algorithm](#Hashing-Algorithms) that will +be used. + +### Visibility + +In the example above, all the storage items except `SomePrivateValue` are made public by way of the +`pub` keyword. 
Blockchain storage is always publicly +[visible from _outside_ of the runtime](#Accessing-Storage-Items); the visibility of Substrate +storage items only impacts whether or not other pallets _within_ the runtime will be able to access +a storage item. + +### Getter Methods + +The `decl_storage` macro provides an optional `get` extension that can be used to implement a getter +method for a storage item on the module that contains that storage item; the extension takes the +desired name of the getter function as an argument. If you omit this optional extension, you will +still be able to access the storage item's value, but you will not be able to do so by way of a +getter method implemented on the module; instead, you will need to use +[the storage item's `get` method](#Methods). Keep in mind that the optional `get` extension only +impacts the way that the storage item can be accessed from within Substrate code; you will always be +able to [query the storage of your runtime](../advanced/storage#Querying-Storage) to get the value +of a storage item. + +Here is an example that implements a getter method named `some_value` for a Storage Value named +`SomeValue`. This module would now have access to a `Self::some_value()` method in addition to the +`SomeValue::get()` method: + +```rust +decl_storage! { + trait Store for Module<T: Trait> as Example { + pub SomeValue get(fn some_value): u64; + } +} +``` + +### Default Values + +Substrate allows you to specify a default value that is returned when a storage item's value is not +set. The default value does **not** actually occupy runtime storage, but runtime logic will see this +value during execution. + +Here is an example of specifying the default value for all items in a map: + +```rust +decl_storage! 
{ + trait Store for Module<T: Trait> as Example { + pub SomeMap: map u64 => u64 = 1337; + } +} +``` + +### Genesis Configuration + +Substrate's runtime storage APIs include capabilities to initialize storage items in the genesis +block of your blockchain. The genesis storage configuration APIs expose a number of mechanisms for +initializing storage, all of which have entry points in the `decl_storage` macro. These mechanisms +all result in the creation of a `GenesisConfig` data type that implements +[the `BuildModuleGenesisStorage` trait](https://crates.parity.io/sp_runtime/trait.BuildModuleGenesisStorage.html) +and will be added to the module that contains the storage items (e.g. +[`Struct pallet_balances::GenesisConfig`](https://crates.parity.io/pallet_balances/struct.GenesisConfig.html)); +storage items that are tagged for genesis configuration will have a corresponding attribute on this +data type. In order to consume a module's genesis configuration capabilities, you must include the +`Config` element when adding the module to your runtime with +[the `construct_runtime` macro](https://crates.parity.io/frame_support/macro.construct_runtime.html). +All the `GenesisConfig` types for the modules that inform a runtime will be aggregated into a single +`GenesisConfig` type for that runtime, which implements +[the `BuildStorage` trait](https://crates.parity.io/sp_runtime/trait.BuildStorage.html) (e.g. +[`Struct node_template_runtime::GenesisConfig`](https://crates.parity.io/node_template_runtime/struct.GenesisConfig.html)); +each attribute on this type corresponds to a `GenesisConfig` from one of the runtime's modules. +Ultimately, the runtime's `GenesisConfig` is exposed by way of +[the `ChainSpec` trait](https://crates.parity.io/sc_chain_spec/trait.ChainSpec.html). 
For a complete +and concrete example of using Substrate's genesis storage configuration capabilities, refer to the +`decl_storage` macro in +[the Society pallet](https://github.com/paritytech/substrate/blob/master/frame/society/src/lib.rs) +as well as the genesis configuration for the Society pallet's storage in +[the chain specification that ships with the Substrate code base](https://github.com/paritytech/substrate/blob/master/bin/node/cli/src/chain_spec.rs). +Keep reading for more detailed descriptions of these capabilities. + +#### `config` + +When you use the `decl_storage` macro to declare a storage item, you can provide an optional +`config` extension that will add an attribute to the pallet's `GenesisConfig` data type; the value +of this attribute will be used as the initial value of the storage item in your chain's genesis +block. The `config` extension takes a parameter that will determine the name of the attribute on the +`GenesisConfig` data type; this parameter is optional if [the `get` extension](#Getter-Methods) is +provided (the name of the `get` function is used as the attribute's name). + +Here is an example that demonstrates using the `config` extension with a Storage Value named `MyVal` +to create an attribute named `init_val` on the `GenesisConfig` data type for the Storage Value's +module. This attribute is then used in an example that demonstrates using the `GenesisConfig` types +to set the Storage Value's initial value in your chain's genesis block. + +In `my_module/src/lib.rs`: ```rust decl_storage! { - trait Store for Module<T: Trait> as Example { - pub SomeValue: u64; - pub SomeMap: map u64 => u64; - pub SomeLinkedMap: linked_map u64 => u64; - pub SomeDoubleMap: double_map u64, blake2_256(u64) => u64; - } + trait Store for Module<T: Trait> as MyModule { + pub MyVal get(fn my_val) config(init_val): u64; + } } ``` -## Default Value +In `chain_spec.rs`: -Substrate allows you to define the default value which is returned when the storage item is not set. 
-This value is **not** actually stored in the runtime storage, but runtime logic will see this value -during execution. +```rust +GenesisConfig { + my_module: Some(MyModuleConfig { + init_val: 221u64 + SOME_CONSTANT_VALUE, + }), +} +``` -Here is an example for setting a default value for all items in a map: +#### `build` + +Whereas [the `config` extension](#config) to the `decl_storage` macro allows you to configure a +module's genesis storage state within a chain specification, the `build` extension allows you to +perform this same task within the module itself (this gives you access to the module's private +functions). Like `config`, the `build` extension accepts a single parameter, but in this case the +parameter is always required and must be a closure, which is essentially a function. The `build` +closure will be invoked with a single parameter whose type will be the pallet's `GenesisConfig` type +(this gives you easy access to all the attributes of the `GenesisConfig` type). You may use the +`build` extension along with the `config` extension for a single storage item; in this case, the +pallet's `GenesisConfig` type will have an attribute that corresponds to what was set using `config` +whose value will be set in the chain specification, but it will be the value returned by the `build` +closure that will be used to set the storage item's genesis value. + +Here is an example that demonstrates using `build` to set the initial value of a storage item. In +this case, the example involves two storage items: one that represents a list of member account IDs +and another that designates a special member from the list, the prime member. The list of members is +provided by way of the `config` extension and the prime member, who is assumed to be the first +element in the list of members, is set using the `build` extension. + +In `my_module/src/lib.rs`: ```rust decl_storage! 
{ - trait Store for Module<T: Trait> as Example { - pub SomeMap: map u64 => u64 = 1337; - } + trait Store for Module<T: Trait> as MyModule { + pub Members config(orig_ids): Vec<T::AccountId>; + pub Prime build(|config: &GenesisConfig<T>| config.orig_ids.first().cloned()): T::AccountId; + } +} +``` + +In `chain_spec.rs`: + +```rust +GenesisConfig { + my_module: Some(MyModuleConfig { + orig_ids: LIST_OF_IDS, + }), +} +``` + +#### `add_extra_genesis` + +The `add_extra_genesis` extension to the `decl_storage` macro allows you to define a scope where the +[`config`](#config) and [`build`](#build) extensions can be provided without the need to bind them +to specific storage items. You can use `config` within an `add_extra_genesis` scope to add an +attribute to the pallet's `GenesisConfig` data type that can be used within any `build` closure. The +`build` closures that are defined within an `add_extra_genesis` scope can be used to execute logic +without binding that logic's return value to the value of a particular storage item; this may be +desirable if you wish to invoke a private helper function within your module that sets several +storage items or invoke a function defined on some other module included within your module. + +Here is an example that encapsulates the same use case described above in the example for `build`: a +module that maintains a list of member account IDs along with a designated prime member. In this +case, however, the `add_extra_genesis` extension is used to define a `GenesisConfig` attribute that +is not bound to a particular storage item as well as a `build` closure that will call a private +function on the module to set the values of multiple storage items. For the purposes of this +example, the implementation of the private helper function (`initialize_members`) is left to your +imagination. + +In `my_module/src/lib.rs`: + +```rust +decl_storage!
{ + trait Store for Module<T: Trait> as MyModule { + pub Members: Vec<T::AccountId>; + pub Prime: T::AccountId; + } + add_extra_genesis { + config(orig_ids): Vec<T::AccountId>; + build(|config| Module::<T>::initialize_members(&config.orig_ids)) + } +} +``` + +In `chain_spec.rs`: + +```rust +GenesisConfig { + my_module: Some(MyModuleConfig { + orig_ids: LIST_OF_IDS, + }), } ``` -## Verify First, Write Last +## Accessing Storage Items + +Blockchains that are built with Substrate expose a remote procedure call (RPC) server that can be +used to query your blockchain's runtime storage. You can use software libraries like +[Polkadot JS](https://polkadot.js.org/) to easily interact with the RPC server from your code and +access storage items. The Polkadot JS team also maintains +[the Polkadot Apps UI](https://polkadot.js.org/apps), which is a fully-featured web app for +interacting with Substrate-based blockchains, including querying storage. Refer to +[the advanced storage documentation](../advanced/storage) to learn more about how Substrate uses a +key-value database to implement the different kinds of storage items and how to query this database +directly by way of the RPC server. + +## Best Practices -TODO +Substrate's goal is to provide a flexible framework that allows people to build the blockchain that +suits their needs - the creators of Substrate tend not to think in terms of "right" or "wrong". That +being said, the Substrate codebase adheres to a number of best practices in order to promote the +creation of blockchain networks that are secure, performant, and maintainable in the long term. The +following sections outline best practices for using Substrate storage and also describe the +important first principles that motivated them. -## Storage Cache +### What to Store -TODO +Remember, the fundamental principle of blockchain runtime storage is to minimize its use. Only +_consensus-critical_ data should be stored in your runtime.
When possible, use techniques like +hashing to reduce the amount of data you must store. For instance, many of Substrate's governance +capabilities (e.g. +[the Democracy pallet's `propose` dispatchable](https://crates.parity.io/pallet_democracy/enum.Call.html#variant.propose)) +allow network participants to vote on the _hash_ of a dispatchable call, which is always bounded in +size, as opposed to the call itself, which may be unbounded in length. This is especially true in +the case of runtime upgrades where the dispatchable call takes an entire runtime Wasm blob as its +parameter. Because these governance mechanisms are implemented _on-chain_, all the information that +is needed to come to consensus on the state of a given proposal must also be stored on-chain - this +includes _what_ is being voted on. However, by binding an on-chain proposal to its hash, Substrate's +governance mechanisms allow this to be done in a way that defers bringing all the data associated +with a proposal on-chain until _after_ it has been approved. This means that storage is not wasted +on proposals that fail. Once a proposal has passed, someone can initiate the actual dispatchable +call (including all its parameters), which will be hashed and compared to the hash in the proposal. +Another common pattern for using hashes to minimize data that is stored on-chain is to store the +pre-image associated with an object in [IPFS](https://ipfs.io/); this means that only the IPFS +location (a hash that is bounded in size) needs to be stored on-chain. -## Child Storage Tries +Hashes are only one mechanism that can be used to control the size of runtime storage. An example of +another mechanism is [bounds](#Create-Bounds). -TODO +### Verify First, Write Last + +Substrate does not cache state prior to extrinsic dispatch. Instead, it applies changes directly as +they are invoked. If an extrinsic fails, any state changes will persist. 
Because of this, it is +important not to make any storage mutations until it is certain that all preconditions have been +met. In general, code blocks that may result in mutating storage should be structured as follows: + +```rust +{ + // all checks and throwing code go here + + // ** no throwing code below this line ** + + // all event emissions & storage writes go here +} +``` + +Do not use runtime storage to store intermediate or transient data within the context of an +operation that is logically atomic or data that will not be needed if the operation is to fail. This +does not mean that runtime storage should not be used to track the state of ongoing actions that +require multiple atomic operations, as in the case of +[the multi-signature capabilities from the Utility pallet](https://crates.parity.io/pallet_utility/enum.Call.html#variant.as_multi). +In this case, runtime storage is used to track the signatories on a dispatchable call even though a +given call may never receive enough signatures to actually be invoked. In this case, each signature +is considered an atomic event in the ongoing multi-signature operation; the data needed to record a +single signature is not stored until after all the preconditions associated with that signature have +been met. + +### Create Bounds + +Creating bounds on the size of storage items is an extremely effective way to control the use of +runtime storage and one that is used repeatedly throughout the Substrate codebase. In general, any +storage item whose size is determined by user action should have a bound on it. +[The multi-signature capabilities from the Utility pallet](https://crates.parity.io/pallet_utility/trait.Trait.html#associatedtype.MaxSignatories) +that were described above are one such example. In this case, the list of signatories associated +with a multi-signature operation is provided by the multi-signature participants. 
Because this +signatory list is [necessary to come to consensus](#What-to-Store) on the state of the +multi-signature operation, it must be stored in the runtime. However, in order to give runtime +developers control over how much space in storage these lists may occupy, the Utility pallet +requires users to configure a bound on this number that will be included as a +[precondition](#Verify-First-Write-Last) before anything is written to storage. ## Next Steps ### Learn More -TODO +Read [the advanced storage documentation](../advanced/storage). ### Examples -- View this example to see how you can use a `double_map` to act as a `killable` single-map. +Check out +[the Substrate Recipes section on storage](https://substrate.dev/recipes/3-entrees/storage-api/index.html). ### References - Visit the reference docs for the - [`decl_storage!` macro](https://substrate.dev/rustdocs/master/frame_support/macro.decl_storage.html) - more details possible storage declarations. - + [`decl_storage!` macro](https://crates.parity.io/frame_support/macro.decl_storage.html) for more + details about the available storage declarations. - Visit the reference docs for - [StorageValue](https://substrate.dev/rustdocs/master/frame_support/storage/trait.StorageValue.html), - [StorageMap](https://substrate.dev/rustdocs/master/frame_support/storage/trait.StorageMap.html), - [StorageLinkedMap](https://substrate.dev/rustdocs/master/frame_support/storage/trait.StorageLinkedMap.html), - and - [StorageDoubleMap](https://substrate.dev/rustdocs/master/frame_support/storage/trait.StorageDoubleMap.html) - to learn more about their API. + [StorageValue](https://crates.parity.io/frame_support/storage/trait.StorageValue.html), + [StorageMap](https://crates.parity.io/frame_support/storage/trait.StorageMap.html) and + [StorageDoubleMap](https://crates.parity.io/frame_support/storage/trait.StorageDoubleMap.html) to + learn more about their APIs.