slug | lang | title |
---|---|---|
storage | en | Runtime Storage |
Runtime storage allows you to store data in your blockchain that is persisted between blocks and can be accessed from within your runtime logic. Storage should be one of the most critical concerns of a blockchain runtime developer, since one of the primary objectives of a blockchain is to provide decentralized consensus about the state of the underlying storage. Furthermore, well-designed storage systems reduce the load on nodes in the network, which lowers the overhead for participants in your blockchain. Substrate exposes a set of layered, modular storage APIs that allow runtime developers to make the storage decisions that suit them best. However, the fundamental principle of blockchain runtime storage is to minimize its use.
This document is intended to provide information and best practices about Substrate's runtime storage interfaces. Please refer to the advanced storage documentation for more information about how these interfaces are implemented.
The storage module in FRAME Support gives runtime developers access to Substrate's flexible storage APIs. Any value that can be encoded by the Parity SCALE codec is supported by these storage APIs:
- Storage Value - A single value
- Storage Map - A key-value hash map
- Storage Double Map - An implementation of a map with two keys that provides the important ability to efficiently remove all entries that have a common first key
The type of Storage Item you select should depend on the logical way in which the value will be used by your runtime.
Storage Values should be used for values that are viewed as a single unit by the runtime, whether that is a single primitive value, a single struct, or a single collection of related items. Although wrapping related items in a shared struct is an excellent way to reduce the number of storage reads (an important consideration), at some point the size of the object will begin to incur costs that may outweigh the optimization in storage reads. Storage Values can be used to store lists of items, but runtime developers should take care with respect to the size of these lists. Large lists incur storage costs just like large structs. Furthermore, iterating over a large list in your runtime may result in exceeding the block production time. If this occurs, your blockchain will stop producing blocks, which means that it will stop functioning.
Refer to the Storage Value documentation for a comprehensive list of the methods that Storage Values expose. Some of the most important methods are summarized here:
- get() - Load the value from storage.
- put(val) - Store the provided value.
- mutate(fn) - Mutate the value with the provided function.
- take() - Load the value and remove it from storage.
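To make the semantics of these four methods concrete, here is a minimal sketch that models a Storage Value as an in-memory slot. It is not the FRAME API: the `ToyStorageValue` type and its `exists` helper are hypothetical and written in plain Rust, whereas real Storage Values are declared through the runtime's storage macro and backed by the node's database. The sketch also assumes that reading an unset value yields the type's default.

```rust
/// Toy in-memory stand-in for a Storage Value (hypothetical, not the FRAME type).
#[derive(Default)]
pub struct ToyStorageValue<T> {
    slot: Option<T>,
}

impl<T: Clone + Default> ToyStorageValue<T> {
    /// Load the value; an unset slot reads as `T::default()` (an assumption of this model).
    pub fn get(&self) -> T {
        self.slot.clone().unwrap_or_default()
    }

    /// Store the provided value, overwriting any previous one.
    pub fn put(&mut self, val: T) {
        self.slot = Some(val);
    }

    /// Read the current value, apply `f` to it, and write the result back.
    pub fn mutate<R>(&mut self, f: impl FnOnce(&mut T) -> R) -> R {
        let mut val = self.get();
        let result = f(&mut val);
        self.slot = Some(val);
        result
    }

    /// Load the value and clear the slot.
    pub fn take(&mut self) -> T {
        self.slot.take().unwrap_or_default()
    }

    /// Whether the slot currently holds a value.
    pub fn exists(&self) -> bool {
        self.slot.is_some()
    }
}
```

Note that `mutate` is a read followed by a write, so it costs at least as much as calling `get` and `put` separately; its benefit is keeping the update in one place.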
Map data structures are ideal for managing sets of items whose elements will be accessed randomly, as opposed to iterating over them sequentially in their entirety. Storage Maps in Substrate are implemented as key-value hash maps, which is a pattern that should be familiar to most developers. In order to give blockchain engineers increased control, Substrate allows developers to select the hashing algorithm that is used to generate a map's keys. Refer to the advanced storage documentation to learn more about how Substrate's Storage Maps are implemented.
Storage Maps expose an API that is similar to that of Storage Values.
- get - Load the value associated with the provided key from storage. Docs: StorageMap#get(key), StorageDoubleMap#get(key1, key2)
- insert - Store the provided value by associating it with the given key. Docs: StorageMap#insert(key, val), StorageDoubleMap#insert(key1, key2, val)
- mutate - Use the provided function to mutate the value associated with the given key. Docs: StorageMap#mutate(key, fn), StorageDoubleMap#mutate(key1, key2, fn)
- take - Load the value associated with the given key and remove it from storage. Docs: StorageMap#take(key), StorageDoubleMap#take(key1, key2)
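The following sketch illustrates the two-key variants of the methods listed above, together with the removal of all entries that share a first key, which is the main reason to choose a Storage Double Map. It is a toy in-memory model, not the FRAME implementation: `ToyDoubleMap`, its key and value types, and the `remove_prefix` and `len` helpers are assumptions made for illustration, and missing entries are assumed to read as the default value.

```rust
use std::collections::BTreeMap;

/// Toy in-memory stand-in for a Storage Double Map (hypothetical, not the FRAME type).
#[derive(Default)]
pub struct ToyDoubleMap {
    entries: BTreeMap<(u32, String), u64>,
}

impl ToyDoubleMap {
    /// Load the value stored under both keys; missing entries read as 0.
    pub fn get(&self, k1: u32, k2: &str) -> u64 {
        self.entries
            .get(&(k1, k2.to_string()))
            .copied()
            .unwrap_or_default()
    }

    /// Store `val` under the pair of keys.
    pub fn insert(&mut self, k1: u32, k2: &str, val: u64) {
        self.entries.insert((k1, k2.to_string()), val);
    }

    /// Read the value, apply `f`, and write the result back.
    pub fn mutate(&mut self, k1: u32, k2: &str, f: impl FnOnce(&mut u64)) {
        let mut val = self.get(k1, k2);
        f(&mut val);
        self.insert(k1, k2, val);
    }

    /// Load the value and remove the entry.
    pub fn take(&mut self, k1: u32, k2: &str) -> u64 {
        self.entries
            .remove(&(k1, k2.to_string()))
            .unwrap_or_default()
    }

    /// Remove every entry whose first key is `k1`.
    pub fn remove_prefix(&mut self, k1: u32) {
        self.entries.retain(|key, _| key.0 != k1);
    }

    /// Number of entries currently stored.
    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

In this model `remove_prefix` scans every entry. In an actual key-value trie, entries with the same first key share a storage-key prefix, which is what makes the removal efficient there.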
Depending on the hashing algorithm that you select to generate a map's keys, you may be able to iterate across its keys and values. Because maps are often used to track unbounded sets of data (account balances, for example), iterating over a map in its entirety within the runtime is especially likely to exceed the block production time. Furthermore, because each element of a map is stored separately, accessing the elements of a map requires more storage reads than accessing the elements of a native list, so maps are significantly more costly than lists to iterate over with respect to time.
This is not to say that it is "wrong" to iterate over maps in your runtime; in general Substrate focuses on "first principles" as opposed to hard and fast rules of right and wrong. Being efficient within the runtime of a blockchain is an important first principle of Substrate, and this information is designed to help you understand all of Substrate's storage capabilities and use them in a way that respects the first principles around which they were designed.
Substrate's Iterable Storage Map interfaces define the following methods. Note that for Iterable Storage Double Maps, the iter and drain methods require a parameter, i.e. the first key:
- iter - Enumerate all elements in the map in no particular order. If you alter the map while doing this, you'll get undefined results. Docs: IterableStorageMap#iter(), IterableStorageDoubleMap#iter(key1)
- drain - Remove all elements from the map and iterate through them in no particular order. If you add elements to the map while doing this, you'll get undefined results. Docs: IterableStorageMap#drain(), IterableStorageDoubleMap#drain(key1)
- translate - Use the provided function to translate all elements of the map, in no particular order. To remove an element from the map, return None from the translation function. Docs: IterableStorageMap#translate(fn), IterableStorageDoubleMap#translate(fn)
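The behavior of drain and translate can be sketched with a plain `HashMap`. This is only a model of the semantics described above, not the FRAME implementation; the free functions `drain_all` and `translate` and the `u32`/`u64` key and value types are hypothetical.

```rust
use std::collections::HashMap;

/// Remove all entries and hand them back to the caller (models `drain`).
/// The order of the returned entries is unspecified.
pub fn drain_all(map: &mut HashMap<u32, u64>) -> Vec<(u32, u64)> {
    map.drain().collect()
}

/// Rewrite every value with `f`; entries for which `f` returns `None`
/// are removed from the map (models `translate`).
pub fn translate(
    map: &mut HashMap<u32, u64>,
    mut f: impl FnMut(u32, u64) -> Option<u64>,
) {
    // Take the old contents out so the map is never altered while it is iterated.
    let old = std::mem::take(map);
    for (key, value) in old {
        if let Some(new_value) = f(key, value) {
            map.insert(key, new_value);
        }
    }
}
```

For example, calling `translate` with a closure that returns `None` for zero values and `Some(v * 2)` otherwise prunes the empty entries and doubles the rest in a single pass. Both functions touch every element, so the earlier caution about iterating over unbounded maps applies to them as well.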
As mentioned above, a novel feature of Substrate Storage Maps is that they allow developers to specify the hashing algorithm that will be used when generating a map's keys. A Rust object that is used to encapsulate hashing logic is referred to as a "hasher". Broadly speaking, the hashers that are available to Substrate developers can be described in two ways: whether or not they are cryptographic and whether or not they produce output that is transparent.
Cryptographic hashing algorithms are those that use cryptography to make it challenging to use the input to the hashing algorithm to influence its output. For example, a cryptographic hashing algorithm would produce a wide distribution of outputs even if the inputs were the numbers 1 through 10. It is critical to use cryptographic hashing algorithms when users are able to influence the keys of a Storage Map. Failure to do so creates an attack vector that makes it easy for malicious actors to degrade the performance of your blockchain network. An example of a map that should use a cryptographic hash algorithm to generate its keys is a map used to track account balances. In this case, it is important to use a cryptographic hashing algorithm so that an attacker cannot bombard your system with many small transfers to sequential account numbers; without a cryptographic hash algorithm this would create an imbalanced storage structure that would suffer in performance. Cryptographic hashing algorithms are more complex and resource-intensive than their non-cryptographic counterparts, which is why Substrate allows developers to select when they are used.
A transparent hashing algorithm is one that makes it easy to discover and verify the input that was used to generate a given output. In Substrate, hashing algorithms are made transparent by concatenating the algorithm's input to its output. This makes it trivial for users to retrieve a key's original unhashed value and verify it if they'd like (by re-hashing it). It is generally recommended to use transparent hashing algorithms for your runtime's Storage Maps. In fact, it is necessary to use a transparent hashing algorithm if you would like to access iterable map capabilities.
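The concatenation scheme can be sketched in a few lines. The example below is an illustration under stated assumptions, not Substrate's hasher code: it uses the standard library's `DefaultHasher` as a stand-in for Blake2 or TwoX (it is not a cryptographic hash), an 8-byte digest, and hypothetical function names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Length of the digest prefix in this sketch.
const DIGEST_LEN: usize = 8;

/// 8-byte digest of `key`. `DefaultHasher` is only a stand-in for Blake2/TwoX.
fn digest(key: &[u8]) -> [u8; DIGEST_LEN] {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish().to_be_bytes()
}

/// Transparent ("concat") hashing: the digest followed by the original key bytes.
pub fn concat_hash(key: &[u8]) -> Vec<u8> {
    let mut out = digest(key).to_vec();
    out.extend_from_slice(key);
    out
}

/// Recover the original key from a transparent hash and verify it by re-hashing.
pub fn recover_key(hashed: &[u8]) -> Option<&[u8]> {
    if hashed.len() < DIGEST_LEN {
        return None;
    }
    let (prefix, key) = hashed.split_at(DIGEST_LEN);
    if &digest(key)[..] == prefix {
        Some(key)
    } else {
        None
    }
}
```

Because the unhashed key is always present after the digest, anyone holding a storage key can read the original key back, which is what makes iterating over keys possible. The cost is that each storage key grows by the length of the original key.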
This table lists some common hashers used in Substrate and denotes those that are cryptographic and those that are transparent:
Hasher | Cryptographic | Transparent |
---|---|---|
Blake2 128 | X | |
TwoX 128 | | |
Blake2 128 Concat | X | X |
TwoX 64 Concat | | X |
Identity | | X |
The Identity hasher encapsulates a hashing algorithm that has an output equal to its input (the identity function). This type of hasher should only be used when the starting key is already a cryptographic hash.
You can use the decl_storage macro to easily create new runtime storage items. Here is an example of what it looks like to declare each type of storage item:
decl_storage! {
    trait Store for Module<T: Trait> as Example {
        SomePrivateValue: u32;
        pub SomePrimitiveValue get(fn some_primitive_value): u32;
        // complex types are prefaced by T::
        pub SomeComplexValue: T::AccountId;
        pub SomeMap get(fn some_map): map hasher(blake2_128_concat) T::AccountId => u32;
        pub SomeDoubleMap: double_map hasher(blake2_128_concat) u32, hasher(blake2_128_concat) T::AccountId => u32;
    }
}
Notice that the map storage items specify the hashing algorithm that will be used.
In the example above, all the storage items except SomePrivateValue
are made public by way of the
pub
keyword. Blockchain storage is always publicly
visible from outside of the runtime; the visibility of Substrate
storage items only impacts whether or not other runtime pallets will be able to access the storage
item.
The decl_storage macro provides an optional get extension that can be used to implement a getter method for a storage item on the module that contains that storage item; the extension takes the desired name of the getter function as an argument. If you omit this optional extension, you will still be able to access the storage item's value, but you will not be able to do so by way of a getter method implemented on the module; instead, you will need to use the storage item's get method. Keep in mind that the optional get extension only impacts the way that the storage item can be accessed from within Substrate code; you will always be able to query the storage of your runtime to get the value of a storage item.
Here is an example that implements a getter method named some_value for a Storage Value named SomeValue. This module would now have access to a Self::some_value() method in addition to the SomeValue::get() method:
decl_storage! {
    trait Store for Module<T: Trait> as Example {
        pub SomeValue get(fn some_value): u64;
    }
}
Substrate allows you to specify a default value that is returned when a storage item's value is not set. The default value does not actually occupy runtime storage, but runtime logic will see this value during execution.
Here is an example of specifying the default value for all items in a map:
decl_storage! {
    trait Store for Module<T: Trait> as Example {
        // the map specifies its hasher, as in the declarations shown earlier
        pub SomeMap: map hasher(blake2_128_concat) u64 => u64 = 1337;
    }
}
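The effect of a default value can be modelled without any Substrate code. The sketch below is a toy in-memory model, not the code that the macro generates: `ToyMapWithDefault`, `DEFAULT_VALUE` and `stored_entries` are hypothetical names. It shows that a missing key reads as the default while nothing is actually written for it.

```rust
use std::collections::HashMap;

/// Default returned for keys that were never written (mirrors the 1337 above).
pub const DEFAULT_VALUE: u64 = 1337;

/// Toy in-memory stand-in for a map with a default value (hypothetical type).
pub struct ToyMapWithDefault {
    entries: HashMap<u64, u64>,
}

impl ToyMapWithDefault {
    /// Create an empty map.
    pub fn new() -> Self {
        ToyMapWithDefault { entries: HashMap::new() }
    }

    /// Load the value for `key`, falling back to the default when it is unset.
    pub fn get(&self, key: u64) -> u64 {
        self.entries.get(&key).copied().unwrap_or(DEFAULT_VALUE)
    }

    /// Store a value for `key`.
    pub fn insert(&mut self, key: u64, val: u64) {
        self.entries.insert(key, val);
    }

    /// Number of entries that actually occupy storage.
    pub fn stored_entries(&self) -> usize {
        self.entries.len()
    }
}
```

Reading a key never creates an entry in this model: `stored_entries` stays at zero until `insert` is called, even though `get` already returns 1337 for every key.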
You can define an optional GenesisConfig struct in order to initialize Storage Items in the genesis block of your blockchain.
// TODO
Blockchains that are built with Substrate expose a remote procedure call (RPC) server that can be used to query your blockchain's runtime storage. You can use software libraries like Polkadot JS to easily interact with the RPC server from your code and access storage items. The Polkadot JS team also maintains the Polkadot Apps UI, which is a fully-featured web app for interacting with Substrate-based blockchains, including querying storage. Refer to the advanced storage documentation to learn more about how Substrate uses a key-value database to implement the different kinds of Storage Items and how to query this database directly by way of the RPC server.
Substrate's goal is to provide a flexible framework that allows people to build the blockchain that suits their needs; the creators of Substrate tend not to think in terms of "right" or "wrong". That being said, the Substrate codebase adheres to a number of best practices in order to promote the creation of blockchain networks that are secure, performant and maintainable in the long term. The following sections outline best practices for using Substrate storage and also describe the important first principles that motivated them.
Remember, the fundamental principle of blockchain runtime storage is to minimize its use. Only
consensus-critical data should be stored in your runtime. When possible, use techniques like
hashing to reduce the amount of data you must store. For instance, many of Substrate's governance
capabilities (e.g. the Democracy pallet's propose dispatchable) allow network participants to vote
on the hash of a dispatchable call, which is always bounded in
size, as opposed to the call itself, which may be unbounded in length. This is especially true in
the case of runtime upgrades where the dispatchable call takes an entire runtime WASM blob as its
parameter. Because these governance mechanisms are implemented on-chain, all the information that
is needed to come to consensus on the state of a given proposal must also be stored on-chain - this
includes what is being voted on. However, by binding an on-chain proposal to its hash, Substrate's
governance mechanisms allow this to be done in a way that defers bringing all the data associated
with a proposal on-chain until after it has been approved. This means that storage is not wasted
on proposals that fail. Once a proposal has passed, someone can initiate the actual dispatchable
call (including all its parameters), which will be hashed and compared to the hash in the proposal.
Another common pattern for using hashes to minimize data that is stored on-chain is to store the
metadata associated with an object in IPFS; this means that only the IPFS
location (a hash that is bounded in size) needs to be stored on-chain.
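The propose-by-hash pattern described above can be sketched as follows. This is an illustration only and not the Democracy pallet's code: `Proposal`, `propose` and `enact` are hypothetical names, the call is represented as raw bytes, and the standard library's `DefaultHasher` stands in for the cryptographic hash that a real chain would use.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Fixed-size digest of an encoded call. `DefaultHasher` is only a stand-in
/// for a cryptographic hash and must not be used this way in production.
fn call_hash(encoded_call: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    encoded_call.hash(&mut hasher);
    hasher.finish()
}

/// What is kept on-chain while voting is in progress: a bounded-size record.
pub struct Proposal {
    pub call_hash: u64,
    pub approved: bool,
}

/// Open a proposal that stores only the hash of the call, not the call itself.
pub fn propose(encoded_call: &[u8]) -> Proposal {
    Proposal {
        call_hash: call_hash(encoded_call),
        approved: false,
    }
}

/// After approval, the full call is supplied and checked against the stored hash.
pub fn enact(proposal: &Proposal, encoded_call: &[u8]) -> Result<(), &'static str> {
    if !proposal.approved {
        return Err("proposal has not been approved");
    }
    if call_hash(encoded_call) != proposal.call_hash {
        return Err("call does not match the proposed hash");
    }
    // A real runtime would dispatch the call here.
    Ok(())
}
```

The `Proposal` record has the same size whether the call is a few bytes or an entire runtime blob, and the full call only needs to be submitted once the proposal has passed.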
Hashes are only one mechanism that can be used to control the size of runtime storage. An example of another mechanism is bounds.
The state of a blockchain network's storage is immutable; data can be changed, but there will always be a record of these changes, and making them typically incurs costs. Because of this, it is important that data is only persisted to runtime storage when it is certain that all preconditions have been met. In general, code blocks that may result in adding data to storage should be structured as follows:
{
// all checks and throwing code go here
// all storage writes go here; no throwing code below this line
// all event emissions go here
}
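As a concrete instance of this structure, here is a small sketch of a balance transfer written in plain Rust. It is not pallet code: the `HashMap` stands in for runtime storage, and `transfer`, `TransferError` and `Event` are hypothetical names chosen for the example.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum TransferError {
    SameAccount,
    InsufficientBalance,
    Overflow,
}

#[derive(Debug, PartialEq)]
pub enum Event {
    Transferred { from: u32, to: u32, amount: u64 },
}

/// Move `amount` from one account to another, following "verify first, write last".
pub fn transfer(
    balances: &mut HashMap<u32, u64>,
    from: u32,
    to: u32,
    amount: u64,
) -> Result<Event, TransferError> {
    // 1. All checks and fallible computations go first; nothing is written yet.
    if from == to {
        return Err(TransferError::SameAccount);
    }
    let from_balance = balances.get(&from).copied().unwrap_or(0);
    let new_from = from_balance
        .checked_sub(amount)
        .ok_or(TransferError::InsufficientBalance)?;
    let to_balance = balances.get(&to).copied().unwrap_or(0);
    let new_to = to_balance
        .checked_add(amount)
        .ok_or(TransferError::Overflow)?;

    // 2. All storage writes; no fallible code below this line.
    balances.insert(from, new_from);
    balances.insert(to, new_to);

    // 3. Event emission (modelled here as the return value).
    Ok(Event::Transferred { from, to, amount })
}
```

If any check fails, the function returns before the first `insert`, so a rejected transfer leaves the balances exactly as they were.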
Do not use runtime storage to store intermediate or transient data within the context of an operation that is logically atomic, or data that will not be needed if the operation fails. This does not mean that runtime storage should not be used to track the state of ongoing actions that require multiple atomic operations, as in the case of the multi-signature capabilities from the Utility pallet. There, runtime storage is used to track the signatories on a dispatchable call even though a given call may never receive enough signatures to actually be invoked. Each signature is considered an atomic event in the ongoing multi-signature operation; the data needed to record a single signature is not stored until after all the preconditions associated with that signature have been met.
Creating bounds on the size of storage items is an extremely effective way to control the use of runtime storage and one that is used repeatedly throughout the Substrate codebase. In general, any storage item whose size is determined by user action should have a bound on it. The multi-signature capabilities from the Utility pallet that were described above are one such example. In this case, the list of signatories associated with a multi-signature operation is provided by the multi-signature participants. Because this signatory list is necessary to come to consensus on the state of the multi-signature operation, it must be stored in the runtime. However, in order to give runtime developers control over how much space in storage these lists may occupy, the Utility pallet requires users to configure a bound on this number that will be included as a precondition before anything is written to storage.
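A bound of this kind amounts to one more precondition that is checked before any write. The sketch below illustrates the idea in plain Rust and is not the Utility pallet's code: `MAX_SIGNATORIES`, `ToyMultisigStorage`, `register` and the error names are hypothetical, and the bound is a constant here whereas a real runtime would make it configurable.

```rust
/// Upper bound on the number of signatories per operation (assumed value).
pub const MAX_SIGNATORIES: usize = 3;

#[derive(Debug, PartialEq)]
pub enum MultisigError {
    NoSignatories,
    TooManySignatories,
}

/// Toy stand-in for runtime storage: one signatory list per operation.
#[derive(Default)]
pub struct ToyMultisigStorage {
    pub operations: Vec<Vec<u32>>,
}

/// Record a new multi-signature operation and return its index.
pub fn register(
    storage: &mut ToyMultisigStorage,
    signatories: Vec<u32>,
) -> Result<usize, MultisigError> {
    // The bound is a precondition, so it is checked before anything is stored.
    if signatories.is_empty() {
        return Err(MultisigError::NoSignatories);
    }
    if signatories.len() > MAX_SIGNATORIES {
        return Err(MultisigError::TooManySignatories);
    }
    storage.operations.push(signatories);
    Ok(storage.operations.len() - 1)
}
```

With the bound in place, the worst-case size of each stored list is known in advance, no matter what users submit.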
Read the advanced storage documentation.
Check out the Substrate Recipes section on storage.
- Visit the reference docs for the decl_storage! macro for more details about the available storage declarations.
- Visit the reference docs for StorageValue, StorageMap and StorageDoubleMap to learn more about their APIs.