Skip to content
This repository has been archived by the owner on Aug 19, 2020. It is now read-only.

Commit

Permalink
Runtime Storage (#38)
Browse files Browse the repository at this point in the history
* [WIP] Runtime storage

Closes #5

* Add section on calculating storage keys and using RPC to query storage

* Use 120-character line length

* Document methods for Storage Items

* Small changes to runtime storage document

* Address @thiolliere PR comments

* Deemphasize redudancy with Rust docs by collapsing sections about Storage Item APIs
* Distinguish APIs for StorageMap and StorageDoubleMap
* Clarify StorageMap implementation

* Address @thiolliere PR comments

* Clarify Storage Map implementation
* Clarify iterable storage map principals

* Formatting

* Clarify the `get` extension to `decl_storage`

* Clarifications and improvements from @joepetrowski

* Remove iterable storage maps from top-level listing of storage items

Per suggestion of @thiolliere and @shawntabrizi

* Rework Iterable Storage Map documentation

* Structural changes per @shawntabrizi

* Structural changes per @shawntabrizi (advanced storage)

* Small changes for readability and elaboration

* Prettier

* Add section on best practices and address suggestions from @joepetrowski

* Small clarifications

* Storage item is not a proper noun

* Use crates.parity.io for Rustdoc links

* Genesis configuration

* Small enhancements per @kianenigma

* Non-transparent hashers are deprecated

* Clarify generic storage types

Co-authored-by: thiolliere <[email protected]>

* Clarifications per review by @thiolliere

* final nits

Co-authored-by: thiolliere <[email protected]>
Co-authored-by: joepetrowski <[email protected]>
  • Loading branch information
3 people authored May 8, 2020
1 parent 4fc4c59 commit a214b8b
Show file tree
Hide file tree
Showing 2 changed files with 591 additions and 58 deletions.
143 changes: 133 additions & 10 deletions current/advanced/storage.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,13 +5,16 @@ title: Storage
---

Substrate uses a simple key-value data store implemented as a database-backed, modified Merkle tree.
All of Substrate's [higher-lever storage abstractions](../runtime/storage) are built on top of this
simple key-value store.

## Key-Value Database

Substrate implements its storage database with [RocksDB](https://rocksdb.org/), a persistent
key-value store for fast storage environments.
key-value store for fast storage environments. It also supports an experimental
[Parity DB](https://github.com/paritytech/parity-db).

This is used for all the components of Substrate that require persistent storage, such as:
The DB is used for all the components of Substrate that require persistent storage, such as:

- Substrate clients
- Substrate light-clients
Expand All @@ -34,14 +37,13 @@ simply comparing their trie roots.
Accessing trie data is costly. Each read operation takes O(log N) time, where N is the number of
elements stored in the trie. To mitigate this, we use a key-value cache.

All trie nodes are stored in RocksDB and part of the trie state can get pruned, i.e. a key-value
pair can be deleted from the storage when it is out of pruning range for non-archive nodes. We do
not use [reference counting](http://en.wikipedia.org/wiki/Reference_counting) for performance
reasons.
All trie nodes are stored in the DB and part of the trie state can get pruned, i.e. a key-value pair
can be deleted from storage when it is out of pruning range for non-archive nodes. We do not use
[reference counting](http://en.wikipedia.org/wiki/Reference_counting) for performance reasons.

### State Trie

Substrate based chains have a single main trie, called the state trie, whose root hash is placed in
Substrate-based chains have a single main trie, called the state trie, whose root hash is placed in
each block header. This is used to easily verify the state of the blockchain and provide a basis for
light clients to verify proofs.

Expand All @@ -64,11 +66,132 @@ can use to verify the specific content in that trie. Subsections of a trie do no
root-hash-like representation that satisfy these needs automatically; thus a child trie is used
instead.

## Querying Storage

Blockchains that are built with Substrate expose a remote procedure call (RPC) server that can be
used to query runtime storage. When you use the Substrate RPC to access a storage item, you only
need to provide [the key](#Key-Value-Database) associated with that item.
[Substrate's runtime storage APIs](../runtime/storage) expose a number of storage item types; keep
reading to learn how to calculate storage keys for the different types of storage items.

### Storage Value Keys

To calculate the key for a simple [Storage Value](../runtime/storage#Storage-Value), take the
[TwoX 128 hash](https://github.com/Cyan4973/xxHash) of the name of the module that contains the
Storage Value and append to it the TwoX 128 hash of the name of the Storage Value itself. For
example, the [Sudo](https://substrate.dev/rustdocs/master/pallet_sudo/index.html) pallet exposes a
Storage Value item named
[`Key`](https://substrate.dev/rustdocs/master/pallet_sudo/struct.Module.html#method.key):

```
twox_128("Sudo") = "0x5c0d1176a568c1f92944340dbfed9e9c"
twox_128("Key) = "0x530ebca703c85910e7164cb7d1c9e47b"
twox_128("Sudo") + twox_128("Key") = "0x5c0d1176a568c1f92944340dbfed9e9c530ebca703c85910e7164cb7d1c9e47b"
```

If the familiar `Alice` account is the sudo user, an RPC request and response to read the Sudo
module's `Key` Storage Value could be represented as:

```
state_getStorage("0x5c0d1176a568c1f92944340dbfed9e9c530ebca703c85910e7164cb7d1c9e47b") = "0xd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d"
```

In this case, the value that is returned
(`"0xd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d"`) is Alice's
[SCALE](./codec)-encoded account ID (`5GrwvaEF5zXb26Fz9rcQpDWS57CtERHpNehXCPcNoHGKutQY`).

You may have noticed that the
[non-cryptographic](../runtime/storage#Cryptographic-Hashing-Algorithms) TwoX 128 hash algorithm is
used to generate Storage Value keys. This is because it is not necessary to pay the performance
costs associated with a cryptographic hash function since the input to the hash function (the names
of the module and storage item) are determined by the runtime developer and not by potentially
malicious users of your blockchain.

### Storage Map Keys

Like Storage Values, the keys for [Storage Maps](../runtime/storage#StorageMaps) are equal to the
TwoX 128 hash of the name of the module that contains the map prepended to the TwoX 128 hash of the
name of the Storage Map itself. To retrieve an element from a map, simply append the hash of the
desired map key to the storage key of the Storage Map. For maps with two keys (Storage Double Maps),
append the hash of the first map key followed by the hash of the second map key to the Storage
Double Map's storage key. Like Storage Values, Substrate will use the TwoX 128 hashing algorithm for
the module and Storage Map names, but you will need to make sure to use the correct
[hashing algorithm](../runtime/storage#Hashing-Algorithms) (the one that was declared in
[the `decl_storage` macro](../runtime/storage#Declaring-Storage-Items)) when determining the hashed
keys for the elements in a map.

Here is an example that illustrates querying a Storage Map named `FreeBalance` from a module named
"Balances" for the balance of the familiar `Alice` account. In this example, the `FreeBalance` map
is using
[the transparent Blake2 128 Concat hashing algorithm](../runtime/storage#Transparent-Hashing-Algorithms):

```
twox_128("Balances) = "0xc2261276cc9d1f8598ea4b6a74b15c2f"
twox_128("FreeBalance") = "0x6482b9ade7bc6657aaca787ba1add3b4"
scale_encode("5GrwvaEF5zXb26Fz9rcQpDWS57CtERHpNehXCPcNoHGKutQY") = "0xd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d"
blake2_128_concat("0xd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d") = "0xde1e86a9a8c739864cf3cc5ec2bea59fd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d"
state_getStorage("0xc2261276cc9d1f8598ea4b6a74b15c2f6482b9ade7bc6657aaca787ba1add3b4de1e86a9a8c739864cf3cc5ec2bea59fd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d") = "0x0000a0dec5adc9353600000000000000"
```

The value that is returned from the storage query (`"0x0000a0dec5adc9353600000000000000"` in the
example above) is the [SCALE](./codec)-encoded value of Alice's account balance
(`"1000000000000000000000"` in this example). Notice that before hashing Alice's account ID it has
to be SCALE-encoded. Also notice that the output of the `blake2_128_concat` function consists of 32
hexadecimal characters followed by the function's input. This is because the Blake2 128 Concat is
[a transparent hashing algorithm](../runtime/storage#Transparent-Hashing-Algorithms). Although the
above example may make this characteristic seem superfluous, its utility becomes more apparent when
the goal is to iterate over the keys in a map (as opposed to retrieving the value associated with a
single key). The ability to iterate over the keys in a map is a common requirement in order to allow
_people_ to use the map in a way that seems natural (such as UIs): first, a user is presented with a
list of elements in the map, then, that user can select the element that they are interested in and
query the map for more details about that particular element. Here is another example that uses the
same example Storage Map (a map named `FreeBalances` that uses a Blake2 128 Concat hashing algorithm
in a module named "Balances") that will demonstrate using the Substrate RPC to query a Storage Map
for its list of keys via the `state_getKeys` RPC endpoint:

```
twox_128("Balances) = "0xc2261276cc9d1f8598ea4b6a74b15c2f"
twox_128("FreeBalance") = "0x6482b9ade7bc6657aaca787ba1add3b4"
state_getKeys("0xc2261276cc9d1f8598ea4b6a74b15c2f6482b9ade7bc6657aaca787ba1add3b4") = [
"0xc2261276cc9d1f8598ea4b6a74b15c2f6482b9ade7bc6657aaca787ba1add3b4de1e86a9a8c739864cf3cc5ec2bea59fd43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d",
"0xc2261276cc9d1f8598ea4b6a74b15c2f6482b9ade7bc6657aaca787ba1add3b432a5935f6edc617ae178fef9eb1e211fbe5ddb1579b72e84524fc29e78609e3caf42e85aa118ebfe0b0ad404b5bdd25f",
...
]
```

Each element in the list that is returned by the Substrate RPC's `state_getKeys` endpoint can be
directly used as input for the RPC's `state_getStorage` endpoint. In fact, the first element in the
example list above is equal to the input used for the `state_getStorage` query in the previous
example (the one used to find the balance for `Alice`). Because the map that these keys belong to
uses a transparent hashing algorithm to generate its keys, it is possible to determine the account
associated with the second element in the list. Notice that each element in the list is a
hexadecimal value that begins with the same 64 characters; this is because each list element
represents a key in the same map, and that map is identified by concatenating two TwoX 128 hashes,
each of which are 128-bits or 32 hexadecimal characters. After discarding this portion of the second
element in the list, you are left with
`0x32a5935f6edc617ae178fef9eb1e211fbe5ddb1579b72e84524fc29e78609e3caf42e85aa118ebfe0b0ad404b5bdd25f`.

You saw in the previous example that this represents the Blake2 128 Concat hash of some
[SCALE](./codec)-encoded account ID. The Blake 128 Concat hashing algorithm consists of appending
(concatenating) the hashing algorithm's input to its Blake 128 hash. This means that the first 128
bits (or 32 hexadecimal characters) of a Blake2 128 Concat hash represents a Blake2 128 hash, and
the remainder represents the value that was passed to the Blake 2 128 hashing algorithm. In this
example, after you remove the first 32 hexadecimal characters that represent the Blake2 128 hash
(i.e. `0x32a5935f6edc617ae178fef9eb1e211f`) what is left is the hexadecimal value
`0xbe5ddb1579b72e84524fc29e78609e3caf42e85aa118ebfe0b0ad404b5bdd25f`, which is a
[SCALE](./codec)-encoded account ID. Decoding this value yields the result
`5GNJqTPyNqANBkUVMN1LPPrxXnFouWXoe2wNSmmEoLctxiZY`, which is the account ID for the familiar
`Alice_Stash` account.

## Runtime Storage API

The Substrate's [Support module](https://substrate.dev/rustdocs/master/frame_support/index.html)
provides utilities to generate unique, deterministic keys for your runtime module storage items.
These storage items are placed in the state trie and are accessible by querying the trie by key.
Substrate's [FRAME Support crate](https://substrate.dev/rustdocs/master/frame_support/index.html)
provides utilities for generating unique, deterministic keys for your runtime's storage items. These
storage items are placed in the [state trie](#Trie-Abstraction) and are accessible by
[querying the trie by key](#Querying-Storage).

## Next Steps

Expand Down
Loading

0 comments on commit a214b8b

Please sign in to comment.