Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Rework storage iterators #13284

Merged
Merged
Show file tree
Hide file tree
Changes from 10 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
523ddf5
Rework storage iterators
koute Jan 31, 2023
8497910
Make sure storage iteration is also accounted for when benchmarking
koute Jan 31, 2023
e6b35c9
Use `trie-db` from crates.io
koute Feb 3, 2023
6b640c8
Merge remote-tracking branch 'origin/master' into master_rework_stora…
koute Feb 3, 2023
2556581
Appease clippy
koute Feb 3, 2023
1f1ff7a
Bump `trie-bench` to 0.35.0
koute Feb 3, 2023
fa3edbd
Fix tests' compilation
koute Feb 3, 2023
4be0cad
Update comment to clarify how `IterArgs::start_at` works
koute Feb 3, 2023
825ba75
Add extra tests
koute Feb 3, 2023
65d99ed
Fix iterators on `Client` so that they behave as before
koute Feb 3, 2023
ab0fff6
Add extra `unwrap`s in tests
koute Feb 3, 2023
2e41604
More clippy fixes
koute Feb 3, 2023
b5ce5f5
Come on clippy, give me a break already
koute Feb 3, 2023
104894b
Merge remote-tracking branch 'origin/master' into master_rework_stora…
koute Feb 6, 2023
e049042
Rename `allow_missing` to `stop_on_incomplete_database`
koute Feb 7, 2023
079e6ab
Merge remote-tracking branch 'origin/master' into master_rework_stora…
koute Feb 13, 2023
387c600
Add `#[inline]` to `with_recorder_and_cache`
koute Feb 13, 2023
962cdbc
Use `with_recorder_and_cache` in `with_trie_db`; add doc comment
koute Feb 13, 2023
6582bc9
Simplify code: use `with_trie_db` in `next_storage_key_from_root`
koute Feb 13, 2023
0569cb8
Remove `expect`s in the benchmarking CLI
koute Feb 13, 2023
0c332c6
Add extra doc comments
koute Feb 13, 2023
4717063
Move `RawIter` before `TrieBackendEssence` (no code changes; just cut…
koute Feb 13, 2023
2516435
Remove a TODO in tests
koute Feb 13, 2023
fa585d9
Update comment for `StorageIterator::was_complete`
koute Feb 13, 2023
17b1356
Merge remote-tracking branch 'origin/master' into master_rework_stora…
koute Feb 14, 2023
fdb6b63
Update `trie-db` to 0.25.1
koute Feb 14, 2023
fedc9f3
Merge remote-tracking branch 'origin/master' into master_rework_stora…
koute Feb 21, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 15 additions & 11 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

186 changes: 126 additions & 60 deletions client/api/src/backend.rs
Original file line number Diff line number Diff line change
Expand Up @@ -31,14 +31,13 @@ use sp_runtime::{
Justification, Justifications, StateVersion, Storage,
};
use sp_state_machine::{
backend::AsTrieBackend, ChildStorageCollection, IndexOperation, OffchainChangesCollection,
StorageCollection,
backend::AsTrieBackend, ChildStorageCollection, IndexOperation, IterArgs,
OffchainChangesCollection, StorageCollection, StorageIterator,
};
use sp_storage::{ChildInfo, StorageData, StorageKey};
use std::collections::{HashMap, HashSet};

pub use sp_state_machine::{Backend as StateBackend, KeyValueStates};
use std::marker::PhantomData;

/// Extracts the state backend type for the given backend.
pub type StateBackendFor<B, Block> = <B as Backend<Block>>::State;
Expand Down Expand Up @@ -303,58 +302,140 @@ pub trait AuxStore {
}

/// An `Iterator` that iterates keys in a given block under a prefix.
pub struct KeyIterator<State, Block> {
pub struct KeysIter<State, Block>
where
State: StateBackend<HashFor<Block>>,
Block: BlockT,
{
inner: <State as StateBackend<HashFor<Block>>>::RawIter,
state: State,
child_storage: Option<ChildInfo>,
prefix: Option<StorageKey>,
current_key: Vec<u8>,
_phantom: PhantomData<Block>,
skip_if_first: Option<StorageKey>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: the iterators on the storage backend and on Client work differently regarding the start_at.

For the backend if a start_at is specified then it will seek to that key and it will always include that key in the iterator's output. For the Client if the key specified in start_at matches exactly the first key it encounters then it skips that key.

This inconsistency is a little confusing, but that's how it works.

I've added some extra tests that verify this behavior, and I've also checked that they pass before my changes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's a point to be extremely careful with, retaining the sometime ~ behavior of iteration calls from the existing host functions (all that is touched by clear_prefix from sp_externalities).
I remember something ~ with the way error are handled, but I may be confusing with something else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I'll probably run a full burn-in before merging this just to make sure nothing's broken.

}

impl<State, Block> KeyIterator<State, Block> {
/// create a KeyIterator instance
pub fn new(state: State, prefix: Option<StorageKey>, current_key: Vec<u8>) -> Self {
Self { state, child_storage: None, prefix, current_key, _phantom: PhantomData }
impl<State, Block> KeysIter<State, Block>
where
State: StateBackend<HashFor<Block>>,
Block: BlockT,
{
/// Create a new iterator over storage keys.
pub fn new(
state: State,
prefix: Option<&StorageKey>,
start_at: Option<&StorageKey>,
) -> Result<Self, State::Error> {
let mut args = IterArgs::default();
args.prefix = prefix.as_ref().map(|prefix| prefix.0.as_slice());
args.start_at = start_at.as_ref().map(|start_at| start_at.0.as_slice());

let start_at = args.start_at;
Ok(Self {
inner: state.raw_iter(args)?,
state,
skip_if_first: start_at.map(|key| StorageKey(key.to_vec())),
})
}

/// Create a `KeyIterator` instance for a child storage.
/// Create a new iterator over a child storage's keys.
pub fn new_child(
state: State,
child_info: ChildInfo,
prefix: Option<StorageKey>,
current_key: Vec<u8>,
) -> Self {
Self { state, child_storage: Some(child_info), prefix, current_key, _phantom: PhantomData }
prefix: Option<&StorageKey>,
start_at: Option<&StorageKey>,
) -> Result<Self, State::Error> {
let mut args = IterArgs::default();
args.prefix = prefix.as_ref().map(|prefix| prefix.0.as_slice());
args.start_at = start_at.as_ref().map(|start_at| start_at.0.as_slice());
args.child_info = Some(child_info);

let start_at = args.start_at;
Ok(Self {
inner: state.raw_iter(args)?,
state,
skip_if_first: start_at.map(|key| StorageKey(key.to_vec())),
})
}
}

impl<State, Block> Iterator for KeyIterator<State, Block>
impl<State, Block> Iterator for KeysIter<State, Block>
where
Block: BlockT,
State: StateBackend<HashFor<Block>>,
{
type Item = StorageKey;

fn next(&mut self) -> Option<Self::Item> {
let next_key = if let Some(child_info) = self.child_storage.as_ref() {
self.state.next_child_storage_key(child_info, &self.current_key)
} else {
self.state.next_storage_key(&self.current_key)
let key = self.inner.next_key(&self.state)?.ok().map(StorageKey)?;

if let Some(skipped_key) = self.skip_if_first.take() {
melekes marked this conversation as resolved.
Show resolved Hide resolved
if key == skipped_key {
return self.next()
}
}
.ok()
.flatten()?;
// this terminates the iterator the first time it fails.
if let Some(ref prefix) = self.prefix {
if !next_key.starts_with(&prefix.0[..]) {
return None

Some(key)
}
}

/// An `Iterator` that iterates keys and values in a given block under a prefix.
pub struct PairsIter<State, Block>
where
State: StateBackend<HashFor<Block>>,
Block: BlockT,
{
inner: <State as StateBackend<HashFor<Block>>>::RawIter,
state: State,
skip_if_first: Option<StorageKey>,
}

impl<State, Block> Iterator for PairsIter<State, Block>
where
Block: BlockT,
State: StateBackend<HashFor<Block>>,
{
type Item = (StorageKey, StorageData);

fn next(&mut self) -> Option<Self::Item> {
let (key, value) = self
.inner
.next_pair(&self.state)?
.ok()
.map(|(key, value)| (StorageKey(key), StorageData(value)))?;

if let Some(skipped_key) = self.skip_if_first.take() {
if key == skipped_key {
return self.next()
}
}
self.current_key = next_key.clone();
Some(StorageKey(next_key))

Some((key, value))
}
}

/// Provides acess to storage primitives
impl<State, Block> PairsIter<State, Block>
where
State: StateBackend<HashFor<Block>>,
Block: BlockT,
{
/// Create a new iterator over storage key and value pairs.
pub fn new(
state: State,
prefix: Option<&StorageKey>,
start_at: Option<&StorageKey>,
) -> Result<Self, State::Error> {
let mut args = IterArgs::default();
args.prefix = prefix.as_ref().map(|prefix| prefix.0.as_slice());
args.start_at = start_at.as_ref().map(|start_at| start_at.0.as_slice());

let start_at = args.start_at;
Ok(Self {
inner: state.raw_iter(args)?,
state,
skip_if_first: start_at.map(|key| StorageKey(key.to_vec())),
})
}
}

/// Provides access to storage primitives
pub trait StorageProvider<Block: BlockT, B: Backend<Block>> {
/// Given a block's `Hash` and a key, return the value under the key in that block.
fn storage(
Expand All @@ -363,36 +444,30 @@ pub trait StorageProvider<Block: BlockT, B: Backend<Block>> {
key: &StorageKey,
) -> sp_blockchain::Result<Option<StorageData>>;

/// Given a block's `Hash` and a key prefix, return the matching storage keys in that block.
fn storage_keys(
&self,
hash: Block::Hash,
key_prefix: &StorageKey,
) -> sp_blockchain::Result<Vec<StorageKey>>;

/// Given a block's `Hash` and a key, return the value under the hash in that block.
fn storage_hash(
&self,
hash: Block::Hash,
key: &StorageKey,
) -> sp_blockchain::Result<Option<Block::Hash>>;

/// Given a block's `Hash` and a key prefix, return the matching child storage keys and values
/// in that block.
fn storage_pairs(
/// Given a block's `Hash` and a key prefix, returns a `KeysIter` iterates matching storage
/// keys in that block.
fn storage_keys(
&self,
hash: Block::Hash,
key_prefix: &StorageKey,
) -> sp_blockchain::Result<Vec<(StorageKey, StorageData)>>;
prefix: Option<&StorageKey>,
start_key: Option<&StorageKey>,
) -> sp_blockchain::Result<KeysIter<B::State, Block>>;

/// Given a block's `Hash` and a key prefix, return a `KeyIterator` iterates matching storage
/// keys in that block.
fn storage_keys_iter(
/// Given a block's `Hash` and a key prefix, returns an iterator over the storage keys and
/// values in that block.
fn storage_pairs(
&self,
hash: Block::Hash,
hash: <Block as BlockT>::Hash,
prefix: Option<&StorageKey>,
start_key: Option<&StorageKey>,
) -> sp_blockchain::Result<KeyIterator<B::State, Block>>;
) -> sp_blockchain::Result<PairsIter<B::State, Block>>;

/// Given a block's `Hash`, a key and a child storage key, return the value under the key in
/// that block.
Expand All @@ -403,24 +478,15 @@ pub trait StorageProvider<Block: BlockT, B: Backend<Block>> {
key: &StorageKey,
) -> sp_blockchain::Result<Option<StorageData>>;

/// Given a block's `Hash`, a key prefix, and a child storage key, return the matching child
/// storage keys.
fn child_storage_keys(
&self,
hash: Block::Hash,
child_info: &ChildInfo,
key_prefix: &StorageKey,
) -> sp_blockchain::Result<Vec<StorageKey>>;

/// Given a block's `Hash` and a key `prefix` and a child storage key,
/// return a `KeyIterator` that iterates matching storage keys in that block.
fn child_storage_keys_iter(
/// returns a `KeysIter` that iterates matching storage keys in that block.
fn child_storage_keys(
&self,
hash: Block::Hash,
child_info: ChildInfo,
prefix: Option<&StorageKey>,
start_key: Option<&StorageKey>,
) -> sp_blockchain::Result<KeyIterator<B::State, Block>>;
) -> sp_blockchain::Result<KeysIter<B::State, Block>>;

/// Given a block's `Hash`, a key and a child storage key, return the hash under the key in that
/// block.
Expand Down
1 change: 1 addition & 0 deletions client/db/Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,7 @@ quickcheck = { version = "1.0.3", default-features = false }
kitchensink-runtime = { path = "../../bin/node/runtime" }
sp-tracing = { version = "6.0.0", path = "../../primitives/tracing" }
substrate-test-runtime-client = { version = "2.0.0", path = "../../test-utils/runtime/client" }
array-bytes = "4.1"

[features]
default = []
Expand Down
Loading