This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Add storage cache for child trie and notification internals #2639

Merged
merged 27 commits on Jun 14, 2019

Conversation

cheme (Contributor) commented May 21, 2019

This PR adds child trie (CT) key-value storage to the cache (see the changes to the into_committed method).

  • Notification part

From this point, this PR also plugs child trie values into the notification mechanism:

  • CT storage changes use the same subscription as the top trie
  • a wildcard on the top trie does not imply a wildcard on CTs
  • RPC remains unchanged at this point (non-breaking PR; another one will be needed to change the storage subscription)

@tomusdrw @jacogr and all, those choices may not be entirely adequate (if we want a separate API for subscribing to CT values, part of this PR is wrong); can we consider the first two points OK?

  • Storage cache part

Quite straightforward, but it keeps pointing to the fact that the CT API may need to be merged into the standard API: there is a lot of redundancy, as in all CT code.
One bad point of the PR is the use of (Option<Vec<u8>>, Vec<u8>) as the key of the LRU cache (we cannot use two caches as usual, since a single LRU seems better).
This leads to very awkward Vec instantiation on query (there is no suitable Borrow implementation for the tuple).
I think (though this depends on whether we want to unify the CT API into the parent API) that a StoragePath struct with an encoded offset could do the job (depending on the desired Ord property, the offset encoding position would change).
StoragePath would replace the pair of child storage key and child key with an expanded path plus the position of the child switch within the path. This seems like an adequate representation (a CT is quite similar to a standard trie branch, but with a hop); a rough sketch follows.
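A minimal sketch of the idea, with hypothetical names and layout (this is not what the PR currently implements):

```rust
/// Hypothetical expanded storage path: the child storage key (if any)
/// followed by the key inside that trie, kept in one buffer, with the
/// position of the child switch recorded as an offset.
#[derive(Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct StoragePath {
    path: Vec<u8>,
    /// Byte offset of the hop into the child trie; `None` for top-trie keys.
    child_switch: Option<usize>,
}

impl StoragePath {
    /// Key in the top trie.
    pub fn top(key: &[u8]) -> Self {
        StoragePath { path: key.to_vec(), child_switch: None }
    }

    /// Key inside the child trie identified by `child_storage_key`.
    pub fn child(child_storage_key: &[u8], key: &[u8]) -> Self {
        let mut path = Vec::with_capacity(child_storage_key.len() + key.len());
        path.extend_from_slice(child_storage_key);
        path.extend_from_slice(key);
        StoragePath { path, child_switch: Some(child_storage_key.len()) }
    }
}
```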

@cheme cheme added the Z7-question Issue is a question. Closer should answer. label May 21, 2019
fn update_storage(&mut self, update: Vec<(Vec<u8>, Option<Vec<u8>>)>) -> Result<(), client::error::Error> {
fn update_storage(
&mut self,
update: Vec<(Vec<u8>,Option<Vec<u8>>)>,
Contributor:

Do you think it's worth introducing some typedefs for this to improve readability?
Right now the type(s) seem pretty complicated (a lot of Vec and u8 mixed together :))

@@ -27,7 +27,7 @@ use log::trace;

const STATE_CACHE_BLOCKS: usize = 12;

type StorageKey = Vec<u8>;
type StorageKey = (Option<Vec<u8>>, Vec<u8>);
Contributor:

Might be worth introducing a proper struct for this? We could have StorageKey::root(key) or StorageKey::child(child_key, key) for instantiation
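Something along these lines, for instance (a hedged sketch of the suggestion, not code from the PR):

```rust
/// Sketch of a dedicated key type wrapping the (Option<Vec<u8>>, Vec<u8>) tuple.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct StorageKey {
    /// Child storage key, or `None` for the top trie.
    child_key: Option<Vec<u8>>,
    /// Key inside the (child or top) trie.
    key: Vec<u8>,
}

impl StorageKey {
    pub fn root(key: Vec<u8>) -> Self {
        StorageKey { child_key: None, key }
    }

    pub fn child(child_key: Vec<u8>, key: Vec<u8>) -> Self {
        StorageKey { child_key: Some(child_key), key }
    }
}
```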

let childs = child_changes
.into_iter()
.map(|(k,i)|(Some(k),i))
.chain(::std::iter::once((None, changes)));
Contributor:

:: shouldn't be required in edition=2018

Suggested change
.chain(::std::iter::once((None, changes)));
.chain(std::iter::once((None, changes)));

childs.for_each(|(sk, changes)|
for (k, v) in changes.into_iter() {
let k = (sk.clone(), k);
if is_best {
Contributor:

You can avoid the else clause and the duplication, without additional clones, like this:

if is_best {
  cache.hashes.remove(&k);
  CachingState::<H, S, B>::storage_insert(cache, k.clone(), v);
}
modifications.insert(k);

.chain(::std::iter::once((None, changes)));
childs.for_each(|(sk, changes)|
for (k, v) in changes.into_iter() {
let k = (sk.clone(), k);
Contributor:

Is there a way to prevent sk.clone() here? Could we use Cow for the child_key?

Contributor Author:

This would need the tuple to implement Borrow, which is probably not correct. It relates to the idea of having a proper StorageKey type, and making that type implement Borrow without instantiation (keeping an internal Vec with an offset seems like the easiest way to me).
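For illustration, one hedged way a single-buffer key could make Borrow-based lookups work without rebuilding an owned tuple (hypothetical, not in this PR): the offset is encoded into the buffer itself, so Eq/Hash of the owned key and of a borrowed slice built the same way agree.

```rust
use std::borrow::Borrow;

/// Hypothetical cache key flattened into one canonical byte string:
/// a length prefix for the child storage key, then the child storage key,
/// then the storage key. Lookups can then be done from a borrowed `[u8]`
/// (e.g. a reusable scratch buffer) instead of cloning into a fresh tuple.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct CacheKey(Vec<u8>);

impl CacheKey {
    pub fn new(child_key: Option<&[u8]>, key: &[u8]) -> Self {
        let child_key = child_key.unwrap_or(&[]);
        let mut buf = Vec::with_capacity(4 + child_key.len() + key.len());
        // The length prefix keeps distinct (child_key, key) pairs from
        // flattening to identical bytes.
        buf.extend_from_slice(&(child_key.len() as u32).to_le_bytes());
        buf.extend_from_slice(child_key);
        buf.extend_from_slice(key);
        CacheKey(buf)
    }
}

impl Borrow<[u8]> for CacheKey {
    fn borrow(&self) -> &[u8] {
        &self.0
    }
}
```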

Contributor Author:

OK, my previous comment probably does not make much sense; HStorageKey is interesting, but the problem here is not that complicated.
The only simple, non-API-breaking way of solving it that I see is to put child entries in their own map storage. As said before, it breaks the LRU expectation.
@tomusdrw I did notice that there are already two LRU maps; is that intended, or would a single LRU for hashes and decoded values be a better choice?
So the way to fix this could be to use the internal LRU storage directly (some double-linked hash map) and manage the LRU size limit globally for CachingState (and of course move the child cache to its own two-level map). This LRU size management does not look too complicated, and it would allow a single LRU size.

Contributor:

I'm not familiar with how the caches are implemented, but it seems that child storage is currently used mostly for contracts. This means that having two separate LRUs might make sense, because we switch between the runtime context (which has its own top-level cache) and the contract context (which has its own child cache). So after finishing the contract context we would still have all the necessary top-level items in the cache, no matter what the contract did. I'm not sure that's something we are going to keep, though; if we do, it might require a different solution.

Contributor Author:

There is also the case of contracts talking to each other, but I think you spotted it: having separate management could make sense (the top level could be seen as more useful, and the child contract LRU could use a different size), plus it is more direct to implement 👍

Contributor Author:

OK, I am not awake 🤦‍♂️: there is already the kind of merged LRU logic that I describe (we just get some overhead by using the LRU instead of its inner linked hash map struct), so I just need to keep using it that way.

pub fn new_shared_cache<B: Block, H: Hasher>(shared_cache_size: usize) -> SharedCache<B, H> {
and
if let Some(v_) = &v {

self.next_id += 1;
let next_id = self.next_id;
Contributor:

I'd rather call it current_id

.entry(c_key.clone())
.or_insert_with(Default::default);

(c_key.clone(), if let Some(keys) = o_keys {
Contributor:

The code feels complicated here; I'd try to extract it similarly to what is done in notify, so that:

add_listeners(filter_keys, &mut self.wildcard_listeners, &mut self.listeners);

can be re-used for top-level keys and child keys
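For example, something with roughly this shape (a sketch; the actual field types in notifications.rs may differ):

```rust
use std::collections::{HashMap, HashSet};

type SubscriberId = u64;
type Key = Vec<u8>;

/// Sketch of the proposed helper: register one subscriber either as a
/// wildcard listener (no filter) or under each filtered key, so the same
/// routine serves both top-level and child-trie subscriptions.
fn add_listeners(
    subscriber: SubscriberId,
    filter_keys: Option<&[Key]>,
    wildcard_listeners: &mut HashSet<SubscriberId>,
    listeners: &mut HashMap<Key, HashSet<SubscriberId>>,
) {
    match filter_keys {
        None => {
            wildcard_listeners.insert(subscriber);
        }
        Some(keys) => {
            for key in keys {
                listeners.entry(key.clone()).or_default().insert(subscriber);
            }
        }
    }
}
```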

let child_filters = Some([
(StorageKey(vec![4]), None),
(StorageKey(vec![5]), None),
].into_iter().cloned().collect());
Contributor:

Why into_iter().cloned().collect()? Can't you just do vec![...] directly, or is it a HashSet?

Contributor Author:

Yes, it's a HashMap.

cheme (Contributor Author) commented May 22, 2019

I've been trying a bit to use a single key for child tries (see HStorageKey in the latest commit), but it seems to be counterproductive: it is quite orthogonal to #2209, and trying to generalize its usage is pushing things way too far.
I think I will drop this approach and create a PR as @tomusdrw suggested.
Maybe if #2209 changes to a full key-path approach, HStorageKey could be resurrected (or for higher-level operations where cycles need to be forbidden; but the Ord property might not be interesting, and a standard encoding may be more straightforward).
Edit: removed; HStorage ordering preservation could maybe make sense for #2622.

cheme (Contributor Author) commented May 24, 2019

In the latest commit:

  • fix the LRU caches to use memory estimation for all LRUs (removing the very small overhead of LruCache)
  • change the LRU caches to respect the configured memory limit
  • change RPC calls to use the hashes cache
  • apply a ratio over the different LRU caches

There is still a problem with those LRU caches: 4 LRU lists share one size limit, and because the removal strategy is local (last use within the LRU list we insert into), one LRU can deplete another; basically, whoever uses the memory first keeps it.
I am thinking of using 4 counters with a ratio of the given memory (a child ratio and a hash ratio need to be defined and combined: the hash ratio applied first, then the child ratio on the result); see the sketch below.
This change would also allow a cleaner LRU struct than the current tools method.
Any opinion on default values for them? I would go with hash 10% and child 2% (assuming there is no child use; if you do use it, you really want to change that value to 90% or so).
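To make the combination concrete, here is a small sketch of how I read the proposal (hash ratio applied first, child ratio on the result); the helper and the numbers are only illustrative:

```rust
/// Hypothetical split of one total budget into the four LRU budgets:
/// (top values, child values, top hashes, child hashes).
fn split_budget(total: usize, hash_ratio: f64, child_ratio: f64) -> [usize; 4] {
    // Hash ratio applied first over the whole budget...
    let hashes = (total as f64 * hash_ratio) as usize;
    let values = total - hashes;
    // ...then the child ratio over each of the two results.
    let child_hashes = (hashes as f64 * child_ratio) as usize;
    let child_values = (values as f64 * child_ratio) as usize;
    [values - child_values, child_values, hashes - child_hashes, child_hashes]
}

// Example with the defaults floated above (hash 10%, child 2%):
// split_budget(32 * 1024 * 1024, 0.10, 0.02)
```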

@cheme cheme added A0-please_review Pull request needs code review. and removed Z7-question Issue is a question. Closer should answer. labels May 24, 2019
gavofyork (Member) commented:

@cheme who can review this effectively? @arkpar?

cheme (Contributor Author) commented Jun 4, 2019

The PR just copies what is done on general storage for child storage, but there is a bit of a gray area around the actual use case of the shared cache (its difference from the local cache).
So it would be best if the original code author could take a peek, cc @arkpar (I would also welcome a better design idea than the ratio I put in place).

arkpar (Member) commented Jun 7, 2019

Hash queries are only used for CODE and maybe a couple of other keys at the moment, so the cache for hashes does not really grow. I'd put a fixed limit on hashes for now, maybe 64 kilobytes. As for the main/child ratio, let's make it 50/50 but add an additional configuration option that enables/disables the child trie cache.
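For reference, a hedged sketch of that sizing scheme (the names and the configuration flag are hypothetical):

```rust
/// Hypothetical fixed budget for the hashes cache.
const FIXED_HASHES_CACHE_SIZE: usize = 64 * 1024; // 64 KiB

/// Split the remaining budget 50/50 between main and child value caches,
/// unless child-trie caching is disabled, in which case the main cache
/// gets everything.
fn value_budgets(total: usize, cache_child_trie: bool) -> (usize, usize) {
    let remaining = total.saturating_sub(FIXED_HASHES_CACHE_SIZE);
    if cache_child_trie {
        (remaining / 2, remaining - remaining / 2)
    } else {
        (remaining, 0)
    }
}
```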

arkpar (Member) commented Jun 7, 2019

Also, I would not bother taking any overhead into the size calculations, such as the internal overhead of linked hash map items or vector capacity. These depend heavily on the internals of external libraries and the allocator, which might change with a new version. We'd just say that the configured size is the data size and not the consumed memory size, which might be 20% or so higher on average.

cheme added 2 commits June 7, 2019 15:46
Removal of child-trie-hash lru.
Fix lru storage value for hashes lru.
cheme (Contributor Author) commented Jun 7, 2019

So FWIU, usage of the hashes LRU is pretty limited, so I switched to the proposed fixed size of 64 KB and removed the hashes LRU ratio configuration.

Similarly, lru_child_hashes probably does not make much sense, so I removed it too.

Also removed the fixed per-element overhead. I wanted it to account for the use case of spammed small key-values, but this use case is partially limited by the key length, which needs to grow in order to grow in value.
If I want to put it back, it probably makes more sense to apply it directly when reading the LRU size, and simply add number_of_elements * fixed_cost; a small sketch of that idea is below.
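A tiny sketch of that alternative, with a made-up constant:

```rust
/// Hypothetical per-entry overhead charged only when the size is read,
/// instead of being folded into every insertion.
const PER_ELEMENT_OVERHEAD: usize = 32; // illustrative value only

fn effective_size(data_size: usize, entries: usize) -> usize {
    data_size + entries * PER_ELEMENT_OVERHEAD
}
```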

+ self.lru_child_storage.used_size()
+ self.lru_child_hashes.used_size()
// ignero small hashes storage + self.lru_hashes.used_size()
Member:

typo

@arkpar arkpar added A7-looksgoodcantmerge and removed A0-please_review Pull request needs code review. labels Jun 14, 2019
@gavofyork gavofyork merged commit 879e4a8 into paritytech:master Jun 14, 2019
folsen pushed a commit that referenced this pull request Jun 18, 2019
* child cache, and test failing notifications

* fix tests and no listen child on top wildcard

* remove useless method

* bump impl version

* Update core/client/src/notifications.rs

Co-Authored-By: Tomasz Drwięga <[email protected]>

* Update core/client/src/notifications.rs

Co-Authored-By: Tomasz Drwięga <[email protected]>

* Update core/client/src/notifications.rs

Co-Authored-By: Tomasz Drwięga <[email protected]>

* Update core/client/src/notifications.rs

Co-Authored-By: Tomasz Drwięga <[email protected]>

* factoring notification methods to remove some redundant code.

* test child sub removal

* HStorage implementation and some type alias.

* Remove HStorage cache: does not fit

* fix removal

* Make cache use byte length (shared) instead of number of kv

* Make use of hashes cache in rpc

* applying ratio on different lru caches

* Fix format

* break a line

* Remove per element overhead of lru cache.

* typo
MTDK1 pushed a commit to bdevux/substrate that referenced this pull request Jul 10, 2019