This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Node table limiting and cache for node filter #10288

Merged (15 commits) on Apr 5, 2019

Conversation

VladLupashevskyi
Contributor

After removing the node filter cache (#10121), nodes were still losing their peers; however, in this case the time at which they started to lose them was very dependent on CPU power. When I was testing the fix it worked well for a couple of days on my 4th-gen i7, but when I left it running for a longer time the peer loss started to occur again.

I have done some research and found out that the problem was in this line:
https://github.com/paritytech/parity-ethereum/blob/06cae8a53556dede67e163325657bee3fba8feda/util/network-devp2p/src/host.rs#L598

where the contract state was being queried for every node. The interesting thing I found, though, is that the node table was not limited in size and was always growing. After 4 hours of running there were already 10k entries, so the state was being queried 10k times each second. In this case introducing a cache alone is not really a solution, because once we hit its limit the querying starts again.

At first I thought I had found a typo in the update_table fn, where a ; was missing after the for loop:
https://github.com/paritytech/parity-ethereum/blob/06cae8a53556dede67e163325657bee3fba8feda/util/network-devp2p/src/node_table.rs#L349
But, due to my lack of experience in Rust, adding the ; was not helpful.

So I decided to introduce node table limiting, which allows caching to be used for the node filter but also guarantees more predictable behaviour for nodes that don't use it, so that a large table won't cause performance issues when a node runs for a very long time.

The limiting works as follows: when adding a new node would exceed the limit, the least important node gets removed (either the failed node with the most recent last_connected state, or a node with an unknown last_connected state, or the node with the oldest successful last_connected state).
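To make that eviction order concrete, here is a minimal sketch with simplified, made-up types; the LastContact enum and the pick_eviction_candidate helper below are illustrative only, not the PR's actual NodeTable code:

```rust
use std::time::{Duration, SystemTime};

#[derive(Clone, Copy, Debug, PartialEq)]
enum LastContact {
    // Failed contacts are evicted first, most recent failure first.
    Failure(SystemTime),
    // Then entries we know nothing about.
    Unknown,
    // Successful contacts are kept longest; the oldest success goes last.
    Success(SystemTime),
}

/// Returns the index of the entry to evict once the table is full:
/// 1. the failure with the most recent timestamp,
/// 2. otherwise any unknown entry,
/// 3. otherwise the success with the oldest timestamp.
fn pick_eviction_candidate(entries: &[LastContact]) -> Option<usize> {
    let most_recent_failure = entries
        .iter()
        .enumerate()
        .filter_map(|(i, c)| match c {
            LastContact::Failure(t) => Some((i, *t)),
            _ => None,
        })
        .max_by_key(|&(_, t)| t)
        .map(|(i, _)| i);
    if most_recent_failure.is_some() {
        return most_recent_failure;
    }

    let unknown = entries.iter().position(|c| *c == LastContact::Unknown);
    if unknown.is_some() {
        return unknown;
    }

    entries
        .iter()
        .enumerate()
        .filter_map(|(i, c)| match c {
            LastContact::Success(t) => Some((i, *t)),
            _ => None,
        })
        .min_by_key(|&(_, t)| t)
        .map(|(i, _)| i)
}

fn main() {
    let now = SystemTime::now();
    let table = vec![
        LastContact::Success(now),
        LastContact::Failure(now - Duration::from_secs(60)),
        LastContact::Unknown,
        LastContact::Failure(now),
    ];
    // The most recent failure (index 3) is the first eviction candidate.
    assert_eq!(pick_eviction_candidate(&table), Some(3));
}
```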

I introduced a simple cache for the node filter based on RwLock, which is reset on each new block. It's not a perfect solution, but I would say it's more generic, and it's especially useful for chains where the filter gets dynamically updated. As a further improvement, I could see introducing a new version of the node permission contract that emits a NodeTableUpdated event when the list of allowed nodes changes, and clearing the cache only in that case; for the current contract version, we could check each block for transactions to the node permission contract and clear the cache only if any are present (I'm not sure how that would affect performance, but I believe it's better than clearing the cache on every block).
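A minimal sketch of that caching idea, assuming hypothetical names (NodeId, check_contract, and on_new_block are stand-ins for illustration, not the PR's actual node-filter API):

```rust
use std::collections::HashMap;
use std::sync::RwLock;

type NodeId = [u8; 64];

struct PermissionCache {
    allowed: RwLock<HashMap<NodeId, bool>>,
}

impl PermissionCache {
    fn new() -> Self {
        PermissionCache { allowed: RwLock::new(HashMap::new()) }
    }

    /// Fast path: many reader threads can check a cached verdict concurrently.
    fn is_allowed(&self, id: &NodeId, check_contract: impl Fn(&NodeId) -> bool) -> bool {
        if let Some(&cached) = self.allowed.read().expect("lock not poisoned").get(id) {
            return cached;
        }
        // Slow path: query the permission contract once and memoize the result.
        let verdict = check_contract(id);
        self.allowed
            .write()
            .expect("lock not poisoned")
            .insert(*id, verdict);
        verdict
    }

    /// Called from the chain notification handler: a new block may have
    /// changed the on-chain allow list, so drop everything.
    fn on_new_block(&self) {
        self.allowed.write().expect("lock not poisoned").clear();
    }
}

fn main() {
    let cache = PermissionCache::new();
    let peer: NodeId = [0u8; 64];
    // First call hits the "contract", second call is served from the cache.
    assert!(cache.is_allowed(&peer, |_| true));
    assert!(cache.is_allowed(&peer, |_| false));
    cache.on_new_block();
}
```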

I have done some checks: on the current 2.3.0-beta, with the node filter contract enabled, CPU load starts at 30% and keeps growing. With this update, the CPU load stays at a constant 15-30% on an AuRa chain with a 4-second block time. With the contract disabled, the load is 5-10%.

Looking forward to any comments or suggestions on these changes :)

@parity-cla-bot

It looks like @VladLupashevskyi signed our Contributor License Agreement. 👍

Many thanks,

Parity Technologies CLA Bot


@5chdn 5chdn added this to the 2.4 milestone Feb 7, 2019
@5chdn 5chdn added A0-pleasereview 🤓 Pull request needs code review. M4-core ⛓ Core client code / Rust. M2-config 📂 Chain specifications and node configurations. labels Feb 7, 2019
Contributor

@HCastano HCastano left a comment


I took a quick peek through the code and it looks like there are a few .unwrap() calls throughout. As a general rule we try to avoid panics, so you're either going to need to:

a) Remove the call to .unwrap() and handle the failure case more gracefully, or
b) Write a little proof stating why this will not panic, and include that within an .expect() call.
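For illustration only (the queue and surrounding logic below are made up, not code from this PR), the two options look roughly like this:

```rust
use std::collections::VecDeque;

// Option a) propagate the absence instead of panicking.
fn next_item(queue: &mut VecDeque<u32>) -> Option<u32> {
    queue.pop_front()
}

// Option b) keep the potential panic, but document why it cannot happen.
fn first_item(queue: &VecDeque<u32>) -> u32 {
    *queue
        .front()
        .expect("caller checks the queue is non-empty before calling; qed")
}

fn main() {
    let mut queue: VecDeque<u32> = VecDeque::from(vec![1, 2, 3]);
    assert_eq!(first_item(&queue), 1);
    assert_eq!(next_item(&mut queue), Some(1));
}
```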

Contributor

@ascjones ascjones left a comment


Thanks for the PR. Just realised I had left this review in progress, but I have similar comments to @HCastano regarding the unwraps.

Review threads on util/network-devp2p/src/node_table.rs and ethcore/node-filter/src/lib.rs.

use_contract!(peer_set, "res/peer_set.json");

/// Connection filter that uses a contract to manage permissions.
pub struct NodeFilter {
client: Weak<BlockChainClient>,
contract_address: Address,
cache: RwLock<Cache>
Collaborator


If we understand the real reason for the CPU overload (the node table not being limited), can we just roll back #10143 and return the previous LruCache that was used here? I frankly don't see the advantages of the proposed solution in comparison with the previous one.

Contributor Author


The idea of implementing it via RwLock was to make it possible to read the cache from different threads without blocking each other.
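Roughly the distinction being made, as a toy example (none of this is code from the PR): concurrent readers of an RwLock don't block each other, whereas a Mutex would serialize every lookup.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    let cache = Arc::new(RwLock::new(HashMap::from([(1u64, true), (2, false)])));

    let readers: Vec<_> = (0..4)
        .map(|_| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || {
                // Read guards do not exclude each other, so all four threads
                // can inspect the cache at the same time.
                let guard = cache.read().expect("lock not poisoned");
                guard.get(&1).copied().unwrap_or(false)
            })
        })
        .collect();

    for handle in readers {
        let _ = handle.join();
    }
}
```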

Collaborator


Ok. Then maybe just change the previous Mutex to an RwLock? Because the order field in Cache is just an implementation of LRU.

Contributor Author

@VladLupashevskyi VladLupashevskyi Feb 12, 2019


Got it! I think that makes sense. Sorry, I didn't notice that; I was thinking that LruCache already came with a built-in Mutex 😅

Contributor Author


Actually I cannot really use LruCache, since it does not provide a non-mutable get method, so it's not possible to use an RwLock with it.
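The underlying issue, sketched here with a hand-rolled LRU rather than any particular crate: even a pure lookup has to bump the entry's recency, so it needs &mut self, and behind an RwLock that means taking the write lock for every read.

```rust
use std::collections::HashMap;

struct TinyLru<K: std::hash::Hash + Eq + Clone, V> {
    map: HashMap<K, V>,
    // Most recently used key sits at the back.
    order: Vec<K>,
}

impl<K: std::hash::Hash + Eq + Clone, V> TinyLru<K, V> {
    fn get(&mut self, key: &K) -> Option<&V> {
        if self.map.contains_key(key) {
            // The side effect that forces &mut self: move the key to the back.
            self.order.retain(|k| k != key);
            self.order.push(key.clone());
            self.map.get(key)
        } else {
            None
        }
    }
}

fn main() {
    let mut lru = TinyLru { map: HashMap::new(), order: Vec::new() };
    lru.map.insert("a", 1);
    lru.order.push("a");
    assert_eq!(lru.get(&"a"), Some(&1));
}
```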

@grbIzl
Collaborator

grbIzl commented Feb 12, 2019

@VladLupashevskyi In this part of code: https://github.com/paritytech/parity-ethereum/blob/d89b8d904ff28382d3692290a936706304627255/util/network-devp2p/src/host.rs#L595 we create an iterator over the node table and take some number of nodes from it (~30 nodes). So the only scenario where such performance degradation is possible is when you have a lot of nodes forbidden by your permission contract and the iterator goes over them looking for allowed ones. I think that will be true for whitelisting permission contracts. For blacklisting contracts this code should return the required number of nodes more quickly.
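A stripped-down illustration of that point (plain integers instead of real node IDs, and a counter instead of the contract call): take(30) stops after 30 accepted nodes, but the filter predicate still runs for every rejected candidate it has to skip.

```rust
fn main() {
    let node_table: Vec<u32> = (0..10_000).collect();
    let mut contract_queries = 0;

    let connected: Vec<u32> = node_table
        .iter()
        .filter(|id| {
            contract_queries += 1;
            // Pretend only a handful of nodes are whitelisted.
            **id % 1_000 == 0
        })
        .take(30)
        .copied()
        .collect();

    // With a mostly-forbidden table this prints far more queries than accepted nodes.
    println!("accepted {} nodes, queried {} times", connected.len(), contract_queries);
}
```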

@VladLupashevskyi
Contributor Author

@VladLupashevskyi In this part of code (parity-ethereum/util/network-devp2p/src/host.rs, line 595 in d89b8d9: for id in nodes.filter(|id| ...) we create an iterator over the node table and take some number of nodes from it (~30 nodes). So the only scenario where such performance degradation is possible is when you have a lot of nodes forbidden by your permission contract and the iterator goes over them looking for allowed ones. I think that will be true for whitelisting permission contracts. For blacklisting contracts this code should return the required number of nodes more quickly.

Yes, you are absolutely right - I experienced this behaviour when I was using whitelist permissioning in the contract. But shouldn't this contract have a generic design, allowing either whitelisting or blacklisting depending on the design of the blockchain? I believe the first is crucial in permissioned consortium chains.

@grbIzl
Collaborator

grbIzl commented Feb 12, 2019

But shouldn't this contract have a generic design, allowing either whitelisting or blacklisting depending on the design of the blockchain? I believe the first is crucial in permissioned consortium chains.

I'm not arguing, it was just an analysis to help better understand the issue. We indeed must support any type of permissioning contract.

@VladLupashevskyi
Contributor Author

Here I fixed some bugs in get_index_to_insert and added tests for it. I think the most appropriate place to test this function is the table_last_contact_order test, so in order to prevent it from failing on macOS due to lost nanosecond precision I had to revert the fix 5a1dc3e that was done by @debris (the test becomes more complicated there) and use sleep(Duration::from_micros(1)) instead as a simple fix. I think it shouldn't affect the test execution time, however the code becomes slightly bulky.
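The gist of the sleep(Duration::from_micros(1)) workaround, in isolation (this is not the actual node_table test): if the platform drops sub-microsecond precision, two contacts recorded back-to-back can end up with equal timestamps, so waiting at least one microsecond between them keeps their order strict on platforms with microsecond clock resolution.

```rust
use std::thread::sleep;
use std::time::{Duration, SystemTime};

fn main() {
    let first = SystemTime::now();
    // Guarantee the next timestamp is strictly later even after the
    // platform truncates nanoseconds.
    sleep(Duration::from_micros(1));
    let second = SystemTime::now();

    assert!(second > first, "timestamps must have a strict order for the test");
}
```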

@VladLupashevskyi
Contributor Author

I also removed the get_mut method from NodeTable and introduced get instead, to prevent accidental modification of the nodes HashMap.
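The shape of that change, on a hypothetical table type (not the real NodeTable): exposing only an immutable accessor keeps callers from mutating stored entries by accident.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct Node {
    failures: u32,
}

struct Table {
    nodes: HashMap<u64, Node>,
}

impl Table {
    // Before: a get_mut handing out &mut Node let any caller change table state.
    // After: read-only access; mutation goes through dedicated table methods.
    fn get(&self, id: &u64) -> Option<&Node> {
        self.nodes.get(id)
    }
}

fn main() {
    let mut nodes = HashMap::new();
    nodes.insert(7u64, Node { failures: 0 });
    let table = Table { nodes };
    assert_eq!(table.get(&7), Some(&Node { failures: 0 }));
    assert!(table.get(&8).is_none());
}
```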

@5chdn 5chdn modified the milestones: 2.4, 2.5, 2.6 Feb 21, 2019
@soc1c soc1c requested a review from ascjones February 22, 2019 14:35
@VladLupashevskyi
Contributor Author

Could somebody review this? We are looking forward to this PR being reviewed and merged if possible :)

Collaborator

@grbIzl grbIzl left a comment


I'm OK with these changes, but we need more eyes on the new node_table logic. @ngotchac can you help with one more review?

@grbIzl grbIzl requested a review from ngotchac March 28, 2019 10:17
@soc1c soc1c modified the milestones: 2.5, 2.6 Apr 2, 2019
@soc1c soc1c added A8-looksgood 🦄 Pull request is reviewed well. and removed A0-pleasereview 🤓 Pull request needs code review. labels Apr 3, 2019
@soc1c soc1c merged commit 8132d38 into openethereum:master Apr 5, 2019
ordian added a commit that referenced this pull request Apr 15, 2019
* master:
  CI improvements (#10579)
  Watch transactions pool (#10558)
  fix(evmbin): make benches compile again (#10586)
  fix issue with compilation when 'slow-blocks' feature enabled (#10585)
  Reject crazy timestamps instead of truncating. (#10574)
  Node table limiting and cache for node filter (#10288)