
Parallel hashing #20488

Merged · 3 commits merged into ethereum:master on Feb 4, 2020
Conversation

@holiman (Contributor) commented Dec 28, 2019

This is a follow-up to #20481, which implements:

  • counting updates to the trie,
  • parallelizing the hashing operation when the number of changes reaches 100 items (a sketch of the idea follows below).
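As a rough illustration of the second point, here is a minimal, self-contained sketch of threshold-gated parallel hashing (toy code: `parallelThreshold`, `hashAll`, and the 16-way striding are illustrative assumptions, not the geth implementation):

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// parallelThreshold mirrors the PR's rule of thumb: below ~100 changes,
// goroutine scheduling overhead outweighs the parallel speedup.
const parallelThreshold = 100

// hashAll hashes a batch of dirty items, fanning out to 16 goroutines
// only when the batch is large enough to amortize the overhead.
func hashAll(items [][]byte) [][32]byte {
	out := make([][32]byte, len(items))
	if len(items) < parallelThreshold {
		for i, it := range items {
			out[i] = sha256.Sum256(it)
		}
		return out
	}
	var wg sync.WaitGroup
	wg.Add(16)
	for w := 0; w < 16; w++ {
		go func(w int) {
			defer wg.Done()
			for i := w; i < len(items); i += 16 { // stride over the batch
				out[i] = sha256.Sum256(items[i])
			}
		}(w)
	}
	wg.Wait()
	return out
}

func main() {
	items := make([][]byte, 1000)
	for i := range items {
		items[i] = []byte(fmt.Sprintf("leaf-%d", i))
	}
	fmt.Printf("first digest: %x\n", hashAll(items)[0])
}
```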

The benchmarks below are against #20481:

[user@work trie]$ benchstat serial_hash.txt paralell_hash.txt 
name                  old time/op    new time/op    delta
Hash-6                  6.56µs ±16%    4.34µs ± 3%  -33.79%  (p=0.016 n=5+4)
HashFixedSize/10-6      63.1µs ± 1%    63.6µs ± 3%     ~     (p=0.310 n=5+5)
HashFixedSize/100-6      295µs ± 1%     257µs ± 1%  -12.74%  (p=0.008 n=5+5)
HashFixedSize/1K-6      3.41ms ± 9%    2.43ms ± 8%  -28.93%  (p=0.008 n=5+5)
HashFixedSize/10K-6     39.3ms ± 7%    28.1ms ± 5%  -28.42%  (p=0.008 n=5+5)
HashFixedSize/100K-6     449ms ±15%     295ms ± 9%  -34.25%  (p=0.008 n=5+5)

name                  old alloc/op   new alloc/op   delta
Hash-6                    911B ± 3%      914B ± 0%     ~     (p=0.135 n=5+5)
HashFixedSize/10-6      12.5kB ± 0%    12.6kB ± 0%   +0.85%  (p=0.008 n=5+5)
HashFixedSize/100-6     58.6kB ± 0%    59.3kB ± 0%   +1.04%  (p=0.008 n=5+5)
HashFixedSize/1K-6       631kB ± 0%     639kB ± 0%   +1.20%  (p=0.008 n=5+5)
HashFixedSize/10K-6     6.57MB ± 0%    6.65MB ± 0%   +1.16%  (p=0.008 n=5+5)
HashFixedSize/100K-6    64.9MB ± 0%    65.5MB ± 0%   +0.95%  (p=0.008 n=5+5)

name                  old allocs/op  new allocs/op  delta
Hash-6                    11.0 ± 0%      12.0 ± 0%   +9.09%  (p=0.008 n=5+5)
HashFixedSize/10-6         162 ± 0%       176 ± 0%   +8.64%  (p=0.008 n=5+5)
HashFixedSize/100-6        786 ± 0%       850 ± 0%   +8.14%  (p=0.008 n=5+5)
HashFixedSize/1K-6       8.15k ± 0%     8.87k ± 0%   +8.81%  (p=0.008 n=5+5)
HashFixedSize/10K-6      83.4k ± 0%     91.1k ± 0%   +9.17%  (p=0.008 n=5+5)
HashFixedSize/100K-6      830k ± 0%      905k ± 0%   +8.96%  (p=0.008 n=5+5)

Here are the benchmarks against master:

[user@work trie]$ benchstat hashbefore.txt paralell_hash.txt 
name                  old time/op    new time/op    delta
Hash-6                  7.08µs ±12%    4.34µs ± 3%  -38.69%  (p=0.002 n=10+4)
HashFixedSize/10-6      70.0µs ± 2%    63.6µs ± 3%   -9.19%  (p=0.001 n=10+5)
HashFixedSize/100-6      328µs ± 3%     257µs ± 1%  -21.56%  (p=0.001 n=10+5)
HashFixedSize/1K-6      3.67ms ± 3%    2.43ms ± 8%  -33.81%  (p=0.001 n=9+5)
HashFixedSize/10K-6     43.3ms ± 2%    28.1ms ± 5%  -35.09%  (p=0.001 n=9+5)
HashFixedSize/100K-6     525ms ±12%     295ms ± 9%  -43.76%  (p=0.001 n=10+5)

name                  old alloc/op   new alloc/op   delta
Hash-6                    989B ± 1%      914B ± 0%   -7.54%  (p=0.001 n=10+5)
HashFixedSize/10-6      13.8kB ± 0%    12.6kB ± 0%   -8.50%  (p=0.001 n=10+5)
HashFixedSize/100-6     65.0kB ± 0%    59.3kB ± 0%   -8.90%  (p=0.001 n=10+5)
HashFixedSize/1K-6       695kB ± 0%     639kB ± 0%   -8.14%  (p=0.001 n=10+5)
HashFixedSize/10K-6     7.21MB ± 0%    6.65MB ± 0%   -7.81%  (p=0.001 n=10+5)
HashFixedSize/100K-6    71.3MB ± 0%    65.5MB ± 0%   -8.11%  (p=0.001 n=10+5)

name                  old allocs/op  new allocs/op  delta
Hash-6                    12.0 ± 0%      12.0 ± 0%     ~     (all equal)
HashFixedSize/10-6         182 ± 0%       176 ± 0%   -3.30%  (p=0.000 n=10+5)
HashFixedSize/100-6        886 ± 0%       850 ± 0%   -4.06%  (p=0.000 n=10+5)
HashFixedSize/1K-6       9.16k ± 0%     8.87k ± 0%   -3.14%  (p=0.000 n=8+5)
HashFixedSize/10K-6      93.5k ± 0%     91.1k ± 0%   -2.63%  (p=0.000 n=10+5)
HashFixedSize/100K-6      932k ± 0%      905k ± 0%   -2.89%  (p=0.001 n=10+5)

Once the current benchmark run is done, I'll put this one up.

@holiman changed the title from Paralell commit to Paralell hashing on Dec 30, 2019
@holiman (Contributor, Author) commented Jan 1, 2020

Some preliminary charts. This PR only does parallel hashing if the number of updates is 100 or more, so as you can see it has basically no effect pre-Byzantium (when we hash between every transaction). Post-Byzantium, however, the effect is very visible:

[Screenshot: Dual Geth - Grafana]

Here's a bit more zoom: this PR is pretty stable at about 1.9ms, while master varies between 2.5 and 5 ms.

[Screenshot: Dual Geth - Grafana (1)]

@holiman (Contributor, Author) commented Jan 1, 2020

Oh wow, account hash has now moved down from number 2:
[Screenshot: Dual Geth - Grafana (2)]

@holiman (Contributor, Author) commented Jan 2, 2020

These are the first 5.8M blocks (bottom is this PR).
[Screenshot: Dual Geth - Grafana]

In total, account hash has dropped from fourth place to seventh.
Here's the account hash on its own:
[Screenshot: Dual Geth - Grafana (1)]

@holiman (Contributor, Author) commented Jan 4, 2020

After 4 days (96h), this PR is roughly 9h ahead of master (around block 6.881M). On #20481, which is this PR minus the parallel hashing, that lead didn't show up until later: master reached block 8.6M about 9h25m later than this PR.

@holiman (Contributor, Author) commented Jan 5, 2020

After ~5 days (118h), it's 10h40m ahead.

@holiman (Contributor, Author) commented Jan 5, 2020

The other one (#20493), running on 06/07, is roughly 17h30m ahead after 116h. I don't think that's because it is so much faster than this one, but merely because 06 (master) is a lot slower than 07 (experimental). The true improvement is probably somewhere in between.

@gballet changed the title from Paralell hashing to Parallel hashing on Jan 6, 2020
```go
if h.parallel {
	var wg sync.WaitGroup
	wg.Add(16)
	for i := 0; i < 16; i++ {
```
Member commented:

You probably want to do runtime.NumCPU or runtime.GOMAXPROCS here instead of hardcoding 16. Or maybe take the minimum of the two. If I have only 4 cores available, I don't think running on 16 threads helps.
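Concretely, the suggestion amounts to something like this (a sketch of the reviewer's idea, assuming `runtime` and `sync` are imported; not necessarily what was merged):

```go
// Cap the goroutine fan-out at the hardware parallelism actually available.
workers := 16
if n := runtime.NumCPU(); n < workers {
	workers = n
}
var wg sync.WaitGroup
wg.Add(workers)
```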

Member commented:

Ah, you're hashing each child on a separate goroutine. Hmm, what if the first full node is not that dense? Though to be honest, it probably is the densest, apart from attacker tries.

@holiman (Contributor, Author) replied:

Yeah, so this branches off at the topmost fullnode, which in all likelihood will be pretty well balanced and dense.
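The pattern under discussion, parallelism only at the topmost full node with each of its 16 child subtrees hashed serially on its own goroutine, can be sketched with toy types (`toyNode`, `hashSerial`, and `hashRootParallel` are illustrative stand-ins, not the trie package internals):

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// toyNode is a stand-in for the trie's 16-ary full node.
type toyNode struct {
	value    []byte
	children [16]*toyNode
}

// hashSerial folds a subtree depth-first on the calling goroutine.
func hashSerial(n *toyNode) [32]byte {
	if n == nil {
		return [32]byte{}
	}
	h := sha256.New()
	h.Write(n.value)
	for _, c := range n.children {
		s := hashSerial(c)
		h.Write(s[:])
	}
	var sum [32]byte
	copy(sum[:], h.Sum(nil))
	return sum
}

// hashRootParallel parallelizes only the top level: if the topmost node is
// dense and balanced, the 16 subtrees are of comparable size and the
// goroutines stay evenly loaded.
func hashRootParallel(root *toyNode) [32]byte {
	var (
		sums [16][32]byte
		wg   sync.WaitGroup
	)
	wg.Add(16)
	for i := 0; i < 16; i++ {
		go func(i int) {
			defer wg.Done()
			sums[i] = hashSerial(root.children[i])
		}(i)
	}
	wg.Wait()
	h := sha256.New()
	h.Write(root.value)
	for _, s := range sums {
		h.Write(s[:])
	}
	var sum [32]byte
	copy(sum[:], h.Sum(nil))
	return sum
}

func main() {
	root := &toyNode{value: []byte("root")}
	for i := range root.children {
		root.children[i] = &toyNode{value: []byte{byte(i)}}
	}
	fmt.Printf("%x\n", hashRootParallel(root))
}
```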

trie/trie.go (outdated):

```diff
@@ -47,6 +48,10 @@ type LeafCallback func(leaf []byte, parent common.Hash) error
 type Trie struct {
 	db   *Database
 	root node
+	// Keep a rough track of the number of leafs to commit
+	dirtyCount int
```
Member commented:

Might also add to the docs that the statedb deduplicates writes and only does a trie update when the data is actually pushed to the database, so this counter will actually be pretty precise.

@holiman (Contributor, Author) replied:

It will be pretty precise, but I'm increasing it for deletions as well, so that will cause some skew.
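A minimal sketch of that counting, assuming both the update and delete paths bump the same field (illustrative only; per the merged commit messages below, `dirtyCount` was later dropped in favor of an `unhashed` counter):

```go
package trie

// Illustrative sketch, not the geth code: every mutation bumps the counter,
// deletions included, so it slightly overestimates the leafs that will
// actually be committed.
type Trie struct {
	dirtyCount int // rough number of updates since the last hash
	// ... root node, database handle, etc.
}

func (t *Trie) Update(key, value []byte) {
	t.dirtyCount++
	// ... insert (key, value) into the trie
}

func (t *Trie) Delete(key []byte) {
	t.dirtyCount++ // deletions count too, hence the skew mentioned above
	// ... remove key from the trie
}
```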

Member commented:

We should probably drop this field if it's not used (yet).

@holiman (Contributor, Author) commented Jan 6, 2020

Let's do #20481 first, and once that one is 'done', I'll rebase this and fix the hasher-specific things.

@holiman (Contributor, Author) commented Jan 7, 2020

When this was executed without the parallel hashing, the result was:

Mon08 (this PR) finished full sync at block 9188701 at 2019-12-31 03:05:02, after a total of 175h; mon09 was 10 hours behind, at block 9.177M, when the run was stopped.

With parallel hashing added, it hit block 9188754 at Jan 07 13:45:20, having started Dec 31 at 12:47: roughly 7 × 24 − 1 = 167h, shaving another 8 hours off #20481. It looks to be somewhere around 16h ahead of master on mon09.

@holiman (Contributor, Author) commented Jan 8, 2020

On mon07, this (almost the same PR) finished in roughly 177h (mon06 way, way behind).

@holiman (Contributor, Author) commented Feb 3, 2020

Rebased on top of master -- PTAL @karalabe

@karalabe added this to the 1.9.11 milestone on Feb 4, 2020
@karalabe (Member) left a review:

SGTM

@karalabe merged commit a1313b5 into ethereum:master on Feb 4, 2020
@holiman deleted the paralell_commit branch on February 4, 2020 at 14:16
enriquefynn pushed a commit to enriquefynn/go-ethereum that referenced this pull request on Mar 10, 2021:
…#20488)

* trie: make hasher parallel when number of changes are large

* trie: remove unused field dirtyCount

* trie: rename unhashedCount/unhashed