Add benchmarks for wasmi builtin metering #56

Merged (2 commits) on Mar 6, 2023

@athei athei commented Mar 5, 2023

  • Upgraded to newest wasmi
  • Refactored benchmarks
  • Two new benchmark strategies (no_metering and wasmi_builtin)

We can now benchmark the execution of modules using our two instrumentation strategies in addition to no metering (as a baseline) and wasmi's builtin metering.

The following results (run on my M1) show that the builtin metering decisively outperforms both instrumentation strategies on every single fixture.

cc @Robbepop

coremark/no_metering                    [15.586 s 15.588 s 15.589 s]
coremark/wasmi_builtin                  [16.403 s 16.414 s 16.434 s]
coremark/host_function                  [18.245 s 18.248 s 18.252 s]
coremark/mutable_global                 [20.476 s 20.486 s 20.505 s]

recursive_ok/no_metering                [111.32 µs 111.33 µs 111.34 µs]
recursive_ok/wasmi_builtin              [138.64 µs 138.65 µs 138.66 µs]
recursive_ok/host_function              [495.55 µs 495.64 µs 495.78 µs]
recursive_ok/mutable_global             [514.07 µs 514.09 µs 514.11 µs]

fibonacci_recursive/no_metering         [3.9098 µs 3.9102 µs 3.9108 µs]
fibonacci_recursive/wasmi_builtin       [4.3242 µs 4.3246 µs 4.3250 µs]
fibonacci_recursive/host_function       [12.913 µs 12.914 µs 12.915 µs]
fibonacci_recursive/mutable_global      [13.202 µs 13.208 µs 13.212 µs]
              
factorial_recursive/no_metering         [530.72 ns 530.84 ns 530.91 ns]
factorial_recursive/wasmi_builtin       [619.17 ns 619.30 ns 619.44 ns]
factorial_recursive/host_function       [1.7656 µs 1.7657 µs 1.7659 µs]
factorial_recursive/mutable_global      [1.8783 µs 1.8786 µs 1.8788 µs]

count_until/no_metering                 [1.2422 ms 1.2423 ms 1.2424 ms]
count_until/wasmi_builtin               [1.3976 ms 1.3978 ms 1.3981 ms]
count_until/host_function               [4.8074 ms 4.8106 ms 4.8125 ms]
count_until/mutable_global              [5.9161 ms 5.9169 ms 5.9182 ms]

memory_vec_add/no_metering              [4.1630 ms 4.1638 ms 4.1648 ms]
memory_vec_add/wasmi_builtin            [4.3913 ms 4.3925 ms 4.3930 ms]
memory_vec_add/host_function            [8.2925 ms 8.2949 ms 8.2967 ms]
memory_vec_add/mutable_global           [9.1124 ms 9.1152 ms 9.1163 ms]

wasm_kernel::tiny_keccak/no_metering    [613.21 µs 613.42 µs 613.58 µs]
wasm_kernel::tiny_keccak/wasmi_builtin  [617.04 µs 617.46 µs 617.81 µs]
wasm_kernel::tiny_keccak/host_function  [817.24 µs 817.44 µs 817.89 µs]
wasm_kernel::tiny_keccak/mutable_global [873.42 µs 873.90 µs 874.65 µs]

global_bump/no_metering                 [1.4597 ms 1.4598 ms 1.4600 ms]
global_bump/wasmi_builtin               [1.6151 ms 1.6152 ms 1.6153 ms]
global_bump/host_function               [5.5393 ms 5.5418 ms 5.5435 ms]
global_bump/mutable_global              [6.9446 ms 6.9454 ms 6.9461 ms]
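
To make the gap concrete, the middle estimates above can be normalised against the `no_metering` baseline. The following is a minimal sketch and not part of this PR: the helper names `median_seconds` and `slowdown_vs_baseline` are made up for illustration, and it assumes the three bracketed values are Criterion's lower bound, estimate, and upper bound. Only the first two fixtures are pasted in to keep it short.

```python
import re

# Conversion factors from the printed time units to seconds.
UNIT_TO_SECONDS = {"ns": 1e-9, "µs": 1e-6, "ms": 1e-3, "s": 1.0}

# Matches lines such as: "coremark/wasmi_builtin  [16.403 s 16.414 s 16.434 s]"
# and captures the fixture, the strategy, and the middle value with its unit.
LINE_RE = re.compile(
    r"^(?P<fixture>\S+)/(?P<strategy>\w+)\s+\["
    r"\S+ \S+ (?P<mid>[\d.]+) (?P<unit>\S+) \S+ \S+\]$"
)


def median_seconds(report: str) -> dict:
    """Map (fixture, strategy) to the middle estimate, converted to seconds."""
    result = {}
    for line in report.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            seconds = float(m["mid"]) * UNIT_TO_SECONDS[m["unit"]]
            result[(m["fixture"], m["strategy"])] = seconds
    return result


def slowdown_vs_baseline(report: str, baseline: str = "no_metering") -> dict:
    """Return time / baseline time for every non-baseline entry."""
    times = median_seconds(report)
    return {
        key: t / times[(key[0], baseline)]
        for key, t in times.items()
        if key[1] != baseline
    }


# Excerpt copied from the results above.
REPORT = """
coremark/no_metering                    [15.586 s 15.588 s 15.589 s]
coremark/wasmi_builtin                  [16.403 s 16.414 s 16.434 s]
coremark/host_function                  [18.245 s 18.248 s 18.252 s]
coremark/mutable_global                 [20.476 s 20.486 s 20.505 s]
recursive_ok/no_metering                [111.32 µs 111.33 µs 111.34 µs]
recursive_ok/wasmi_builtin              [138.64 µs 138.65 µs 138.66 µs]
recursive_ok/host_function              [495.55 µs 495.64 µs 495.78 µs]
recursive_ok/mutable_global             [514.07 µs 514.09 µs 514.11 µs]
"""

for (fixture, strategy), factor in slowdown_vs_baseline(REPORT).items():
    print(f"{fixture:<14} {strategy:<15} {factor:.2f}x")
```

On this excerpt the builtin metering comes out at roughly 1.05x (coremark) and 1.25x (recursive_ok) of the baseline, against roughly 1.17x to 4.6x for the two instrumentation strategies.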

@athei athei requested a review from pepyakin as a code owner March 5, 2023 16:30
Robbepop commented Mar 6, 2023

@athei Thanks a lot for those benchmarks. Very exciting to see such massive speed-ups from the built-in fuel metering. 🚀

@athei athei merged commit 3ba9c2c into master Mar 6, 2023
@athei athei deleted the at/bench-wasmi-gas branch March 6, 2023 21:06
athei commented Mar 6, 2023

Yes, great work there with your wasmi implementation! I just wanted to get a feel for it before putting it into pallet-contracts, because we can't see the speedup there from the weight benchmarks: we don't inject metering when running them. We charge the metering overhead every time the contract calls the gas host function to account for a basic block.

We should notice the gas usage improvements in the benchmarks we house in the ink! repo.
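
The per-basic-block charging described above can be modelled roughly as follows. This is a toy sketch under stated assumptions, not the pallet-contracts or wasm-instrument code: `GasMeter`, `OutOfGas`, `BLOCK_COST` and the Python `count_until` loop are all hypothetical stand-ins, and the cost of 5 per block is arbitrary.

```python
class OutOfGas(Exception):
    """Raised when a charge exceeds the remaining gas."""


class GasMeter:
    """Hypothetical stand-in for the host side of the `gas` import."""

    def __init__(self, limit: int) -> None:
        self.remaining = limit
        self.host_calls = 0  # how often the guest crossed into the host

    def gas(self, amount: int) -> None:
        """Charge `amount` for one basic block, or abort execution."""
        self.host_calls += 1
        if amount > self.remaining:
            raise OutOfGas(f"needed {amount}, only {self.remaining} left")
        self.remaining -= amount


BLOCK_COST = 5  # assumed, arbitrary cost of the loop body's basic block


def count_until(n: int, meter: GasMeter) -> int:
    """A loop as instrumented code would run it: one charge per iteration."""
    i = 0
    while i < n:
        meter.gas(BLOCK_COST)  # call injected at the start of the basic block
        i += 1
    return i


meter = GasMeter(limit=1000)
count_until(100, meter)
print(meter.remaining, meter.host_calls)  # 500 100
```

In this model a tight loop makes one host call per iteration, which is one plausible reading of why the loop-heavy fixtures show the largest overhead for the `host_function` strategy.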

agryaznov (Contributor) commented

Benchmark results: wasmi 0.30 vs wasmi 0.29

wasmi 0.29

Results on Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz
coremark/no_metering                    time:   [15.276 s 15.277 s 15.278 s]
coremark/wasmi_builtin                  time:   [15.582 s 15.584 s 15.585 s]
coremark/host_function                  time:   [18.378 s 18.379 s 18.381 s]
coremark/mutable_global                 time:   [17.349 s 17.350 s 17.351 s]
recursive_ok/no_metering                time:   [179.34 µs 179.36 µs 179.37 µs]
recursive_ok/wasmi_builtin              time:   [161.57 µs 161.58 µs 161.59 µs]
recursive_ok/host_function              time:   [736.83 µs 736.88 µs 736.94 µs]
recursive_ok/mutable_global             time:   [853.54 µs 853.62 µs 853.68 µs]
fibonacci_recursive/no_metering         time:   [5.2200 µs 5.2206 µs 5.2209 µs]
fibonacci_recursive/wasmi_builtin       time:   [5.3340 µs 5.3359 µs 5.3373 µs]
fibonacci_recursive/host_function       time:   [18.631 µs 18.633 µs 18.635 µs]
fibonacci_recursive/mutable_global      time:   [21.951 µs 21.954 µs 21.955 µs]
factorial_recursive/no_metering         time:   [788.51 ns 788.85 ns 789.36 ns]
factorial_recursive/wasmi_builtin       time:   [713.57 ns 714.10 ns 714.55 ns]
factorial_recursive/host_function       time:   [2.5471 µs 2.5477 µs 2.5482 µs]
factorial_recursive/mutable_global      time:   [2.9971 µs 2.9973 µs 2.9975 µs]
count_until/no_metering                 time:   [821.09 µs 821.13 µs 821.17 µs]
count_until/wasmi_builtin               time:   [862.38 µs 862.43 µs 862.48 µs]
count_until/host_function               time:   [7.4901 ms 7.4922 ms 7.4946 ms]
count_until/mutable_global              time:   [7.3124 ms 7.3142 ms 7.3158 ms]
memory_vec_add/no_metering              time:   [2.9366 ms 2.9367 ms 2.9369 ms]
memory_vec_add/wasmi_builtin            time:   [3.1366 ms 3.1368 ms 3.1369 ms]
memory_vec_add/host_function            time:   [10.155 ms 10.163 ms 10.171 ms]
memory_vec_add/mutable_global           time:   [9.2684 ms 9.2693 ms 9.2706 ms]
wasm_kernel::tiny_keccak/no_metering    time:   [1.0866 ms 1.0866 ms 1.0867 ms]
wasm_kernel::tiny_keccak/wasmi_builtin  time:   [1.1073 ms 1.1076 ms 1.1079 ms]
wasm_kernel::tiny_keccak/host_function  time:   [1.4415 ms 1.4417 ms 1.4419 ms]
wasm_kernel::tiny_keccak/mutable_global time:   [1.4309 ms 1.4310 ms 1.4312 ms]
global_bump/no_metering                 time:   [900.12 µs 900.20 µs 900.28 µs]
global_bump/wasmi_builtin               time:   [1.0166 ms 1.0168 ms 1.0169 ms]
global_bump/host_function               time:   [8.7865 ms 8.7926 ms 8.7993 ms]
global_bump/mutable_global              time:   [8.0529 ms 8.0534 ms 8.0541 ms]

wasmi 0.30

tl;dr: the wasmi_builtin strategy shows a performance improvement on every benchmark except this one:

wasm_kernel::tiny_keccak/wasmi_builtin  time:   [1.1812 ms 1.1813 ms 1.1813 ms]
                                        change: [+6.6003% +6.6130% +6.6278%] (p = 0.00 < 0.05)

Results on Intel(R) Xeon(R) CPU E3-1245 V2 @ 3.40GHz
coremark/no_metering                    time:   [14.102 s 14.104 s 14.107 s]
                                        change: [-5.9689% -5.9522% -5.9297%] (p = 0.00 < 0.05)
coremark/wasmi_builtin                  time:   [14.592 s 14.593 s 14.594 s]
                                        change: [-2.3238% -2.3172% -2.3106%] (p = 0.00 < 0.05)
coremark/host_function                  time:   [19.246 s 19.248 s 19.249 s]
                                        change: [+1.8645% +1.8968% +1.9343%] (p = 0.00 < 0.05)
coremark/mutable_global                 time:   [15.336 s 15.636 s 16.235 s]
                                        change: [-10.316% -8.5610% -5.0573%] (p = 0.00 < 0.05)
recursive_ok/no_metering                time:   [121.58 µs 121.60 µs 121.62 µs]
                                        change: [-32.490% -32.478% -32.468%] (p = 0.00 < 0.05)
recursive_ok/wasmi_builtin              time:   [134.55 µs 134.97 µs 135.31 µs]
                                        change: [-16.696% -16.584% -16.424%] (p = 0.00 < 0.05)
recursive_ok/host_function              time:   [743.91 µs 744.00 µs 744.08 µs]
                                        change: [-3.8308% -3.7964% -3.7600%] (p = 0.00 < 0.05)
recursive_ok/mutable_global             time:   [572.18 µs 572.25 µs 572.29 µs]
                                        change: [-32.496% -32.484% -32.474%] (p = 0.00 < 0.05)
fibonacci_recursive/no_metering         time:   [4.4217 µs 4.4222 µs 4.4228 µs]
                                        change: [-18.001% -17.981% -17.963%] (p = 0.00 < 0.05)
fibonacci_recursive/wasmi_builtin       time:   [4.6848 µs 4.6856 µs 4.6869 µs]
                                        change: [-11.986% -11.943% -11.890%] (p = 0.00 < 0.05)
fibonacci_recursive/host_function       time:   [18.474 µs 18.477 µs 18.479 µs]
                                        change: [-5.5537% -5.5422% -5.5298%] (p = 0.00 < 0.05)
fibonacci_recursive/mutable_global      time:   [14.347 µs 14.348 µs 14.350 µs]
                                        change: [-31.561% -31.539% -31.518%] (p = 0.00 < 0.05)
factorial_recursive/no_metering         time:   [587.39 ns 587.48 ns 587.57 ns]
                                        change: [-25.329% -25.311% -25.296%] (p = 0.00 < 0.05)
factorial_recursive/wasmi_builtin       time:   [621.17 ns 621.20 ns 621.27 ns]
                                        change: [-15.278% -15.249% -15.219%] (p = 0.00 < 0.05)
factorial_recursive/host_function       time:   [2.5011 µs 2.5014 µs 2.5018 µs]
                                        change: [-3.4296% -3.3506% -3.2977%] (p = 0.00 < 0.05)
factorial_recursive/mutable_global      time:   [2.0351 µs 2.0355 µs 2.0359 µs]
                                        change: [-31.107% -31.083% -31.058%] (p = 0.00 < 0.05)
count_until/no_metering                 time:   [779.33 µs 779.37 µs 779.43 µs]
                                        change: [-6.3625% -6.3475% -6.3344%] (p = 0.00 < 0.05)
count_until/wasmi_builtin               time:   [836.27 µs 836.34 µs 836.41 µs]
                                        change: [-3.7705% -3.7604% -3.7505%] (p = 0.00 < 0.05)
count_until/host_function               time:   [8.1103 ms 8.1106 ms 8.1110 ms]
                                        change: [+8.3961% +8.4094% +8.4214%] (p = 0.00 < 0.05)
count_until/mutable_global              time:   [6.4612 ms 6.4616 ms 6.4621 ms]
                                        change: [-10.096% -10.045% -9.9952%] (p = 0.00 < 0.05)
memory_vec_add/no_metering              time:   [4.8664 ms 4.8667 ms 4.8671 ms]
                                        change: [+72.243% +72.253% +72.264%] (p = 0.00 < 0.05)
memory_vec_add/wasmi_builtin            time:   [2.8494 ms 2.8498 ms 2.8502 ms]
                                        change: [-1.8772% -1.8557% -1.8365%] (p = 0.00 < 0.05)
memory_vec_add/host_function            time:   [12.171 ms 12.173 ms 12.173 ms]
                                        change: [+19.651% +19.912% +20.166%] (p = 0.00 < 0.05)
memory_vec_add/mutable_global           time:   [8.4185 ms 8.4194 ms 8.4205 ms]
                                        change: [-9.7563% -9.5375% -9.2263%] (p = 0.00 < 0.05)
wasm_kernel::tiny_keccak/no_metering    time:   [1.1019 ms 1.1020 ms 1.1021 ms]
                                        change: [+0.3215% +0.3397% +0.3574%] (p = 0.00 < 0.05)
wasm_kernel::tiny_keccak/wasmi_builtin  time:   [1.1812 ms 1.1813 ms 1.1813 ms]
                                        change: [+6.6003% +6.6130% +6.6278%] (p = 0.00 < 0.05)
wasm_kernel::tiny_keccak/host_function  time:   [1.5266 ms 1.5267 ms 1.5267 ms]
                                        change: [+5.7885% +5.8053% +5.8301%] (p = 0.00 < 0.05)
wasm_kernel::tiny_keccak/mutable_global time:   [1.4203 ms 1.4205 ms 1.4208 ms]
                                        change: [-1.1949% -1.0907% -0.9918%] (p = 0.00 < 0.05)
global_bump/no_metering                 time:   [938.07 µs 938.17 µs 938.27 µs]
                                        change: [+4.0925% +4.1085% +4.1266%] (p = 0.00 < 0.05)
global_bump/wasmi_builtin               time:   [940.70 µs 940.74 µs 940.79 µs]
                                        change: [-6.4598% -6.4542% -6.4486%] (p = 0.00 < 0.05)
global_bump/host_function               time:   [8.9666 ms 8.9695 ms 8.9714 ms]
                                        change: [+1.7218% +1.7422% +1.7584%] (p = 0.00 < 0.05)
global_bump/mutable_global              time:   [7.3741 ms 7.3746 ms 7.3753 ms]
                                        change: [-10.942% -10.925% -10.910%] (p = 0.00 < 0.05)
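
To check which entries got slower between the two versions, the `change:` lines can be filtered mechanically. The sketch below is illustrative only: the helper names `median_changes` and `regressions` are made up, it assumes the middle percentage in each bracket is the point estimate, and it runs on a four-entry excerpt of the output above.

```python
import re

# A benchmark line starts with its name followed by "time:".
NAME_RE = re.compile(r"^(\S+)\s+time:")
# The follow-up line carries three percentages; capture the middle one.
CHANGE_RE = re.compile(r"change:\s+\[\S+ ([+-][\d.]+)% \S+\]")


def median_changes(report: str) -> dict:
    """Map each benchmark name to the middle value of its `change:` line."""
    changes = {}
    current = None
    for line in report.splitlines():
        name = NAME_RE.match(line.strip())
        if name:
            current = name.group(1)
            continue
        change = CHANGE_RE.search(line)
        if change and current is not None:
            changes[current] = float(change.group(1))
    return changes


def regressions(report: str, strategy=None) -> dict:
    """Entries whose time went up, optionally limited to one strategy."""
    return {
        name: pct
        for name, pct in median_changes(report).items()
        if pct > 0 and (strategy is None or name.endswith("/" + strategy))
    }


# Excerpt copied from the wasmi 0.30 results above.
REPORT = """
memory_vec_add/no_metering              time:   [4.8664 ms 4.8667 ms 4.8671 ms]
                                        change: [+72.243% +72.253% +72.264%] (p = 0.00 < 0.05)
memory_vec_add/wasmi_builtin            time:   [2.8494 ms 2.8498 ms 2.8502 ms]
                                        change: [-1.8772% -1.8557% -1.8365%] (p = 0.00 < 0.05)
wasm_kernel::tiny_keccak/wasmi_builtin  time:   [1.1812 ms 1.1813 ms 1.1813 ms]
                                        change: [+6.6003% +6.6130% +6.6278%] (p = 0.00 < 0.05)
global_bump/wasmi_builtin               time:   [940.70 µs 940.74 µs 940.79 µs]
                                        change: [-6.4598% -6.4542% -6.4486%] (p = 0.00 < 0.05)
"""

print(regressions(REPORT))
print(regressions(REPORT, strategy="wasmi_builtin"))
```

Restricted to `wasmi_builtin`, the excerpt yields only the `wasm_kernel::tiny_keccak` entry. Without the filter it also reports `memory_vec_add/no_metering`.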

ukint-vs pushed a commit to gear-tech/wasm-instrument that referenced this pull request Oct 3, 2024