benchmark #143

Merged: yamt merged 19 commits into master from bench4 on Apr 1, 2024

Conversation

@yamt (Owner) commented Feb 6, 2024

No description provided.

yamt added 16 commits March 11, 2024 22:24
```
python3 plot.py
```
```
python3 plot.py
```
for some reason, recent versions of wasmer reject relative paths
for --dir.
cf. wasmerio/wasmer@bd34c53
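One possible way to cope with this, sketched under the assumption that the directory already exists, is to resolve it to an absolute path before handing it to `--dir`. The helper name `abs_dir` is hypothetical and is not taken from this repository's scripts.

```shell
# Hypothetical helper (not from this repository): print the absolute
# path of an existing directory by entering it in a subshell.
# CDPATH is cleared so that `cd` cannot jump elsewhere or print anything.
abs_dir() {
    (CDPATH= cd -- "$1" && pwd)
}

# Example: resolve the current directory. The result could then be
# passed as the value of --dir instead of a relative path.
abs_dir .
```

Because the `cd` happens in a subshell, the caller's working directory is left unchanged.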
```
awk -f tocsv-ffmpeg.awk ffmpeg.txt > ffmpeg.csv
```
```
python3 plot-ffmpeg.py
```
the recent "wasmtime --version" output looks like
"wasmtime-cli 18.0.2 (90db6e99f 2024-02-28)"
```
awk -f tocsv-ffmpeg.awk ffmpeg.txt > ffmpeg.csv
```
```
python3 plot-ffmpeg.py
```
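For the `wasmtime --version` output quoted in the commit message above, a minimal sketch of pulling out just the version number might look like this. The function name `extract_version` is hypothetical, and the sketch assumes the version is the second whitespace-separated field, as it is in the quoted sample; it is not the repository's actual script.

```shell
# Hypothetical helper (not from this repository): print the second
# whitespace-separated field of the first input line.
extract_version() {
    awk 'NR == 1 { print $2 }'
}

# Sample line copied from the commit message above.
printf '%s\n' 'wasmtime-cli 18.0.2 (90db6e99f 2024-02-28)' | extract_version
```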
@yamt yamt mentioned this pull request Mar 11, 2024
the lazy compilation (`-DWAMR_BUILD_LAZY_JIT=0`) doesn't make as much
difference as I expected. I'm not sure why.
Probably it's because of [naive locking](https://github.com/bytecodealliance/wasm-micro-runtime/issues/2499).
Unlike wasm3 and wasmi, it doesn't defer the validation though.
Contributor

This can be emulated with Wasmi, too, via `--compilation-mode lazy-translation`. Though I generally wouldn't recommend using that, since it is only roughly 50% faster than eager translation but has all the downsides of lazy translation.

Owner Author

ok. i didn't know the option.
personally i like lazy-translation mode because it wouldn't change visible behaviors from eager.

Contributor

Yes, I also like it a lot, which is why I introduced both lazy variants. But in practical use cases inputs are often pre-validated, so the lazy option is just as safe as, and much more efficient than, the lazy-translation option.

Comment on lines -60 to 62
-git checkout v0.31.0
+git checkout v0.32.0-beta.7
 cargo build --profile bench
 cp target/release/wasmi_cli ~/bin
Contributor

With respect to my comment about unfortunate inefficiencies at runtime, it seems Wasmi got installed properly by the script, with all needed optimizations. One thing that never occurred to me before was how important march=native could be on some hardware to avoid the inefficiencies mentioned in #8 (comment).

Contributor

@Robbepop Robbepop Mar 11, 2024

What I mean is rustc's `-C target-cpu=native` option, which Cargo passes along via `RUSTFLAGS`. At least according to this article (https://vfoley.xyz/rust-compilation-tip/) it can sometimes help. E.g. via `RUSTFLAGS="-C target-cpu=native"; cargo +stable build --profile bench -p wasmi_cli`.

Owner Author

interesting. i might try it when i have a chance.
however, for those runtimes i built myself here, i didn't apply such optimizations to any of them. (unless it's the default, which i haven't investigated.)
do you think wasmi is more sensitive to this kind of optimizations than other runtimes?

Contributor

@Robbepop Robbepop Mar 12, 2024

I wish I had a machine on which Wasmi runs badly so I could test it myself. I wouldn't want Wasmi to have special optimization flags that no other runtime needs. Rather, I would like to understand the reason and be able to make Wasmi's optimizations more robust, if only I knew what was causing the issues in the first place. 🙃

> do you think wasmi is more sensitive to this kind of optimizations than other runtimes?

Yes, I think a big reason for this is the lack of control over codegen for the engine's instruction dispatch (the hot path): Wasmi has only a loop+match construct instead of something more elaborate such as tail-call dispatch. Unfortunately, Rust does not support guaranteed tail calls or any other usable form of dispatch, such as computed goto.
So the best we can do today in Wasmi is to pray that the LLVM optimizer heuristics get the instruction dispatch codegen right.

Owner Author

> RUSTFLAGS="-C target-cpu=native"; cargo +stable build --profile bench -p wasmi_cli

i tried this. i haven't noticed any improvements.

spacetanuki% ./test/run-ffmpeg.sh /usr/bin/time -l wasmi_cli.target-cpu-native --dir .video --
executing File(".ffmpeg/ffmpeg.wasm")::_start() ... 
       32.14 real        32.08 user         0.04 sys
           112017408  maximum resident set size      
                   0  average shared memory size 
                   0  average unshared data size                                
                   0  average unshared stack size                               
               30677  page reclaims                                             
                   0  page faults                                               
                   0  swaps                                                     
                   0  block input operations                                    
                   0  block output operations                                   
                   0  messages sent    
                   0  messages received                                         
                   0  signals received                                          
                   0  voluntary context switches  
                 660  involuntary context switches                              
        254158714732  instructions retired                                      
        129063009419  cycles elapsed                                            
            90820608  peak memory footprint                                     
spacetanuki% 
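One thing that may be worth double-checking with the command as quoted above: in POSIX-style shells, an assignment followed by `;` only sets a plain shell variable, and the next command sees it in its environment only if the variable had already been exported. The sketch below illustrates the difference. `DEMO_FLAGS` is a hypothetical stand-in for `RUSTFLAGS`, and `sh -c` stands in for `cargo`, so no Rust toolchain is assumed.

```shell
# DEMO_FLAGS is a hypothetical stand-in for RUSTFLAGS; `sh -c` plays the
# role of the child process (cargo in the thread above).
unset DEMO_FLAGS

# Form 1: assignment and command separated by ';'.
# The variable exists only in the current shell and is not exported,
# so the child process does not see it.
DEMO_FLAGS="-C target-cpu=native"; with_semicolon=$(sh -c 'printf "%s" "${DEMO_FLAGS:-unset}"')

unset DEMO_FLAGS

# Form 2: assignment used as a prefix of the command (no ';').
# The variable is placed in the environment of that one command.
without_semicolon=$(DEMO_FLAGS="-C target-cpu=native" sh -c 'printf "%s" "${DEMO_FLAGS:-unset}"')

echo "with ';':    $with_semicolon"
echo "without ';': $without_semicolon"
```

If `RUSTFLAGS` was not exported beforehand, dropping the `;` or using `export RUSTFLAGS=...` would be needed for the flag to reach the build, so the measurement above may not reflect `target-cpu=native` at all.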

Contributor

It's sad that the new Wasmi runs so poorly on your machine, but I see it as my job to improve its performance on all platforms if that's possible for me. Thank you for all your time and feedback!

@@ -1,15 +1,16 @@
toywasm (default),0.17,0.16,0.00,40300544
Contributor

btw. if you rename this CVS file into startup.cvs (akin to ffmpeg.cvs) then GitHub can render it as such which is more readable. :)

Owner Author

do you mean csv?

Contributor

yes sorry

Contributor

At least for me GitHub beautifully renders your ffmpeg.csv.
(image: screenshot of the rendered ffmpeg.csv)
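For illustration, a row shaped like the CSV line in the diff above could be produced from `/usr/bin/time -l` output roughly as follows. This is only a sketch and not the repository's `tocsv-ffmpeg.awk`: the column order (name, real, user, sys, maximum resident set size) is an assumption inferred from the sample row, and `to_csv_row` is a hypothetical name.

```shell
# Hypothetical helper (not the repository's awk script): read BSD-style
# `time -l` output on stdin and print one CSV row.
# $1 is the label for the first column.
to_csv_row() {
    awk -v name="$1" '
        # Line of the form "<n> real <n> user <n> sys"
        $2 == "real" && $4 == "user" && $6 == "sys" {
            real_t = $1; user_t = $3; sys_t = $5
        }
        # Line of the form "<n>  maximum resident set size"
        /maximum resident set size/ { rss = $1 }
        END { printf "%s,%s,%s,%s,%s\n", name, real_t, user_t, sys_t, rss }
    '
}

# Two lines copied from the timing output earlier in this thread.
printf '%s\n' \
    '       32.14 real        32.08 user         0.04 sys' \
    '           112017408  maximum resident set size' |
    to_csv_row 'wasmi_cli.target-cpu-native'
```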

@yamt yamt marked this pull request as ready for review March 17, 2024 04:01
@yamt yamt merged commit 1c171d2 into master Apr 1, 2024
111 of 116 checks passed
@yamt yamt deleted the bench4 branch April 1, 2024 12:05
2 participants