Add benchmark inference_tract (#37)
* Add benchmark inference_tract

* Fix build for mobile_net_v2_onnx Wasm and update tensorflow Wasm file

* Embedded the tensorflow model into the Wasm binary

* Update readme to reflect new inferencing benchmarks

* Update for pr comments
jlb6740 authored Jul 9, 2024
1 parent a051965 commit 9dfac06
Showing 36 changed files with 5,142 additions and 28 deletions.
4 changes: 2 additions & 2 deletions Dockerfile
@@ -4,9 +4,9 @@ ENV LD_LIBRARY_PATH=/usr/local/lib
ENV PATH=/usr/local/bin:$PATH
CMD ["/bin/bash"]
ENV DEBIAN_FRONTEND="noninteractive" TZ="America"
ARG RUST_VERSION="nightly-2023-04-01"
ARG RUST_VERSION="nightly-2024-06-09"
ARG WASMTIME_REPO="https://github.com/bytecodealliance/wasmtime/"
ARG WASMTIME_COMMIT="1bfe4b5" # v9.0.1
ARG WASMTIME_COMMIT="cedf9aa" # v21.0.1
ARG SIGHTGLASS_REPO="https://github.com/bytecodealliance/sightglass.git"
ARG SIGHTGLASS_BRANCH="main"
ARG SIGHTGLASS_COMMIT="e89fce0"
17 changes: 8 additions & 9 deletions README.md
@@ -1,13 +1,13 @@
# WasmScore

## Intro
WasmScore aims to benchmark platform performance when executing WebAssembly outside the browser. It leverages [Sightglass](https://github.com/bytecodealliance/sightglass) to run benchmarks and measure performance and then summarizes these results as both an execution score and an efficiency score. In addition to providing scores for the platform, the benchmark is also a tool capable of executing other tests, suites, or individual benchmarks supported by the driver. WasmScore is work in development.
WasmScore aims to benchmark platform performance when executing WebAssembly outside the browser. It leverages [Sightglass](https://github.com/bytecodealliance/sightglass) to run benchmarks and measure performance, then summarizes the results as both an execution score and an efficiency score. In addition to providing a general default score for the platform, WasmScore can also execute other specialized tests, suites, or individual benchmarks supported by the driver. WasmScore is still in early development.

## Description
A basic part of benchmarking is interpreting the results; should you consider the results to be good or bad? To decide, you need a baseline to serve as a point of comparison. For example, that baseline could be a measure of the performance before some code optimization was applied or before some configuration change was made to the runtime. In the case of WasmScore (specifically the wasmscore test) that baseline is the execution of the native code compiled from the same high-level source used to generate the Wasm. In this way the native execution of codes that serves as a comparison point for the Wasm performance also serves as an upper-bound for the performance of WebAssembly. This allows gauging the performance impact when using Wasm instead of a native compile of the same code. It also allows developers to find opportunities to improve compilers, or to improve Wasm runtimes, or improve the Wasm spec, or to suggest other solutions (such as Wasi) to address gaps.
A basic part of benchmarking is interpreting the results: should you consider them good or bad? To decide, you need a baseline (or goal) to serve as a point of comparison. For example, that baseline could be a measure of performance before some code optimization was applied or before some configuration change was made to the runtime. In the case of WasmScore (specifically the default score), the baseline is the execution of native code compiled from the same high-level source used to generate the Wasm. This native execution serves as the expected upper bound for WebAssembly performance. It both shows the performance impact of targeting Wasm instead of native for compiled code and helps developers find opportunities to improve compilers, improve Wasm runtimes, improve the Wasm spec, or suggest other solutions (such as WASI) to address gaps.
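
The exact formula behind the execution and efficiency scores is not shown in this diff. Purely as an illustration, a score of this kind is often derived from a geometric mean of per-benchmark native-to-Wasm time ratios; the function and sample timings below are hypothetical, not WasmScore's actual implementation.

```rust
// Hypothetical sketch: geometric mean of native/Wasm time ratios, scaled to 100.
// A score of 100 would mean Wasm matches native; lower means Wasm is slower.
fn execution_score(wasm_secs: &[f64], native_secs: &[f64]) -> f64 {
    assert_eq!(wasm_secs.len(), native_secs.len());
    let mean_log_ratio: f64 = wasm_secs
        .iter()
        .zip(native_secs)
        .map(|(w, n)| (n / w).ln())
        .sum::<f64>()
        / wasm_secs.len() as f64;
    100.0 * mean_log_ratio.exp()
}

fn main() {
    // Made-up times (seconds) for three benchmarks: Wasm build vs. native build.
    let wasm = [1.30, 0.80, 2.40];
    let native = [1.00, 0.65, 1.90];
    println!("execution score: {:.1}", execution_score(&wasm, &native));
}
```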

## Benchmarks
Typically a benchmark reports either the amount of work done over a constant amount of time or it reports the time taken to do a constant amount of work. The benchmarks here all do the later. The initial commit of the benchmarks available are pulled directly from Sightglass. How the benchmarks stored here are built and run do will depend on the external Sightglass revision being used
Typically, a benchmark reports either the amount of work done over a constant amount of time or the time taken to do a constant amount of work. The benchmarks here all do the latter. The initial set of benchmarks is pulled directly from Sightglass. How the benchmarks stored here are built and run depends on the Sightglass revision used.

Benchmarks are often categorized based on their purpose and origin. Two example buckets include (1) codes written with the original intent of being user facing and (2) codes written specifically to target benchmarking some important or commonly used code construct or platform component. WasmScore does not aim to favor one of these over the other as both are valuable and relevant in the evaluation of standalone Wasm depending on what you are trying to learn.

@@ -22,18 +22,17 @@ WasmScore aims to:
"wasmscore" is the initial and default test. It includes a mix of benchmarks for testing Wasm performance outside the browser. The test is a collection of several subtests:

### wasmscore (default):
- App: [Meshoptimizer]
- Core: ['Ackermann', 'Ctype', 'Fibonacci']
- Crypto: ['Base64', 'Ed25519', 'Seqhash']
- AI: (Coming)
- App: ['meshoptimizer']
- Core: ['ackermann', 'ctype', 'fibonacci']
- Crypto: ['base64', 'ed25519', 'seqhash']
- AI: ['tract_mobilenet_v2_onnx', 'tract_mobilenet_v2_tensorflow'] (see the sketch after this list)
- Regex: (Coming)
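
For context, the two new AI entries run MobileNetV2 through the tract inference engine compiled to Wasm. A minimal sketch of what such a benchmark body can look like, assuming the `tract-onnx` crate and a model file named `mobilenetv2.onnx` (this is not the repository's exact source):

```rust
use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    // Load the ONNX model, pin the input shape, and optimize it for execution.
    let model = tract_onnx::onnx()
        .model_for_path("mobilenetv2.onnx")?
        .with_input_fact(0, f32::fact([1, 3, 224, 224]).into())?
        .into_optimized()?
        .into_runnable()?;

    // A synthetic tensor stands in for the decoded input image.
    let input: Tensor = tract_ndarray::Array4::<f32>::zeros((1, 3, 224, 224)).into();

    // In a Sightglass benchmark, the measured region would wrap this call.
    let result = model.run(tvec!(input.into()))?;

    // Report the index of the highest-scoring class.
    let best = result[0]
        .to_array_view::<f32>()?
        .iter()
        .cloned()
        .zip(0..)
        .max_by(|a, b| a.0.partial_cmp(&b.0).unwrap());
    println!("best class: {:?}", best);
    Ok(())
}
```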

## 2024 Q1 Goals
## 2024 Q3 Goals
Next steps include:
- Improving stability and user experience
- Adding benchmarks to the AI, Regex, and App suites
- Adding more benchmarks (including w/native build support)
- Complete the "simdscore" test
- Publish a list of planned milestones with corresponding releases

## Usage
2 changes: 1 addition & 1 deletion benchmarks/Dockerfile.rust
@@ -1,4 +1,4 @@
FROM rust:1.70
FROM rust:1.75
RUN rustup target add wasm32-wasi
WORKDIR /usr/src
ADD rust-benchmark rust-benchmark
2 changes: 1 addition & 1 deletion benchmarks/README.md
@@ -1,3 +1,3 @@
# Benchmarks

The set of benchmarks here have been copied from [Sightglass](https://github.com/bytecodealliance/sightglass/benchmarks). In general, the benchmarks here and will mostly be consistent with the set of benchmarks in that repository.
The set of benchmarks here has been copied from [Sightglass](https://github.com/bytecodealliance/sightglass/benchmarks). The benchmarks here will mostly be consistent with the set of benchmarks in that repository.
1 change: 1 addition & 0 deletions benchmarks/all.suite
@@ -12,6 +12,7 @@ blind-sig/benchmark.wasm
bz2/benchmark.wasm
hex-simd/benchmark.wasm
# image-classification/image-classification-benchmark.wasm
inference_tract/benchmark.wasm
intgemm-simd/benchmark.wasm
libsodium/libsodium-aead_aes256gcm2.wasm
libsodium/libsodium-aead_aes256gcm.wasm
10 changes: 6 additions & 4 deletions benchmarks/build.sh
@@ -38,16 +38,18 @@ print_header "Build benchmarks"
CONTAINER_ID=$(set -x; docker create $IMAGE_NAME)
(set -x; docker cp $CONTAINER_ID:/benchmark/. $TMP_BENCHMARK)

# Verify benchmark is a valid Sightglass benchmark.
print_header "Verify benchmark"
# Copy benchmark.
print_header "Copy benchmark"
# From https://stackoverflow.com/a/246128:
SCRIPT_DIR="$( cd -- "$( dirname -- "${BASH_SOURCE[0]:-$0}"; )" &> /dev/null && pwd 2> /dev/null; )";
SIGHTGLASS_CARGO_TOML=$(dirname $SCRIPT_DIR)/Cargo.toml
for WASM in $TMP_BENCHMARK/*.wasm; do
(set -x; cargo run --manifest-path $SIGHTGLASS_CARGO_TOML --quiet -- validate $WASM)
(set -x; mv $WASM $BENCHMARK_DIR/)
done;

for MODEL in $TMP_BENCHMARK/*.pb; do
(set -x; mv $MODEL $BENCHMARK_DIR/)
done;

# Clean up.
print_header "Clean up"
(set -x; rm $TMP_TAR)
23 changes: 23 additions & 0 deletions benchmarks/inference_tract/Dockerfile
@@ -0,0 +1,23 @@
FROM rust:1.78 AS builder
RUN rustup target add wasm32-wasi
RUN mkdir /benchmark
WORKDIR /usr/src

# Compile mobile_net_v2_onnx
ADD mobile_net_v2_onnx rust-benchmark
WORKDIR /usr/src/rust-benchmark
ENV CARGO_REGISTRIES_CRATES_IO_PROTOCOL=sparse
RUN (cd mobile_net_v2_onnx; cargo build --release --target wasm32-wasi)
RUN cp target/wasm32-wasi/release/*benchmark.wasm /benchmark/mobile_net_v2_onnx_benchmark.wasm
WORKDIR /usr/src
RUN rm -rf rust-benchmark


# Compile mobile_net_v2_tensorflow
ADD mobile_net_v2_tensorflow rust-benchmark
WORKDIR /usr/src/rust-benchmark
RUN (cd mobile_net_v2_tensorflow; cargo build --release --target wasm32-wasi)
RUN cp target/wasm32-wasi/release/*benchmark.wasm /benchmark/mobile_net_v2_tensorflow_benchmark.wasm
RUN cp assets/mobilenet_v2_1.4_224_frozen.pb /benchmark/mobilenet_v2_1.4_224_frozen.pb
WORKDIR /usr/src
RUN rm -rf rust-benchmark
7 changes: 7 additions & 0 deletions benchmarks/inference_tract/README.md
@@ -0,0 +1,7 @@
# Image Classification Wasmtime Benchmark

A benchmark that runs an image classifier in pure Wasm. It can be used to
benchmark the performance of float-heavy computations.

Note that the classifier model is not included in the repo because it is large
and is instead downloaded if needed when running the `setup.sh` script.
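
The commit message notes that the TensorFlow model was embedded into the Wasm binary. A minimal sketch of one way to do that with the `tract-tensorflow` crate; the relative asset path, input shape, and overall structure here are assumptions rather than the repository's exact source:

```rust
use std::io::Cursor;
use tract_tensorflow::prelude::*;

// Embedding the frozen graph avoids depending on the WASI filesystem at run time.
// The path is an assumption about where the .pb asset sits relative to this file.
static MODEL_BYTES: &[u8] = include_bytes!("../assets/mobilenet_v2_1.4_224_frozen.pb");

fn main() -> TractResult<()> {
    // Parse the embedded frozen graph and optimize it for a fixed NHWC input.
    let model = tract_tensorflow::tensorflow()
        .model_for_read(&mut Cursor::new(MODEL_BYTES))?
        .with_input_fact(0, f32::fact([1, 224, 224, 3]).into())?
        .into_optimized()?
        .into_runnable()?;

    // A synthetic tensor stands in for the preprocessed input image.
    let input: Tensor = tract_ndarray::Array4::<f32>::zeros((1, 224, 224, 3)).into();
    let result = model.run(tvec!(input.into()))?;
    println!("output elements: {}", result[0].to_array_view::<f32>()?.len());
    Ok(())
}
```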
13 changes: 13 additions & 0 deletions benchmarks/inference_tract/build_mobile_net_v2_onnx_native.sh
@@ -0,0 +1,13 @@
#!/usr/bin/env bash

# Build inference_tract benchmark as a native shared library (Linux-only).
#
# Usage: ./build_mobile_net_v2_onnx_native.sh

set -x
rm -rf mobile_net_v2_onnx_native
cp -r mobile_net_v2_onnx mobile_net_v2_onnx_native
cp mobile_net_v2_onnx_native.patch mobile_net_v2_onnx_native
(cd mobile_net_v2_onnx_native; patch -Np1 -i ./mobile_net_v2_onnx_native.patch; mv src/main.rs src/lib.rs)
(cd mobile_net_v2_onnx_native; cargo build --release; cp target/release/libbenchmark.so ../mobile_net_v2_onnx_benchmark.so)
set +x
13 changes: 13 additions & 0 deletions benchmarks/inference_tract/build_mobile_net_v2_tensorflow_native.sh
@@ -0,0 +1,13 @@
#!/usr/bin/env bash

# Build inference_tract benchmark as native shared library (Linux-only).
#
# Usage: ./build_mobile_net_v2_tensorflow_native.sh

set -x
rm -rf mobile_net_v2_tensorflow_native
cp -r mobile_net_v2_tensorflow mobile_net_v2_tensorflow_native
cp mobile_net_v2_tensorflow_native.patch mobile_net_v2_tensorflow_native
(cd mobile_net_v2_tensorflow_native; patch -Np1 -i ./mobile_net_v2_tensorflow_native.patch; mv src/main.rs src/lib.rs)
(cd mobile_net_v2_tensorflow_native; cargo build --release; cp target/release/libbenchmark.so ../mobile_net_v2_tensorflow_benchmark.so)
set +x
Binary file added benchmarks/inference_tract/input.png
