A proof of concept to create standalone C library #3

certik · 2023-09-29T00:38:47Z

This contains all kinds of hardwired details, but it creates a standalone .a library that I then link with C code in a clean manner. It creates a 9.9MB binary and it works:

$ ll -h driver
-rwxr-xr-x  1 ondrej  staff   9.9M Sep 28 18:34 driver
$ ./driver 
25
$ otool -L driver
driver:
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.0.0)
$ ll -h mylib/target/release/libmylib.a
-rw-r--r--  1 ondrej  staff    43M Sep 28 18:34 mylib/target/release/libmylib.a
$ time clang -o driver driver.o guest.o mylib/target/release/libmylib.a
clang -o driver driver.o guest.o mylib/target/release/libmylib.a  0.34s user 0.02s system 95% cpu 0.380 total

The linking step at the very end takes 0.38s, which is not too bad, but it is still slow. Ideally this would link almost instantly, given how small this example is.
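For readers unfamiliar with the library/driver split described above, here is a minimal sketch of how a Rust static library can expose a C-callable entry point. It is an illustration under assumptions, not the code in this PR: the exported name `run_guest_from_c`, its signature, and the stub body of `run_internal` are hypothetical, and the real library would pass the bytes on to wasmtime.

```rust
use std::slice;

/// Stand-in for the library's internal entry point (assumption: the real one
/// hands the precompiled guest to wasmtime; this stub only checks the input).
fn run_internal(guest: &[u8]) -> Result<(), String> {
    if guest.is_empty() {
        return Err("empty guest image".to_string());
    }
    Ok(())
}

/// Hypothetical C-callable wrapper: returns 0 on success and 1 on failure.
///
/// # Safety
/// `ptr` must point to `len` readable bytes. A null pointer is rejected.
#[no_mangle]
pub unsafe extern "C" fn run_guest_from_c(ptr: *const u8, len: usize) -> i32 {
    if ptr.is_null() {
        return 1;
    }
    // Borrow the caller's buffer as a slice without copying it.
    let guest = unsafe { slice::from_raw_parts(ptr, len) };
    match run_internal(guest) {
        Ok(()) => 0,
        Err(_) => 1,
    }
}
```

On the C side, a driver would declare something like `int run_guest_from_c(const unsigned char *ptr, size_t len);` and link against the .a file. The declaration is part of the same hypothetical sketch.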

Timing of the binary

A typical timing for LFortran's LLVM-generated binary is about 6ms:

$ time ./expr2
25
./expr2  0.00s user 0.00s system 49% cpu 0.006 total

For the binaries above (both the C-generated and the Rust-generated one) it's usually around 9ms:

$ time ./driver
25
./driver  0.00s user 0.00s system 51% cpu 0.009 total
$ time ./target/release/cwasm-standalone
25
./target/release/cwasm-standalone  0.00s user 0.00s system 59% cpu 0.009 total

That's usable.

Size of the binary

LFortran's LLVM-generated binary is 33K, the C binary above is currently 10MB, and the Rust-generated one is 3MB:

$ ll -h expr2
-rwxr-xr-x  1 ondrej  staff    33K Sep 28 18:53 expr2
$ ll -h driver
-rwxr-xr-x  1 ondrej  staff   9.9M Sep 28 18:57 driver
$ ll -h ./target/release/cwasm-standalone
-rwxr-xr-x  1 ondrej  staff   3.3M Sep 28 18:58 ./target/release/cwasm-standalone

That's still 100x (Rust) to 300x (C) larger than the LLVM binary, which probably explains why it takes so long to link.

The size of the wasmtime-generated native binary .cwasm file is 66K, which is only 2x bigger than the full expr2 binary and seems reasonable:

$ ll -h src/guest.cwasm 
-rw-r--r--  1 ondrej  staff    66K Sep 28 18:34 src/guest.cwasm

If I compare it to our wasm_x64 backend, we get only 8.3K, which is about 8x smaller than the .cwasm:

$ ll -h expr2 
-rw-r--r--  1 ondrej  staff   8.3K Sep 28 20:02 expr2
$ file expr2
expr2: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, no section header

Presumably it is thus possible to improve the compilation of .wasm into .cwasm to create an 8x smaller binary.

The "driver" then adds 3 to 9 MB of extra stuff, but again presumably if we create our own driver it should be possible to get this extra overhead down to almost nothing.

Signed-off-by: Joel Dice <[email protected]>

dicej · 2023-09-29T01:17:45Z

Thanks, @certik. I'll be away from my computer until Tuesday, but I'll be sure to take a look at this then.

dicej

This looks reasonable to me.

I'm not sure why the clang/bin2c combination results in a much larger binary than rustc/include_bytes. Perhaps rustc does more aggressive link-time optimization by default in release mode? You might be able to do the same by passing e.g. -flto to clang at the appropriate places.

Alternatively, since your script assumes that clang and a native linker are available, perhaps it is reasonable to assume rustc is available and use that instead, given it's just as easy to install (in my experience)?
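As background for the bin2c versus include_bytes comparison above, the sketch below shows the kind of transformation a bin2c-style tool performs: it renders raw bytes as C source that defines an array and its length. This is an illustration only; the function name `bytes_to_c_array` and the exact output layout are assumptions, not the tool used in this PR.

```rust
/// Renders `bytes` as C source defining a byte array and its length,
/// similar in spirit to what a bin2c-style tool emits.
/// Assumes `bytes` is non-empty, since an empty initializer is not valid
/// in older C standards.
pub fn bytes_to_c_array(name: &str, bytes: &[u8]) -> String {
    let mut out = format!("const unsigned char {}[] = {{\n", name);
    // Emit 12 bytes per line to keep the generated file readable.
    for chunk in bytes.chunks(12) {
        let line: Vec<String> = chunk.iter().map(|b| format!("0x{:02x}", b)).collect();
        out.push_str("    ");
        out.push_str(&line.join(", "));
        out.push_str(",\n");
    }
    out.push_str("};\n");
    out.push_str(&format!("const unsigned long {}_len = {};\n", name, bytes.len()));
    out
}
```

Rust's `include_bytes!` macro achieves the same embedding at compile time without generating an intermediate source file.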

dicej · 2023-10-03T18:13:29Z

mylib/src/lib.rs

+}
+
+fn run_internal(guest: &[u8]) -> Result<()> {
+    let options = Options::parse();


This probably won't do anything useful, since the Rust std::env::args won't have been initialized when using this as a library from C. You could update run_rust_wasm_native_binary to accept CLI parameters and pass them to https://docs.rs/clap/latest/clap/trait.Parser.html#method.try_parse_from. Or you could just delete all the options parsing and WASI init code if you don't care about giving the guest access to host environment variables and directories.
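A minimal sketch of the first suggestion, assuming the C driver forwards its own `argc`/`argv` to the library: the helper below copies them into owned Rust strings. The helper name `args_from_c` is hypothetical. With clap, the resulting vector could then be passed to `Options::try_parse_from(args)` in place of `Options::parse()`.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Copies a C-style `argc`/`argv` pair into owned Rust strings.
/// Arguments that are not valid UTF-8 are converted lossily.
///
/// # Safety
/// `argv` must be null or point to `argc` pointers, each of which is either
/// null or a valid NUL-terminated C string.
pub unsafe fn args_from_c(argc: c_int, argv: *const *const c_char) -> Vec<String> {
    let mut args = Vec::new();
    if argv.is_null() {
        return args;
    }
    // Treat a negative argc as zero arguments.
    for i in 0..argc.max(0) as usize {
        let ptr = unsafe { *argv.add(i) };
        if ptr.is_null() {
            break;
        }
        let arg = unsafe { CStr::from_ptr(ptr) };
        args.push(arg.to_string_lossy().into_owned());
    }
    args
}
```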

certik · 2023-10-03

Ah, I see. Actually, I do care about that, although I think we have not implemented it yet in our WASM backend. This poses an interesting problem: in our LLVM backend we allow access to the whole file system. Here we have a nice option to actually restrict it, but it's not clear to me right now what the default should be.

dicej · 2023-10-03T18:14:07Z

mylib/Cargo.toml

+
+[build-dependencies]
+anyhow = "1.0.75"
+reqwest = { version = "0.11.20", features = ["blocking"] }


I don't think reqwest is used anymore, so you could remove this.

certik · 2023-10-03T20:06:29Z

@dicej thanks for the review. Given that even with Rust the binary is 100x bigger than what we generate with our direct WASM->x86 backend (we don't have an ARM backend yet), and given how slow both the Rust and the C linking steps are compared to our own binary generation, I think we probably need to keep our own backend.

We'll use this Rust approach as a "reference implementation". We could keep both the Rust and Clang versions, since both are useful as a reference, or we can just use Rust, which is a lot easier to handle; as a reference implementation, depending on Rust is not a problem. We should probably still add it to LFortran/LPython as an option: it is slower to compile and produces a larger binary, but it gives you something that always works, even on platforms that we do not currently support via WASM, like ARM. It also gives you sandboxing, which is very nice and which some users might want.

I am very happy that we can now generate binaries using "official tools", and cwasm-standalone is very simple custom code that we can treat as official. That way we can ensure our tooling stays compatible with the WASM tooling, test it in our CI as well, and give users the option to use either one.

Would it make sense to add this feature to wasm-tools, as a command that takes a .wasm file and generates a binary?

If so, that might be the best place to maintain this, and LFortran/LPython can just call it. Otherwise we'll need to figure out a way for LFortran to do it. I want a user interface like lfortran --backend=cwasm-standalone a.f90, which would do all this automatically and spit out a binary.

dicej · 2023-10-03T22:34:32Z

I agree that it would be useful to have a tool that converts a .wasm command to a native binary, although I suspect it would make more sense as part of wasmtime than as part of wasm-tools. @alexcrichton @cfallin @sunfishcode all had some interesting ideas on Zulip about how that might work. Perhaps it would be appropriate to reopen bytecodealliance/wasmtime#4563?

certik · 2023-10-04T03:28:49Z

Looks like I already "liked" the issue bytecodealliance/wasmtime#4563. It looks like we investigated the standalone binary earlier this year in lcompilers/lpython#1461, but I am happy I asked on Zulip. Thanks to you, we now have a proof of concept that seems to work, and I have a much better understanding.

Yes, wasmtime would be perfect for this.

dicej and others added 30 commits September 27, 2023 16:26

use WASI Preview 1 module instead of WASI Preview 2 component

59c1481

Signed-off-by: Joel Dice <[email protected]>

ignore zero exit code

cedbc1b

Signed-off-by: Joel Dice <[email protected]>

Modify path

477bdee

Remove custom build step

326b37f

Add a build script

2d1a713

Work on exposing it as a pointer

e3c66d0

Use Vec<u8>

9040a75

Refactor

dcb463d

Borrow

6f3b5ec

Rearrange

0944730

Export via C

62ea06c

Call the C function

ce4e659

Simplify

b9c2a38

Simplify

c5e3981

Rework

4864ab4

Add mylib

1add678

Use it

629bd92

X

d188e50

Runs

4a7b54d

OK

183a3c1

ok

9236862

This works, but only sometimes...

bd6b3dc

Now it works

292a4a2

X

802cd88

works

67f1534

X

67e6f9e

offline

7301089

remove lock

4a0bf3b

X

592a58e

X

bb145de

certik added 11 commits September 28, 2023 18:07

X

432fff6

X

2de6e03

X

ad27e87

X

fad9323

X?

6ddaf4f

rename

1815454

compile

c5c7aea

X

0aae4da

X

24b9d9c

X

6e335be

release

72a772d

certik mentioned this pull request Sep 30, 2023

WASM Roadmap lfortran/lfortran#2542

Open

dicej reviewed Oct 3, 2023
