A proof of concept to create standalone C library #3

Draft
wants to merge 41 commits into
base: main

@certik certik commented Sep 29, 2023

This contains all kinds of hardwired details, but it creates a standalone .a library that I then link with C code in a clean manner. It creates a 9.9MB binary and it works:

$ ll -h driver
-rwxr-xr-x  1 ondrej  staff   9.9M Sep 28 18:34 driver
$ ./driver 
25
$ otool -L driver
driver:
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.0.0)
$ ll -h mylib/target/release/libmylib.a
-rw-r--r--  1 ondrej  staff    43M Sep 28 18:34 mylib/target/release/libmylib.a
$ time clang -o driver driver.o guest.o mylib/target/release/libmylib.a
clang -o driver driver.o guest.o mylib/target/release/libmylib.a  0.34s user 0.02s system 95% cpu 0.380 total

The link step at the very end takes 0.38s, which is not too bad, but it is still slow for such a small example. Ideally this would link almost instantly.

Timing of the binary

A typical timing for LFortran's LLVM-generated binary is about 6ms:

$ time ./expr2
25
./expr2  0.00s user 0.00s system 49% cpu 0.006 total

For the binaries above (both the C-linked and the Rust-generated one) it is usually around 9ms:

$ time ./driver
25
./driver  0.00s user 0.00s system 51% cpu 0.009 total
$ time ./target/release/cwasm-standalone
25
./target/release/cwasm-standalone  0.00s user 0.00s system 59% cpu 0.009 total

That's usable.
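
Single `time` runs at the millisecond scale are noisy, so a 6ms versus 9ms difference is easier to trust when the measurement is repeated. A minimal sketch of such a harness, using only the Rust standard library; `min_wall_time` and `time_binary` are hypothetical helper names and not part of this PR:

```rust
use std::process::{Command, Stdio};
use std::time::{Duration, Instant};

/// Run `f` `runs` times and return the fastest wall-clock time observed.
/// Returns `None` when `runs` is zero.
fn min_wall_time<F: FnMut()>(runs: usize, mut f: F) -> Option<Duration> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed()
        })
        .min()
}

/// Time a binary such as `./driver` (path taken from the transcript above).
/// Returns an error if the binary cannot be started.
fn time_binary(path: &str, runs: usize) -> std::io::Result<Option<Duration>> {
    // Probe once so a missing binary is reported instead of being timed.
    Command::new(path).stdout(Stdio::null()).status()?;
    Ok(min_wall_time(runs, || {
        // Errors are ignored here because the probe above already succeeded.
        let _ = Command::new(path).stdout(Stdio::null()).status();
    }))
}
```

Calling `time_binary("./driver", 20)` and `time_binary("./expr2", 20)` would give two minimum timings to compare. The minimum is used here because it is the least affected by scheduling noise; this measures process spawn plus run time, like the shell's `time`.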

Size of the binary

LFortran's LLVM-generated binary is 33K, the C binary above is currently 9.9M, and the Rust-generated one is 3.3M:

$ ll -h expr2
-rwxr-xr-x  1 ondrej  staff    33K Sep 28 18:53 expr2
$ ll -h driver
-rwxr-xr-x  1 ondrej  staff   9.9M Sep 28 18:57 driver
$ ll -h ./target/release/cwasm-standalone
-rwxr-xr-x  1 ondrej  staff   3.3M Sep 28 18:58 ./target/release/cwasm-standalone

That's still 100x to 300x too much, and probably explains why it takes so long to link.

The wasmtime-generated native .cwasm file is 66K, which is only 2x bigger than the full expr2 binary and seems reasonable:

$ ll -h src/guest.cwasm 
-rw-r--r--  1 ondrej  staff    66K Sep 28 18:34 src/guest.cwasm

If I compare it to our wasm_x64 backend, we get only 8.3K, which is about 8x smaller than the .cwasm:

$ ll -h expr2 
-rw-r--r--  1 ondrej  staff   8.3K Sep 28 20:02 expr2
$ file expr2
expr2: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, no section header

Presumably it is thus possible to improve the compilation of .wasm into .cwasm to create an 8x smaller binary.

The "driver" then adds 3 to 10 MB of extra stuff, but presumably, if we create our own driver, it should be possible to get this overhead down to almost nothing.
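
The ratios quoted above can be checked directly from the `ll -h` listings. A small worked sketch, assuming `K` and `M` in the listings mean KiB and MiB; the constant names and `ratio` are hypothetical:

```rust
/// Sizes from the `ll -h` listings above, in KiB (assuming 1M = 1024K).
const EXPR2_LLVM_KIB: f64 = 33.0; // LFortran LLVM binary
const DRIVER_C_KIB: f64 = 9.9 * 1024.0; // clang-linked driver
const RUST_BIN_KIB: f64 = 3.3 * 1024.0; // cwasm-standalone
const GUEST_CWASM_KIB: f64 = 66.0; // wasmtime-compiled guest
const EXPR2_WASM_X64_KIB: f64 = 8.3; // LFortran wasm_x64 backend binary

/// How many times larger `a` is than `b`.
fn ratio(a: f64, b: f64) -> f64 {
    a / b
}
```

With these numbers, `ratio(RUST_BIN_KIB, EXPR2_LLVM_KIB)` is about 102, `ratio(DRIVER_C_KIB, EXPR2_LLVM_KIB)` is about 307, `ratio(GUEST_CWASM_KIB, EXPR2_LLVM_KIB)` is 2, and `ratio(GUEST_CWASM_KIB, EXPR2_WASM_X64_KIB)` is about 8.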

Owner

dicej commented Sep 29, 2023

Thanks, @certik. I'll be away from my computer until Tuesday, but I'll be sure to take a look at this then.

Owner

@dicej dicej left a comment

This looks reasonable to me.

I'm not sure why the clang/bin2c combination results in a much larger binary than rustc/include_bytes. Perhaps rustc does more aggressive link-time optimization by default in release mode? You might be able to do the same by passing e.g. -flto to clang at the appropriate places.

Alternatively, since your script assumes that clang and a native linker are available, perhaps it is reasonable to assume rustc is available and use that instead, given it's just as easy to install (in my experience)?

}

fn run_internal(guest: &[u8]) -> Result<()> {
    let options = Options::parse();
Owner

This probably won't do anything useful, since the Rust std::env::args won't have been initialized when using this as a library from C. You could update run_rust_wasm_native_binary to accept CLI parameters and pass them to https://docs.rs/clap/latest/clap/trait.Parser.html#method.try_parse_from. Or you could just delete all the options parsing and WASI init code if you don't care about giving the guest access to host environment variables and directories.
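
One way to follow the first suggestion is to have the C caller forward `argc`/`argv` and convert them on the Rust side. A minimal sketch using only the standard library; `args_from_c` is a hypothetical helper, and the actual signature of `run_rust_wasm_native_binary` in this PR may differ:

```rust
use std::ffi::{c_char, c_int, CStr};

/// Convert a C-style `argc`/`argv` pair into owned Rust strings.
///
/// # Safety
/// `argv` must point to at least `argc` valid, NUL-terminated C strings.
unsafe fn args_from_c(argc: c_int, argv: *const *const c_char) -> Vec<String> {
    // A negative argc is treated as zero, so argv is never dereferenced then.
    (0..argc.max(0) as usize)
        .map(|i| {
            // SAFETY: the caller guarantees argv[i] is a valid C string.
            let arg = unsafe { CStr::from_ptr(*argv.add(i)) };
            // Invalid UTF-8 is replaced rather than rejected in this sketch.
            arg.to_string_lossy().into_owned()
        })
        .collect()
}
```

The resulting `Vec<String>` could then be handed to clap's `try_parse_from` (for example `Options::try_parse_from(args)`, assuming `Options` derives `Parser`) in place of `Options::parse()`.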

Author

@certik certik Oct 3, 2023

Ah I see. Actually I do care about that, although I think we have not implemented that yet in our WASM backend. This poses an interesting problem --- in our LLVM backend we allow access to the whole file system. Here we have a nice option to actually restrict it, but it's not clear to me right now what the default should be.
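
The question of the default can be made explicit as a small policy type. The sketch below is purely illustrative: `FsPolicy` and `allows` are hypothetical names, this is not wasmtime's WASI API, and the check compares paths lexically without resolving symlinks or `..` components.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical description of what the guest may access on the host.
enum FsPolicy {
    /// Sandboxed: no host file system access at all.
    Nothing,
    /// Only the listed directories (and everything below them).
    Dirs(Vec<PathBuf>),
    /// Everything, matching the behavior described for the LLVM backend.
    Everything,
}

impl FsPolicy {
    /// Lexical check only: does not resolve symlinks or `..` components.
    fn allows(&self, path: &Path) -> bool {
        match self {
            FsPolicy::Nothing => false,
            FsPolicy::Dirs(dirs) => dirs.iter().any(|d| path.starts_with(d)),
            FsPolicy::Everything => true,
        }
    }
}
```

Choosing the default then amounts to choosing which variant the driver constructs when the user passes no options: `Everything` would match the LLVM backend, while `Nothing` or `Dirs` would keep the sandbox.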


[build-dependencies]
anyhow = "1.0.75"
reqwest = { version = "0.11.20", features = ["blocking"] }
Owner

I don't think reqwest is used anymore, so you could remove this.

Author

certik commented Oct 3, 2023

@dicej thanks for the review. Given that even with Rust the binary is over 100x bigger than what we generate with our direct WASM->x86 backend (we don't have an ARM backend yet), and given how slow both the Rust build and the C link step are compared to our own binary generation, I think we probably need to keep our own backend.

We'll use this Rust approach as a "reference implementation". We could keep both the Rust and Clang versions, since both are useful as a reference, or just use Rust, which is a lot easier to handle and is certainly not a problem for a reference implementation. We should probably still add it to LFortran/LPython as an option: it is slower to compile and produces a larger binary, but it gives you something that always works, even on platforms that we do not currently support via WASM, like ARM, and it also gives you sandboxing, which is very nice and which some users might want.

I am very happy that we can now generate binaries using "official tools", and this cwasm-standalone is very simple custom code that we can treat as official. That way we can ensure our tooling stays compatible with the WASM tooling, test it in our CI, and give users the option to use either one.

Would it make sense to add this feature to wasm-tools, as a command that takes a .wasm file and generates a native binary?

If so, that might be the best place to maintain this, and LFortran/LPython can just call it. Otherwise we'll need to figure out how LFortran could do it. I want a user interface like lfortran --backend=cwasm-standalone a.f90 that does all this automatically and spits out a binary.

Owner

dicej commented Oct 3, 2023

I agree that it would be useful to have a tool that converts a .wasm command to a native binary, although I suspect it would make more sense as part of wasmtime than as part of wasm-tools. @alexcrichton @cfallin @sunfishcode all had some interesting ideas on Zulip about how that might work. Perhaps it would be appropriate to reopen bytecodealliance/wasmtime#4563?

Author

certik commented Oct 4, 2023

Looks like I already "liked" the issue bytecodealliance/wasmtime#4563. We investigated the standalone binary earlier this year in lcompilers/lpython#1461, but I am happy I asked on Zulip: thanks to you we now have a proof of concept that seems to work, and I have a much better understanding.

Yes, wasmtime would be perfect for this.
