A proof of concept to create standalone C library #3
Signed-off-by: Joel Dice <[email protected]>
Thanks, @certik. I'll be away from my computer until Tuesday, but I'll be sure to take a look at this then.
This looks reasonable to me.
I'm not sure why the clang/bin2c combination results in a much larger binary than rustc/include_bytes. Perhaps rustc does more aggressive link-time optimization by default in release mode? You might be able to do the same by passing e.g. -flto to clang at the appropriate places.
Alternatively, since your script assumes that clang and a native linker are available, perhaps it is reasonable to assume rustc is available and use that instead, given it's just as easy to install (in my experience)?
}

fn run_internal(guest: &[u8]) -> Result<()> {
    let options = Options::parse();
This probably won't do anything useful, since the Rust std::env::args won't have been initialized when using this as a library from C. You could update run_rust_wasm_native_binary to accept CLI parameters and pass them to https://docs.rs/clap/latest/clap/trait.Parser.html#method.try_parse_from. Or you could just delete all the options parsing and WASI init code if you don't care about giving the guest access to host environment variables and directories.
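A minimal sketch of the first suggestion, using only the standard library: the C caller passes its own `argc`/`argv`, and the Rust side turns them into a `Vec<String>` that could then be handed to clap's `try_parse_from` instead of calling `Options::parse()`. The names `collect_c_args` and `run_guest_with_args` are hypothetical and do not appear in the PR; the export attribute is left out, and the return value is only a placeholder.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Convert a C-style `argc`/`argv` pair into owned Rust strings.
///
/// # Safety
/// `argv` must either be null or point to at least `argc` pointers, each of
/// which is null or a valid NUL-terminated string.
pub unsafe fn collect_c_args(argc: c_int, argv: *const *const c_char) -> Vec<String> {
    let mut args = Vec::new();
    if argv.is_null() {
        return args;
    }
    for i in 0..argc.max(0) as usize {
        // SAFETY: the caller guarantees `argv` holds `argc` readable pointers.
        let ptr = unsafe { *argv.add(i) };
        if ptr.is_null() {
            break;
        }
        // SAFETY: the caller guarantees each non-null entry is NUL-terminated.
        let arg = unsafe { CStr::from_ptr(ptr) };
        // Non-UTF-8 bytes are replaced rather than rejected (an assumption).
        args.push(arg.to_string_lossy().into_owned());
    }
    args
}

/// Hypothetical C-facing entry point (illustrative only, not the PR's API).
/// It returns the number of arguments seen, purely as a placeholder.
///
/// # Safety
/// Same requirements as `collect_c_args`.
pub unsafe extern "C" fn run_guest_with_args(argc: c_int, argv: *const *const c_char) -> c_int {
    let args = unsafe { collect_c_args(argc, argv) };
    // In the real crate, `args` would be passed to clap at this point,
    // e.g. `Options::try_parse_from(args)`, instead of `Options::parse()`.
    args.len() as c_int
}
```

Passing the arguments explicitly keeps the library independent of how the host process was started, which is the concern raised in the comment above.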
Ah I see. Actually I do care about that, although I think we have not implemented that yet in our WASM backend. This poses an interesting problem --- in our LLVM backend we allow access to the whole file system. Here we have a nice option to actually restrict it, but it's not clear to me right now what the default should be.
[build-dependencies]
anyhow = "1.0.75"
reqwest = { version = "0.11.20", features = ["blocking"] }
I don't think reqwest is used anymore, so you could remove this.
@dicej thanks for the review.

Given that even with Rust the binary is 100x bigger than what we generate with our direct WASM->x86 backend (we don't have an ARM backend yet), and given how slow both Rust and the C linker are compared to our own binary generation, I think we probably need to keep our own backend and use this Rust approach as a "reference implementation". We could keep both the Rust and Clang versions, since both are useful as a reference, or we can just use Rust, which is a lot easier to handle and is certainly not a problem for a reference implementation.

We should probably still add it to LFortran/LPython as an option: it is slower to compile and produces a larger binary, but it gives you something that always works, even on platforms that we do not currently support via WASM, like ARM, and it also gives you sandboxing, which is very nice and which some users might want.

I am very happy that we can now generate binaries using "official tools", and this

Would it make sense to add this feature into

If so, that might be the best place to maintain this, and LFortran/LPython can just call it. Otherwise we'll need to figure out how LFortran could do it. I want a user interface like
I agree that it would be useful to have a tool that converts a .wasm command to a native binary, although I suspect it would make more sense as part of
Looks like I already "liked" the issue bytecodealliance/wasmtime#4563. It looks like we investigated the standalone binary earlier this year in lcompilers/lpython#1461, but I am happy I asked on Zulip: thanks to you we now have a proof of concept that seems to work, and I have a much better understanding. Yes, wasmtime would be perfect for this.
This contains all kinds of hardwired details, but it creates a standalone .a library that I then link with C code in a clean manner. It creates a 9MB binary and it works:

The linking benchmark at the very end takes 0.38s, which is not too bad, but it is still huge. Ideally this would link almost instantly, given how small this example is.
Timing of the binary
A typical timing for LFortran's LLVM generated binary is about 6ms:
While for the above binary (both the C and Rust generated) it's usually around 9ms:
That's usable.
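For anyone who wants to reproduce timings like these, here is a small sketch of one way to measure them with the standard library only. The helper name `median_runtime` is hypothetical, and taking the median of several runs is an assumption on my side, not necessarily how the numbers above were obtained.

```rust
use std::time::{Duration, Instant};

/// Run `f` `runs` times and return the median wall-clock duration.
/// The median is used so that a single slow run does not skew the result.
pub fn median_runtime<F: FnMut()>(runs: usize, mut f: F) -> Duration {
    assert!(runs > 0, "at least one run is required");
    let mut samples: Vec<Duration> = (0..runs)
        .map(|_| {
            let start = Instant::now();
            f();
            start.elapsed()
        })
        .collect();
    samples.sort();
    // For an even number of runs this picks the upper of the two middle samples.
    samples[samples.len() / 2]
}
```

To time a binary, the closure could spawn it, for example `median_runtime(20, || { let _ = std::process::Command::new("./expr2").status(); })`, where `./expr2` is a placeholder path. Process start-up cost is included in such a measurement, which matters at the millisecond scale discussed here.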
Size of the binary
LFortran's LLVM-generated binary is 33K, the C binary above is currently 10MB, and the Rust-generated one is 3MB:
That's still 100x too much, and probably explains why it takes so long to link.
The size of the wasmtime-generated native binary .cwasm file is 66K, which is only 2x bigger than the full expr2 binary, which seems reasonable:

If I compare it to our wasm_x64 backend, we get only 8.3K, which is about 8x smaller than .cwasm:

Presumably it is thus possible to improve the compilation of .wasm into .cwasm to create an 8x smaller binary.

The "driver" then adds 3 to 9 MB of extra stuff, but again, presumably if we create our own driver it should be possible to get this extra overhead down to almost nothing.
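As a quick check of the ratios quoted above, the arithmetic can be written out as below. The sizes are the approximate figures from this description; treating "K" as KiB and "MB" as MiB is an assumption, and the constant and function names are made up for this sketch.

```rust
/// Approximate sizes quoted in this description, in KiB.
const LLVM_BINARY_KIB: f64 = 33.0;
const RUST_DRIVER_BINARY_KIB: f64 = 3.0 * 1024.0;
const C_DRIVER_BINARY_KIB: f64 = 10.0 * 1024.0;
const CWASM_KIB: f64 = 66.0;
const WASM_X64_KIB: f64 = 8.3;

/// How many times larger `size` is than `baseline`.
pub fn ratio(size: f64, baseline: f64) -> f64 {
    size / baseline
}

/// Labelled size ratios for the comparisons made in the text.
pub fn size_report() -> Vec<(&'static str, f64)> {
    vec![
        ("Rust driver binary vs LLVM binary", ratio(RUST_DRIVER_BINARY_KIB, LLVM_BINARY_KIB)),
        ("C driver binary vs LLVM binary", ratio(C_DRIVER_BINARY_KIB, LLVM_BINARY_KIB)),
        (".cwasm vs LLVM binary", ratio(CWASM_KIB, LLVM_BINARY_KIB)),
        (".cwasm vs wasm_x64 output", ratio(CWASM_KIB, WASM_X64_KIB)),
    ]
}
```

With these inputs the Rust-generated binary comes out roughly 90x larger than the LLVM one and the C one roughly 300x larger, while the .cwasm file is 2x the LLVM binary and about 8x the wasm_x64 output, so the "100x" figure matches the Rust case.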