Building for size #51

Open
stellaraccident opened this issue Nov 26, 2024 · 1 comment

Comments

@stellaraccident

Adding tokenizers-cpp to our project made the binary size go up from a (stripped) baseline of 1.8MB to 8.4MB in a release build on Linux x86_64. This was just with the stable Rust toolchain and all defaults.

I'm no Rust expert, but I applied a number of the options (the ones I could get to work) from this page https://github.com/johnthagen/min-sized-rust and was able to trivially get the binary size down to 4.2MB. Given that Rust links statically by default and there is a lot of data-manipulation standard library code, this doesn't completely shock me (but it is still quite large compared to our baseline).

Recording here the things I quickly tried to achieve that:

  1. set(TOKENIZERS_CPP_RUST_FLAGS "-Zlocation-detail=none") in CMakeLists.txt (feature request: consider making these more configurable from the including project)
  2. Use the build-std approach listed above by adding this to the cargo command line in CMakeLists: -Z build-std=std,panic_abort -Z build-std-features="optimize_for_size" (and using a nightly toolchain)
  3. Add to Cargo.toml:

     [profile.release]
     lto = true
     opt-level = "z"  # Optimize for size.
     codegen-units = 1
     panic = "abort"

I wasn't being super principled, but iirc 1 and 2 combined shaved off ~500KB or so. LTO and build-std gave 2-3MB and the rest filled in the extra.

It would be good to be able to customize these things easily in the including project. Might call for some additional CMake goo and such.
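
To make the request concrete, here is a rough sketch of one shape the "CMake goo" could take, not a description of how tokenizers-cpp is wired today. It assumes the library would declare its flag variables as cache entries, so that values set earlier by the including project (or via -D on the command line) are left alone. TOKENIZERS_CPP_RUST_FLAGS is the variable mentioned above; TOKENIZERS_CPP_EXTRA_CARGO_ARGS and the third_party/tokenizers-cpp path are hypothetical names made up for illustration.

```cmake
# --- In the including project's CMakeLists.txt ---
# Set the cache entries BEFORE add_subdirectory() so the library's
# own defaults (declared without FORCE) do not replace them.
set(TOKENIZERS_CPP_RUST_FLAGS "-Zlocation-detail=none" CACHE STRING "")

# Hypothetical variable: extra arguments for the cargo command line.
# Stored as a CMake list so each element becomes one argument.
set(TOKENIZERS_CPP_EXTRA_CARGO_ARGS
    "-Z;build-std=std,panic_abort;-Z;build-std-features=optimize_for_size"
    CACHE STRING "")

# Path is an assumption about where the dependency is vendored.
add_subdirectory(third_party/tokenizers-cpp)

# --- Hypothetical change inside tokenizers-cpp's CMakeLists.txt ---
# set(... CACHE ...) without FORCE keeps a value that already exists
# in the cache, so these only supply defaults.
set(TOKENIZERS_CPP_RUST_FLAGS "" CACHE STRING
    "Extra flags passed to rustc via RUSTFLAGS")
set(TOKENIZERS_CPP_EXTRA_CARGO_ARGS "" CACHE STRING
    "Extra arguments appended to the cargo build command")
```

The [profile.release] settings from item 3 live in Cargo.toml rather than CMake, so they would need a separate mechanism; this sketch only covers items 1 and 2, both of which still require a nightly toolchain.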

@tqchen
Contributor

tqchen commented Nov 26, 2024

Indeed, trimming down the size could be helpful. Would love to get a PR to add these options.
