rustc: Enable -f{function,data}-sections #13833
A comment by @thestinger on #13829
Can you elaborate? I was unaware of a performance cost from splitting into sections. Is there data to support this (benchmarks I can run to analyze what happens)?
It's pointed out in the GCC documentation, although now I'm unsure if they mean code generation + linking is slower or the compiled binary: http://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Optimize-Options.html
…nger

The compiler has previously been producing binaries on the order of 1.8MB for hello world programs "fn main() {}". This is largely a result of the compilation model used by compiling entire libraries into a single object file and because static linking is favored by default.

When linking, linkers will pull in the entire contents of an object file if any symbol from the object file is used. This means that if any symbol from a rust library is used, the entire library is pulled in unconditionally, regardless of how much of the library is actually used.

Traditional C/C++ projects do not normally encounter these large executable problems because their archives (rust's rlibs) are composed of many objects. Because of this, linkers can eliminate entire objects from being in the final executable. With rustc, however, the linker does not have the opportunity to leave out entire object files.

In order to get similar benefits from dead code stripping at link time, this commit enables the -ffunction-sections and -fdata-sections flags in LLVM, as well as passing --gc-sections to the linker *by default*. This means that each function and each global will be placed into its own section, allowing the linker to GC all unused functions and data symbols.

By enabling these flags, rust is able to generate much smaller binaries by default. On linux, a hello world binary went from 1.8MB to 597K (a 67% reduction in size). The output size of dynamic libraries remained constant, but the output size of rlibs increased, as seen below:

libarena       -  2.27% bigger
libcollections -  0.64% bigger
libflate       -  0.85% bigger
libfourcc      - 14.67% bigger
libgetopts     -  4.52% bigger
libglob        -  2.74% bigger
libgreen       -  9.68% bigger
libhexfloat    - 13.68% bigger
liblibc        - 10.79% bigger
liblog         - 10.95% bigger
libnative      -  8.34% bigger
libnum         -  2.31% bigger
librand        -  1.71% bigger
libregex       -  6.43% bigger
librustc       -  4.21% bigger
librustdoc     -  8.98% bigger
librustuv      -  4.11% bigger
libsemver      -  2.68% bigger
libserialize   -  1.92% bigger
libstd         -  3.59% bigger
libsync        -  3.96% bigger
libsyntax      -  4.96% bigger
libterm        - 13.96% bigger
libtest        -  6.03% bigger
libtime        -  2.86% bigger
liburl         -  6.59% bigger
libuuid        -  4.70% bigger
libworkcache   -  8.44% bigger

This increase in size is a result of encoding many more section names into each object file (rlib). These increases are moderate enough that this change seems worthwhile to me, due to the drastic improvements seen in the final artifacts. The stage2 target folder (not the size of an install) grew from 337MB to 348MB (a 3% increase).

Additionally, linking is generally slower when executed with all these new sections plus the --gc-sections flag. The stage0 compiler takes 1.4s to link the `rustc` binary, where the stage1 compiler takes 1.9s to link the binary. Three megabytes are shaved off the binary. I found this increase in link time to be acceptable relative to the benefits of code size gained.

This commit only enables --gc-sections for *executables*, not dynamic libraries. LLVM does all the heavy lifting when producing an object file for a dynamic library, so there is little else for the linker to do (remember that we only have one object file).

I conducted similar experiments by putting a *module's* functions and data symbols into its own section (granularity moved to a module level instead of a function/static level). The size benefits of a hello world were seen to be on the order of 400K rather than 1.2MB. It seemed that enough benefit was gained using -ffunction-sections that this route was less desirable, despite the lesser increases in binary rlib size.
@alexcrichton relevant previous discussion? #12140
@bharrisau, good point! I should have read over that before reopening this. I may have measured the size increase incorrectly last time, it's definitely much less drastic now than it was before. The link time for hello world does increase, on linux it jumps from 25ms to 75ms (I forgot to measure this earlier). It's still under 100ms, however, which is the main goal that I was shooting for before.
The compiler has previously been producing binaries on the order of 1.8MB for hello world programs "fn main() {}". This is largely a result of the compilation model used by compiling entire libraries into a single object file and because static linking is favored by default.

When linking, linkers will pull in the entire contents of an object file if any symbol from the object file is used. This means that if any symbol from a rust library is used, the entire library is pulled in unconditionally, regardless of how much of the library is actually used.

Traditional C/C++ projects do not normally encounter these large executable problems because their archives (rust's rlibs) are composed of many objects. Because of this, linkers can eliminate entire objects from being in the final executable. With rustc, however, the linker does not have the opportunity to leave out entire object files.

In order to get similar benefits from dead code stripping at link time, this commit enables the -ffunction-sections and -fdata-sections flags in LLVM, as well as passing --gc-sections to the linker *by default*. This means that each function and each global will be placed into its own section, allowing the linker to GC all unused functions and data symbols.

By enabling these flags, rust is able to generate much smaller binaries by default. On linux, a hello world binary went from 1.8MB to 597K (a 67% reduction in size). The output size of dynamic libraries remained constant, but the output size of rlibs increased, as seen below:

libarena       -  2.27% bigger (   292872 =>    299508)
libcollections -  0.64% bigger (  6765884 =>   6809076)
libflate       -  0.83% bigger (   186516 =>    188060)
libfourcc      - 14.71% bigger (   307290 =>    352498)
libgetopts     -  4.42% bigger (   761468 =>    795102)
libglob        -  2.73% bigger (   899932 =>    924542)
libgreen       -  9.63% bigger (  1281718 =>   1405124)
libhexfloat    - 13.88% bigger (   333738 =>    380060)
liblibc        - 10.79% bigger (   551280 =>    610736)
liblog         - 10.93% bigger (   218208 =>    242060)
libnative      -  8.26% bigger (  1362096 =>   1474658)
libnum         -  2.34% bigger (  2583400 =>   2643916)
librand        -  1.72% bigger (  1608684 =>   1636394)
libregex       -  6.50% bigger (  1747768 =>   1861398)
librustc       -  4.21% bigger (151820192 => 158218924)
librustdoc     -  8.96% bigger ( 13142604 =>  14320544)
librustuv      -  4.13% bigger (  4366896 =>   4547304)
libsemver      -  2.66% bigger (   396166 =>    406686)
libserialize   -  1.91% bigger (  6878396 =>   7009822)
libstd         -  3.59% bigger ( 39485286 =>  40902218)
libsync        -  3.95% bigger (  1386390 =>   1441204)
libsyntax      -  4.96% bigger ( 35757202 =>  37530798)
libterm        - 13.99% bigger (   924580 =>   1053902)
libtest        -  6.04% bigger (  2455720 =>   2604092)
libtime        -  2.84% bigger (  1075708 =>   1106242)
liburl         -  6.53% bigger (   590458 =>    629004)
libuuid        -  4.63% bigger (   326350 =>    341466)
libworkcache   -  8.45% bigger (  1230702 =>   1334750)

This increase in size is a result of encoding many more section names into each object file (rlib). These increases are moderate enough that this change seems worthwhile to me, due to the drastic improvements seen in the final artifacts. The stage2 target folder (not the size of an install) grew from 337MB to 348MB (a 3% increase).

Additionally, linking is generally slower when executed with all these new sections plus the --gc-sections flag. The stage0 compiler takes 1.4s to link the `rustc` binary, where the stage1 compiler takes 1.9s to link the binary. Three megabytes are shaved off the binary. I found this increase in link time to be acceptable relative to the benefits of code size gained.

This commit only enables --gc-sections for *executables*, not dynamic libraries. LLVM does all the heavy lifting when producing an object file for a dynamic library, so there is little else for the linker to do (remember that we only have one object file).

I conducted similar experiments by putting a *module's* functions and data symbols into its own section (granularity moved to a module level instead of a function/static level). The size benefits of a hello world were seen to be on the order of 400K rather than 1.2MB. It seemed that enough benefit was gained using -ffunction-sections that this route was less desirable, despite the lesser increases in binary rlib size.
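
For anyone wanting to re-check the per-library percentages listed above against the before/after byte counts, here is a small sketch. The function name `percent_increase` is made up for illustration and is not part of rustc or this commit; the three byte-count pairs are copied from the list above.

```rust
/// Percentage growth from `before` to `after`, both given in bytes.
/// (Hypothetical helper, used only to re-derive the figures quoted above.)
fn percent_increase(before: u64, after: u64) -> f64 {
    (after as f64 - before as f64) / before as f64 * 100.0
}

fn main() {
    // (name, size before, size after), taken from the rlib list above.
    let rlibs = [
        ("libarena", 292_872u64, 299_508u64),
        ("libfourcc", 307_290, 352_498),
        ("libstd", 39_485_286, 40_902_218),
    ];
    for (name, before, after) in rlibs {
        println!("{name:<12} {:>6.2}% bigger", percent_increase(before, after));
    }
}
```

The same function applies to any other row, given its two byte counts.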
The compiler has previously been producing binaries on the order of 1.8MB for
hello world programs "fn main() {}". This is largely a result of the compilation
model, which compiles each library into a single object file, and of static
linking being favored by default.
When linking, linkers will pull in the entire contents of an object file if any
symbol from the object file is used. This means that if any symbol from a rust
library is used, the entire library is pulled in unconditionally, regardless of
how much of the library is actually used.
Traditional C/C++ projects do not normally encounter these large executable
problems because their archives (rust's rlibs) are composed of many objects.
Because of this, linkers can eliminate entire objects from being in the final
executable. With rustc, however, the linker does not have the opportunity to
leave out entire object files.
In order to get similar benefits from dead code stripping at link time, this
commit enables the -ffunction-sections and -fdata-sections flags in LLVM, as
well as passing --gc-sections to the linker by default. This means that each
function and each global will be placed into its own section, allowing the
linker to GC all unused functions and data symbols.
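
To make the effect of per-function sections concrete, here is a toy model of what a section garbage collector does. It is only a sketch under simplifying assumptions, not the linker's actual algorithm: the `Section` type, the section names, and the sizes are all invented for illustration. Sections reachable from the root (`main`) are kept; everything else is dropped.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified model of one linker input section.
struct Section {
    name: &'static str,
    size: usize,
    /// Names of other sections this one references (calls or data uses).
    refs: Vec<&'static str>,
}

/// Marks every section reachable from `roots` and returns the kept names.
fn gc_sections(sections: &[Section], roots: &[&'static str]) -> HashSet<&'static str> {
    let by_name: HashMap<&str, &Section> = sections.iter().map(|s| (s.name, s)).collect();
    let mut kept: HashSet<&'static str> = HashSet::new();
    let mut worklist: Vec<&'static str> = roots.to_vec();
    while let Some(name) = worklist.pop() {
        // Ignore references to sections that are not in the input.
        let Some(section) = by_name.get(name) else { continue };
        // Only walk a section's references the first time it is marked.
        if kept.insert(section.name) {
            worklist.extend(section.refs.iter().copied());
        }
    }
    kept
}

/// Total size of the sections that survive collection.
fn linked_size(sections: &[Section], kept: &HashSet<&'static str>) -> usize {
    sections.iter().filter(|s| kept.contains(s.name)).map(|s| s.size).sum()
}

/// Invented example input: one "library" object split into per-item sections.
fn example_sections() -> Vec<Section> {
    vec![
        Section { name: "main", size: 100, refs: vec!["println"] },
        Section { name: "println", size: 400, refs: vec!["write"] },
        Section { name: "write", size: 300, refs: vec![] },
        Section { name: "regex_compile", size: 5000, refs: vec!["regex_parse"] },
        Section { name: "regex_parse", size: 3000, refs: vec![] },
        Section { name: "unused_table", size: 2000, refs: vec![] },
    ]
}

fn main() {
    let sections = example_sections();
    // Without per-item sections the object is all-or-nothing.
    let whole_object: usize = sections.iter().map(|s| s.size).sum();
    let kept = gc_sections(&sections, &["main"]);
    println!("whole object: {} bytes", whole_object);
    println!("after gc:     {} bytes", linked_size(&sections, &kept));
}
```

In this model, the single-object case corresponds to keeping every section as soon as one symbol is used, which is the situation described for rlibs above.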
By enabling these flags, rust is able to generate much smaller binaries by default.
On linux, a hello world binary went from 1.8MB to 597K (a 67% reduction in
size). The output size of dynamic libraries remained constant, but the output
size of rlibs increased, as shown by the per-library figures in the commit
message above.
This increase in size is a result of encoding many more section names into each
object file (rlib). These increases are moderate enough that this change seems
worthwhile to me, due to the drastic improvements seen in the final artifacts.
The stage2 target folder (not the size of an install) grew from 337MB to 348MB
(a 3% increase).
Additionally, linking is generally slower when executed with all these new
sections plus the --gc-sections flag. The stage0 compiler takes 1.4s to link the
`rustc` binary, where the stage1 compiler takes 1.9s to link the binary. Three
megabytes are shaved off the binary. I found this increase in link time to be
acceptable relative to the benefits of code size gained.
This commit only enables --gc-sections for executables, not dynamic libraries.
LLVM does all the heavy lifting when producing an object file for a dynamic
library, so there is little else for the linker to do (remember that we only
have one object file).
I conducted similar experiments by putting a module's functions and data
symbols into its own section (granularity moved to a module level instead of a
function/static level). The size benefits of a hello world were seen to be on
the order of 400K rather than 1.2MB. It seemed that enough benefit was gained
using -ffunction-sections that this route was less desirable, despite the lesser
increases in binary rlib size.