Feature request: consider how to better support -split-sections #3467

Open
duog opened this issue Oct 3, 2017 · 18 comments

Comments

@duog

duog commented Oct 3, 2017

-split-sections is a ghc flag new in 8.2.1:
https://downloads.haskell.org/~ghc/8.2.1/docs/html/users_guide/phases.html?highlight=split-sections#ghc-flag--split-sections

Unfortunately it is a bit of a pain to use this with stack because it requires one to compile all of one's dependencies with -split-sections.

I have created a stack.yaml and build script for stack itself here:
https://github.com/duog/stack/tree/split-sections

which builds stack from an empty stack root both with and without -split-sections. If one examines the build.out file in that repository one observes that there is very little difference in build time, but the resulting executable is about half the size with -split-sections. It's not demonstrated here, but the .a library files are much bigger.

As I'm sure you're aware, it is a bit difficult to ensure that all the libraries in one's snapshot are built with certain ghc-options. Is there anything that can be done to make this easier?

@mgsloan
Contributor

mgsloan commented Oct 4, 2017

This sounds like the sort of thing where there isn't much downside. May make sense to enable -split-sections by default.

As I'm sure you're aware, it is a bit difficult to ensure that all the libraries in one's snapshot are built with certain ghc-options. Is there anything that can be done to make this easier?

This is the thing that bugs me most about today's stack. Making this possible is actually quite an undertaking. #1265 describes a solution, and part of what's described there has been implemented. Though that issue is closed, we still hope to do something about this at some point; see #3330. It'd be great if someone wanted to take on this project.

@duog
Author

duog commented Oct 4, 2017

I've no doubt that enabling -split-sections by default is the right long-term solution, but it's a new feature and you'd need a way to reliably turn it off.

As devil's advocate, the downsides I'm aware of are:

  • Not supported on Windows in 8.2, and (currently) causes very slow linking in 8.4. See trac:12913, trac:13939.
  • Seems to be the culprit in trac:14291; I believe this only affects statically linked ghcs.
  • A bad default if you're producing dynamically linked executables: the .a archives are bloated for no benefit.

Though I've no doubt you've thought about it, it seems to me that the cabal-install new-* approach is the right one; a single package database with all the different flavours of packages. You could even have a per-snapshot database, though I don't see much upside to that. The current interface for specifying ghc-options in stack.yaml is great, it's just the interaction with already installed packages that causes me problems.

@mgsloan
Contributor

mgsloan commented Oct 4, 2017

I haven't had much reason to use the cabal-install new-* stuff. If I understand it correctly, one consequence is that you can't really load dependencies into ghci without fully specifying which package ids ghci should use. With stack's approach, all of the packages that have been built for a particular snapshot / local db are available and consistent.

Once implemented, implicit snapshots will provide the benefits of both approaches. It may have a little bit of overhead in creating new package DBs, but that seems to be really quick. So, I think it is better to avoid the "everything in one DB" approach.

@duog
Author

duog commented Oct 8, 2017

Yes that's a good point re ghci. I think I was assuming stack would provide me a sub-database with only the packages from my stack.yaml.

I admit that I don't really follow exactly what an implicit snapshot would be, but I suppose it's something close to the sub-database I was thinking of.

The primary benefit of a single DB is in maximizing sharing between projects. Hopefully implicit snapshots will provide this.

@mgsloan
Contributor

mgsloan commented Oct 8, 2017

The primary benefit of a single DB is in maximizing sharing between projects, hopefully implicit snapshots will provide this.

Yup, this can also be provided by multiple DBs. Stack already has package sharing between snapshot DBs. If some other snapshot DB already has the package (and it has the same dependencies), then it will just get registered in the other DB rather than rebuilt.

What implicit snapshots add to this is the possibility of also sharing extra-deps, and possibly even git dependencies. More importantly, in my mind, they would allow full deterministic control over the options used to build all dependencies.
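As a rough illustration of the sharing idea described above (a toy model, not how Stack is actually implemented), a built package can be identified by a key derived from its name, version, build options, and the keys of its dependencies. `BuildPlan`, `cache_key`, and `SharedStore` are hypothetical names invented for this sketch, as are the package names `lib-a` and `lib-b`.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildPlan:
    """Hypothetical description of how one package is to be built."""
    name: str
    version: str
    ghc_options: tuple = ()
    deps: tuple = ()  # BuildPlan values of the direct dependencies


def cache_key(plan):
    """Hash everything that can change the build output in this toy model."""
    h = hashlib.sha256()
    h.update(f"{plan.name}-{plan.version}".encode())
    for opt in plan.ghc_options:
        h.update(b"\0opt:" + opt.encode())
    # Dependency keys are sorted so the result does not depend on listing order.
    for dep_key in sorted(cache_key(dep) for dep in plan.deps):
        h.update(b"\0dep:" + dep_key.encode())
    return h.hexdigest()


class SharedStore:
    """Builds each distinct key once; later requests reuse the stored result."""

    def __init__(self):
        self.built = {}       # cache key -> label of the built package
        self.build_count = 0  # number of (simulated) real builds

    def ensure(self, plan):
        for dep in plan.deps:
            self.ensure(dep)
        key = cache_key(plan)
        if key not in self.built:
            self.build_count += 1
            self.built[key] = f"{plan.name}-{plan.version}"
        return key


if __name__ == "__main__":
    store = SharedStore()
    a = BuildPlan("lib-a", "1.0")
    b = BuildPlan("lib-b", "1.0", deps=(a,))
    store.ensure(b)  # builds lib-a and lib-b
    store.ensure(b)  # same keys, nothing is rebuilt

    opts = ("-split-sections",)
    a_split = BuildPlan("lib-a", "1.0", ghc_options=opts)
    b_split = BuildPlan("lib-b", "1.0", ghc_options=opts, deps=(a_split,))
    store.ensure(b_split)  # different options, so both are built again
    print(store.build_count)
```

In this model, changing the options of any dependency changes the key of everything that depends on it, which is why a flag like -split-sections ends up forcing a second copy of each affected package rather than silently reusing the old one.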

@domenkozar
Contributor

Seems to be default anyway since GHC 8.2.x on linux/darwin: https://ghc.haskell.org/trac/ghc/ticket/11445

@fosskers
Contributor

fosskers commented Jul 5, 2018

Possibly relevant data. Binary sizes when compiling Aura with various resolvers "as-is":

  • lts-11.16 (ghc 8.2.2) -> 28.6mb
  • nightly-2018-07-04 (ghc 8.4.3) -> 25.7mb

Naively adding -split-sections to the ghc-options: section of my package.yaml doesn't seem to affect anything.

Using the strip tool on the executables also doesn't seem to have an effect.

@Berengal

Berengal commented Jul 4, 2019

I fell into this rabbit-hole of executable-size optimization today, and gathered some more data.

I tested three stack projects:

  • hbfc, my own project
  • stack itself
  • aura, because it was tested previously.

On every project I ran stack build, checked the size of the executables in .stack-work/dist/..., ran strip on them and compared the size of the stripped executable to the one in .stack-work/install/... to make sure they matched (they did in all cases). I then added "$everything": -split-sections to the ghc-options: section in the stack.yaml file, adding the section if it wasn't there already, and repeated the same build and check.

The result:

executable  -split-sections  strip  size
hbfc        no               no     20M
hbfc        no               yes    11M
hbfc        yes              no     6,6M
hbfc        yes              yes    3,6M
hbfi        no               no     3,3M
hbfi        no               yes    903K
hbfi        yes              no     3,3M
hbfi        yes              yes    903K
stack       no               no     99M
stack       no               yes    65M
stack       yes              no     51M
stack       yes              yes    33M
aura        no               no     44M
aura        no               yes    28M
aura        yes              no     14M
aura        yes              yes    8,4M

Adding -split-sections to everything seems to make a big difference. The one exception is the hbfi executable in hbfc. I assume the reason is that it only uses base and the hbfc library, that base is already built with -split-sections, and that the library in the same package either builds with -split-sections automatically, or GHC optimizes away the unused parts anyway.
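To put rough percentages on the table above, here is a small sketch that compares the stripped executables with and without -split-sections. The sizes are copied from the table, so the results are only as precise as those rounded figures. The helper names are made up for this example, and treating K and M as binary units (1024-based) is an assumption.

```python
# Sizes copied from the table: (split_sections, stripped) -> reported size.
SIZES = {
    "hbfc": {("no", "no"): "20M", ("no", "yes"): "11M",
             ("yes", "no"): "6,6M", ("yes", "yes"): "3,6M"},
    "hbfi": {("no", "no"): "3,3M", ("no", "yes"): "903K",
             ("yes", "no"): "3,3M", ("yes", "yes"): "903K"},
    "stack": {("no", "no"): "99M", ("no", "yes"): "65M",
              ("yes", "no"): "51M", ("yes", "yes"): "33M"},
    "aura": {("no", "no"): "44M", ("no", "yes"): "28M",
             ("yes", "no"): "14M", ("yes", "yes"): "8,4M"},
}

# Assumption: binary units, as printed by tools such as `ls -h`.
UNITS = {"K": 1024, "M": 1024 ** 2}


def to_bytes(text):
    """Parse a size such as '6,6M' or '903K' (comma as decimal separator)."""
    number, unit = text[:-1], text[-1]
    return float(number.replace(",", ".")) * UNITS[unit]


def reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (1.0 - to_bytes(after) / to_bytes(before))


def split_sections_savings(sizes=SIZES):
    """Savings from -split-sections on stripped executables, per executable."""
    return {
        name: reduction(rows[("no", "yes")], rows[("yes", "yes")])
        for name, rows in sizes.items()
    }


if __name__ == "__main__":
    for name, pct in split_sections_savings().items():
        print(f"{name}: {pct:.0f}% smaller with -split-sections (stripped)")
```

Comparing stripped against stripped isolates the effect of the flag from the effect of strip; the same `reduction` helper can be pointed at any other pair of rows.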

@fosskers
Contributor

fosskers commented Jul 4, 2019

Nice! See also: NixOS/nixpkgs#43795

@sluukkonen

As a data point, enabling split sections reduced the size of the binary from 50MB to 14MB in one of my personal projects.

@fosskers
Contributor

fosskers commented Jul 21, 2020

@Berengal I'm going to test that myself with Aura right now. If that's trivially possible, then we don't need to consider Nix at all.

@sluukkonen

@fosskers Yes, I simply added

ghc-options:
  $everything: -split-sections

to my stack.yaml.

@fosskers
Contributor

@sluukkonen Thanks, I'm going to try that myself.

@fosskers
Contributor

Yup, 8.8mb! I'm going to go ahead with this approach and make all my releases this way. Thank you!

@fosskers
Contributor

How about a stack build --release flag that forces $everything: -split-sections?

@sluukkonen

I'm not sure if that would be ideal.

Split sections requires all dependencies to be built with it, so having a separate flag would mean that dependencies would be compiled twice.

Perhaps stack could switch them on by default at some point, but I'm not sure what the tradeoffs are. Compile times would be slower, at the very least.

@fosskers
Contributor

That would be the point of --release. Rust's cargo has such a feature, and it does indeed recompile everything.

@dpwiz

dpwiz commented Mar 16, 2021

Enabling global -split-sections like this breaks some packages with custom Setup (lens (fixed) and pretty-simple).

Windows builds of vulkan (and singletons (fixed)) are broken with a "too many sections" error (this works in cabal-install).
