"malformed mach-o: load commands size" with a multi-package cabal.project #5220

phadej opened this issue Mar 19, 2018 · 18 comments

@phadej
Collaborator

phadej commented Mar 19, 2018

We have a project with over 50 packages, and we have run into the infamous OSX linker problem.

In the otool -l dump of some dylib there is an entry like

Load command 21
          cmd LC_LOAD_DYLIB
      cmdsize 80
         name @rpath/libHSdpndnt-mp-0.2.4.0-b3458e24-ghc8.2.2.dylib (offset 24)
   time stamp 2 Thu Jan  1 02:00:02 1970
      current version 0.0.0

for each dependency, which is fine.

Local library names aren't munged

Load command 45
          cmd LC_LOAD_DYLIB
      cmdsize 72
         name @rpath/libHSenv-config-0-inplace-ghc8.2.2.dylib (offset 24)
   time stamp 2 Thu Jan  1 02:00:02 1970
      current version 0.0.0
compatibility version 0.0.0

Local package names aren't munged: we have "long" names such as futurice-this, futurice-that, servant-algebraic-graphs ...

Each local library's dylib is in its own directory

While there is a single LC_RPATH for the store:

Load command 330
          cmd LC_RPATH
      cmdsize 56
         path /Users/toku/.cabal/store/ghc-8.2.2/lib (offset 12)

There is one per local dependency:

Load command 320
          cmd LC_RPATH
      cmdsize 96
         path /Users/toku/hmr/dist-newstyle/build/x86_64-osx/ghc-8.2.2/log-cloudwatch-0/build (offset 12)
Load command 321
          cmd LC_RPATH
      cmdsize 88
         path /Users/toku/hmr/dist-newstyle/build/x86_64-osx/ghc-8.2.2/periocron-0/build (offset 12)
...
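
The cmdsize values in these dumps follow from the string lengths, which is why long build paths and long library names eat into the limit. A minimal sketch, assuming (as the numbers in the dumps suggest) that each command occupies its fixed header, the string plus a terminating NUL, rounded up to a multiple of 8 bytes. The helper name load_cmd_size is hypothetical, and the header size is taken from the "(offset N)" value that otool prints:

```shell
# Estimate the cmdsize of a load command that carries one string
# (LC_RPATH or LC_LOAD_DYLIB).
# Assumption: size = header + string + NUL, rounded up to a multiple of 8.
# $1 = header size, i.e. the "(offset N)" value otool prints
# $2 = the path or library name stored in the command
load_cmd_size() {
  raw=$(( $1 + ${#2} + 1 ))
  echo $(( (raw + 7) / 8 * 8 ))
}

load_cmd_size 12 '/Users/toku/.cabal/store/ghc-8.2.2/lib'           # prints 56
load_cmd_size 12 '/Users/toku/hmr/dist-newstyle/build/x86_64-osx/ghc-8.2.2/log-cloudwatch-0/build'  # prints 96
load_cmd_size 24 '@rpath/libHSenv-config-0-inplace-ghc8.2.2.dylib'  # prints 72
```

Under that assumption, every local dependency costs one LC_RPATH of roughly 90 bytes on top of its LC_LOAD_DYLIB entry, so the per-package build directories add up quickly.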

I'd say that fixing the second part, i.e. having a single libdir in dist-newstyle (on OSX?), would solve this problem too. For now it prevents us from splitting a big "industrial size" repository into smaller packages (we want to have separate packages, as they're easier to manage).

We could use internal libraries to work around this, but the problem would still persist for cases like https://github.com/phadej/acme-kmett (I haven't tried to compile it on OSX).

otool-hyperloglog.txt

The above dump is for the hyperloglog .dylib from acme-kmett. It has 111 load commands; a small-dependency app in our repo has 332, and the biggest (which overflows the 32k limit) has 420.
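
To see how close a given library is to the limit, the cmdsize fields of an otool -l dump can be added up. A rough sketch, assuming the dump format shown in this issue; sum_cmdsize is a hypothetical helper name, and the loader may count a little more than this sum (for example the Mach-O header itself), so treat the result as an estimate:

```shell
# Sum every "cmdsize N" line of an `otool -l` dump read from stdin.
sum_cmdsize() {
  awk '$1 == "cmdsize" { total += $2 } END { print total + 0 }'
}

# Typical use on macOS (the path is a placeholder):
#   otool -l path/to/some-library.dylib | sum_cmdsize
```

A total approaching 32768 means the library is about to hit the 32k limit.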

cc @christiaanb

@angerman
Collaborator

Lovely... so that's the boundary where the library-name munging hits its limits. So acme-kmett should be a good test case? I'll try building that. Let's see what happens.

@phadej
Collaborator Author

phadej commented Mar 19, 2018

@angerman acme-kmett doesn't fail, it's just a repo with enough "deps" to highlight the problem.

Maybe if you pull all deps of yesod locally into a single cabal.project, it will actually fail. I don't have an OSX machine myself, so I cannot try that.

@angerman
Collaborator

Just learned so myself. Alright, I’ll try yesod.

I vaguely remember having the "everything in the same lib" idea implemented somewhere, but it would break package relocatability. I’m however more inclined to see if we can just link only the direct dependencies in GHC and circumvent the limit that way; we might still need to symlink(?) all libs into a common folder in cabal.

I’ll give

@cartazio
Contributor

so the new-build version of the linker sadness of OSX has finally happened?

@cartazio
Contributor

What about having a lib symlinks directory in the dist-newstyle folder for new builds? Then we can get really short relative paths, and they should still be safe in the face of relocations.

Let's ignore the issue with the lack of symlinks on Windows for this :))))
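
A rough sketch of what such a symlink directory could look like. This is not something cabal does; link_dylibs and both directory arguments are hypothetical, and the sketch assumes the build root is given as an absolute path so the link targets resolve. GHC or cabal would additionally have to emit a single LC_RPATH pointing at the link directory, which is outside this sketch:

```shell
# Collect all dylibs found under a build tree as symlinks in one directory,
# so that a single rpath entry could cover every local library.
# $1 = build root (absolute path), $2 = directory to hold the symlinks
link_dylibs() {
  build_root=$1
  link_dir=$2
  mkdir -p "$link_dir"
  # -type f skips symlinks, so re-running does not pick up earlier links
  find "$build_root" -type f -name '*.dylib' | while IFS= read -r lib; do
    ln -sf "$lib" "$link_dir/$(basename "$lib")"
  done
}

# Example (paths are placeholders):
#   link_dylibs "$PWD/dist-newstyle/build" "$PWD/dist-newstyle/l"
```

The shorter the link directory's path, the smaller the one remaining LC_RPATH entry.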

@angerman
Collaborator

angerman commented Apr 1, 2018

Alright, I got a reproduction case:

$ brew install stack
$ stack new test-project yesod-sqlite
$ cd test-project && cabal new-build # this won't fail, but will populate the package database appropriately.
$ ghc-pkg --package-db ~/.cabal/store/ghc-8.4.1/package.db dot|tred > graph.dot
$ awk -F\  '{ print $1 }' graph.dot|grep \" >> pkgs
$ awk -F\  '{ print $3 }' graph.dot|grep \"|sed s/\"\;/\"/g >> pkgs

Now pkgs will contain all packages from the package database.

$ echo "packages: ." > cabal.project
$ cat pkgs|sort -u|awk -F\" '{ print "          "$2 }' >> cabal.project

This will create the cabal.project file containing all the libs. It might need a tiny bit of touch-up...

And now...

$ cat cabal.project|grep -v "^packages"|xargs cabal unpack
$ cabal new-build

will likely error out with:

ghc: panic! (the 'impossible' happened)
  (GHC version 8.4.1 for x86_64-apple-darwin):
	Loading temp shared object failed: dlopen(/var/folders/fv/xqjrpfj516n5xq_m_ljpsjx00000gn/T/ghc53894_0/libghc_19.dylib, 5): no suitable image found.  Did find:
	/var/folders/fv/xqjrpfj516n5xq_m_ljpsjx00000gn/T/ghc53894_0/libghc_19.dylib: malformed mach-o: load commands size (34104) > 32768

Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

@angerman
Collaborator

angerman commented Apr 1, 2018

So I see that #4656 fixed this only for the libraries in $HOME/.cabal/store, not for those in the local dist-newstyle. That also explains why you need to specifically have them all in your project to trigger the issue.

@jship

jship commented Apr 18, 2018

At work, we have worked around the mach-o "load commands size" issue by following in the Nix folks' footsteps and using a wrapper script around ld to recursively subdivide the dependencies into a tree of re-exporting delegate libraries.

The script, an example, and more info about this issue are available here:
https://github.com/Simspace/ld-wrapper-macos
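
The grouping step behind that idea can be pictured as follows. This is only an illustrative sketch, not the ld-wrapper-macos script itself: chunk_deps is a hypothetical helper that splits a list of library names (one per line, assumed to contain no whitespace or quotes) into groups of at most N. Each output line would become one re-exporting delegate library, and applying the helper again to the delegates' own names gives the next level of the tree.

```shell
# Split names read from stdin into lines of at most $1 names each.
# With no command given, xargs runs echo on each group.
chunk_deps() {
  xargs -n "$1"
}

printf '%s\n' libA libB libC libD libE | chunk_deps 2
# prints:
#   libA libB
#   libC libD
#   libE
```

The final binary then only needs load commands for the top-level delegates instead of one per dependency.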

@23Skidoo
Member

23Skidoo commented Jun 8, 2018

@jship's suggestion is probably the best way forward. PRs welcome.

@angerman
Collaborator

angerman commented Jun 9, 2018

Note that this is fixed in GHC 8.6+ (https://phabricator.haskell.org/D4714) and backported in nixpkgs for at least 8.4 and 8.2.

This can also be fixed in the tooling, I believe, as mafia did here: haskell-mafia/mafia#226

@angerman
Collaborator

angerman commented Jun 9, 2018

Also note that once you start hitting the dylib issues, you will soon hit the "realgcc.exe: CreateProcess: No such file or directory" error when building on Windows, which is technically a gcc bug. See https://phabricator.haskell.org/D4762 for a rather crude hack around it.

Notably, once you hit that on Windows, you might eventually hit it on macOS, and likely quite a bit later on Linux as well, as your project's transitive dependency closure grows.

@23Skidoo
Member

23Skidoo commented Jun 9, 2018

Mafia's fix looks quite simple, anyone willing to implement it in Cabal to fix the issue for GHC < 8.6?

@dfithian

At work, we have run into this issue ("malformed mach-o: load commands size (34376) > 32768") using Cabal 3.4.0.0 and GHC 8.10.4, on our packages with the largest number of dependencies. We use https://github.com/Simspace/ld-wrapper-macos to work around it in the short term. We are currently switching from Stack to Cabal. Stack does not have this issue.

I saw this comment in this PR: #7094 (comment)

We are wondering if this issue is still being tracked or planned to be worked on in GHC 9 and above.

@Mikolaj
Member

Mikolaj commented Jun 25, 2021

Which issue?

#7339 is definitely being worked on.

@dfithian

I'm wondering if #5220 (this issue, as it occurs in the current release of Cabal/GHC) is known about or being worked on.

I'm not familiar enough with rpath to know if it would help, but it seems like path length is a factor in this bug, so that's why I asked.

@Mikolaj
Member

Mikolaj commented Jun 25, 2021

I'm not familiar with the story, but reading the comments I see that #5220 is already closed for GHC >= 8.6. I don't think it's being worked on any more and I actually think we should close it. Am I misreading the comments?

@alt-romes
Collaborator

alt-romes commented Nov 14, 2023

I was unable to reproduce this using @angerman's methodology in pandoc.
Can someone still reproduce this reliably? @dfithian


8 participants