[haskell-updates] Packages fail with a linker error on darwin #152859

Closed
veprbl opened this issue Dec 31, 2021 · 10 comments
Labels
0.kind: bug Something is broken 6.topic: darwin Running or building packages on Darwin 6.topic: haskell

Comments

@veprbl
Member

veprbl commented Dec 31, 2021

Several Haskell packages fail with

ld: in /nix/store/dm5wmj1k92jxs6w78548x648nn5gyfds-http-client-tls-0.3.5.3/lib/ghc-8.10.7/x86_64-osx-ghc-8.10.7/http-client-tls-0.3.5.3-3jOEYa4lgKaExeUduTGzPs/libHShttp-client-tls-0.3.5.3-3jOEYa4lgKaExeUduTGzPs.a(TLS.o), in section __TEXT,__text reloc 505: X86_64_RELOC_UNSIGNED following a X86_64_RELOC_SUBTRACTOR must have same r_length
clang-11: error: linker command failed with exit code 1 (use -v to see invocation)
`cc' failed in phase `Linker'. (Exit code: 1)

examples:

Bisecting points to a3b6030 as the first bad commit

Also see #152408 (comment)

@veprbl veprbl added 0.kind: bug Something is broken 6.topic: haskell 6.topic: darwin Running or building packages on Darwin labels Dec 31, 2021
@veprbl
Member Author

veprbl commented Dec 31, 2021

The TLS.o file in the recent haskellPackages.http-client-tls also can't be read with objdump -r. The only difference between the good and bad versions is:

# nix-diff $(nix-store -q --deriver $(readlink good)) $(nix-store -q --deriver $(readlink bad))
- /nix/store/fi3ryfka2k040rs948qpprmwybhk0kzp-http-client-tls-0.3.5.3.drv:{out}
+ /nix/store/fn69csahirqjbgl0zk7d6b2sir72kxjc-http-client-tls-0.3.5.3.drv:{out}
• The input derivation named `http-client-0.6.4.1` differs
  - /nix/store/hwf8lzw2nw1rm0rmv0h4c8szv6czdqdc-http-client-0.6.4.1.drv:{out}
  + /nix/store/kai2akm955g9w96bjbd8xxa0794pmg5k-http-client-0.6.4.1.drv:{out}
  • The input derivation named `blaze-builder-0.4.2.2` differs
    - /nix/store/crjmsaz2251mrggi15i51mbl1gmkna06-blaze-builder-0.4.2.2.drv:{out}
    + /nix/store/ga7h5xbhgycsk1845744ry516qsrqnwl-blaze-builder-0.4.2.2.drv:{out}
    • The set of input derivation names do not match:
        + blaze-builder-0.4.2.2-r1.cabal
    • The environments do not match:
        prePatch=''
        echo "Replace Cabal file with edited version from mirror://hackage/blaze-builder-0.4.2.2/revision/1.cabal."
        cp /nix/store/7vh4ic8s5qlw8fg6yry1i0x0frz4h5a0-blaze-builder-0.4.2.2-r1.cabal blaze-builder.cabal
    ''
  • Skipping environment comparison
• Skipping environment comparison

That difference, a Hackage revision of the blaze-builder Cabal file, doesn't correspond to anything meaningful:

--- /tmp/blaze-builder.cabal    2021-12-30 22:24:19.000000000 -0500
+++ /nix/store/7vh4ic8s5qlw8fg6yry1i0x0frz4h5a0-blaze-builder-0.4.2.2-r1.cabal  1969-12-31 19:00:01.000000000 -0500
@@ -1,5 +1,6 @@
 Name:                blaze-builder
 Version:             0.4.2.2
+x-revision: 1
 Synopsis:            Efficient buffered output.
 
 Description:
@@ -33,7 +34,7 @@
 Cabal-version:       >= 1.10
 
 Tested-with:
-  GHC == 9.2.0.20210821
+  GHC == 9.2.1
   GHC == 9.0.1
   GHC == 8.10.7
   GHC == 8.8.4
@@ -85,7 +86,7 @@
     , bytestring >= 0.9 && < 1.0
     , deepseq
     , ghc-prim
-    , text >= 0.10 && < 1.3
+    , text >= 0.10 && < 2.1
 
   if impl(ghc < 7.8)
      build-depends:  bytestring-builder

Looking at the object files I can't see anything special about either.
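
One way to look for unreadable objects across a whole store path is to run objdump -r over every static archive and flag the ones it rejects. The following is only a sketch: the function names check_archive and scan_store_path and the OBJDUMP override are hypothetical, and it assumes that objdump exits non-zero when it cannot parse an archive member, which may not hold for every kind of corruption.

```shell
# Sketch: flag static archives that objdump cannot read.
# OBJDUMP is a hypothetical override so a different binary
# (e.g. llvm-objdump) can be substituted.

# Print "ok: FILE" or "BAD: FILE"; return 1 for unreadable archives.
check_archive() {
  if "${OBJDUMP:-objdump}" -r "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "BAD: $1"
    return 1
  fi
}

# Check every *.a file below the given directory.
# Returns non-zero if at least one archive was flagged.
scan_store_path() {
  find "$1" -name '*.a' | {
    status=0
    while IFS= read -r archive; do
      check_archive "$archive" || status=1
    done
    [ "$status" -eq 0 ]
  }
}
```

It could be called as scan_store_path /nix/store/dm5wmj1k92jxs6w78548x648nn5gyfds-http-client-tls-0.3.5.3 to list the archives in that output that objdump refuses to read.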

@veprbl
Member Author

veprbl commented Dec 31, 2021

I've rebuilt the /nix/store/dm5wmj1k92jxs6w78548x648nn5gyfds-http-client-tls-0.3.5.3 locally and now it works. Must be some sort of corruption. Can somebody with Hydra rights, please, restart the build for https://hydra.nixos.org/job/nixpkgs/haskell-updates/haskellPackages.http-client-tls.x86_64-darwin ?
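
A local rebuild of an output that is already valid can be requested with nix-build --check, which rebuilds the derivation and compares the result with the existing store path. A minimal sketch follows; the wrapper name recheck and the NIX_BUILD override are hypothetical, and the output has to be present locally (built or substituted) before --check can run. Because GHC output is not guaranteed to be bit-for-bit reproducible, a reported difference alone would not prove corruption.

```shell
# Sketch: rebuild an attribute from <nixpkgs> even though its output
# already exists, and let Nix compare the new result with the old one.
# NIX_BUILD is a hypothetical override for the nix-build binary.
recheck() {
  # $1: attribute path, e.g. haskellPackages.http-client-tls
  "${NIX_BUILD:-nix-build}" '<nixpkgs>' -A "$1" --check
}
```

For the package discussed here the call would be recheck haskellPackages.http-client-tls, run against a nixpkgs checkout of the haskell-updates branch.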

@veprbl veprbl added 6.topic: hydra.nixos.org Issues affecting the build cache at hydra.nixos.org and removed 0.kind: bug Something is broken labels Dec 31, 2021
@lockejan
Contributor

lockejan commented Jan 2, 2022

Hm, maybe there is a more general bug, because the same goes for haskell-language-server on darwin.

[1 of 4] Compiling Ide.Plugin.Example ( plugins/default/src/Ide/Plugin/Example.hs, dist/build/haskell-language-server/haskell-language-server-tmp/Ide/Plugin/Example.o )
[2 of 4] Compiling Ide.Plugin.Example2 ( plugins/default/src/Ide/Plugin/Example2.hs, dist/build/haskell-language-server/haskell-language-server-tmp/Ide/Plugin/Example2.o )
[3 of 4] Compiling Plugins          ( exe/Plugins.hs, dist/build/haskell-language-server/haskell-language-server-tmp/Plugins.o )
[4 of 4] Compiling Main             ( exe/Main.hs, dist/build/haskell-language-server/haskell-language-server-tmp/Main.o )
Linking dist/build/haskell-language-server/haskell-language-server ...
ld: in /nix/store/hq1wbhqdry0rib3filcxa8x243pcpg0p-hls-splice-plugin-1.0.0.6/lib/ghc-8.8.4/x86_64-osx-ghc-8.8.4/hls-splice-plugin-1.0.0.6-AXEoOJEV5eK4sh5zVcb0Ss/libHShls-splice-plugin-1.0.0.6-AXEoOJEV5eK4sh5zVcb0Ss.a(Splice.o), malformed nlist string offset
clang-11: error: linker command failed with exit code 1 (use -v to see invocation)
`cc' failed in phase `Linker'. (Exit code: 1)
error: builder for '/nix/store/l57k0n5i41yvxryjqpbhx153c35dsm9w-haskell-language-server-1.5.1.0.drv' failed with exit code 1;
       last 10 log lines:
       > Preprocessing executable 'haskell-language-server' for haskell-language-server-1.5.1.0..
       > Building executable 'haskell-language-server' for haskell-language-server-1.5.1.0..
       > [1 of 4] Compiling Ide.Plugin.Example ( plugins/default/src/Ide/Plugin/Example.hs, dist/build/haskell-language-server/haskell-language-server-tmp/Ide/Plugin/Example.o )
       > [2 of 4] Compiling Ide.Plugin.Example2 ( plugins/default/src/Ide/Plugin/Example2.hs, dist/build/haskell-language-server/haskell-language-server-tmp/Ide/Plugin/Example2.o )
       > [3 of 4] Compiling Plugins          ( exe/Plugins.hs, dist/build/haskell-language-server/haskell-language-server-tmp/Plugins.o )
       > [4 of 4] Compiling Main             ( exe/Main.hs, dist/build/haskell-language-server/haskell-language-server-tmp/Main.o )
       > Linking dist/build/haskell-language-server/haskell-language-server ...
       > ld: in /nix/store/hq1wbhqdry0rib3filcxa8x243pcpg0p-hls-splice-plugin-1.0.0.6/lib/ghc-8.8.4/x86_64-osx-ghc-8.8.4/hls-splice-plugin-1.0.0.6-AXEoOJEV5eK4sh5zVcb0Ss/libHShls-splice-plugin-1.0.0.6-AXEoOJEV5eK4sh5zVcb0Ss.a(Splice.o), malformed nlist string offset
       > clang-11: error: linker command failed with exit code 1 (use -v to see invocation)
       > `cc' failed in phase `Linker'. (Exit code: 1)

@sternenseemann
Member

This seems to check out. There was an error about .hi file corruption on haskell-updates at some point, which went away after restarting all failed builds, but it's not out of the question that some kind of corruption persisted in a build that succeeded.

@sternenseemann
Member

> I've rebuilt the /nix/store/dm5wmj1k92jxs6w78548x648nn5gyfds-http-client-tls-0.3.5.3 locally and now it works. Must be some sort of corruption. Can somebody with Hydra rights, please, restart the build for https://hydra.nixos.org/job/nixpkgs/haskell-updates/haskellPackages.http-client-tls.x86_64-darwin ?

Restarting that job won't actually rebuild it, since the output path is already in the store / the binary cache… We either need to wait for a natural rebuild or inject some kind of change into the relevant drvs.

@grahamc
Member

grahamc commented Jan 3, 2022

How can we identify the corruption and either fix or fail the build? Or, is there a bug in GHC? Let's explore these options instead of restarting successful builds.

@sternenseemann
Member

I can muster neither the time nor energy to do so. Since the build was successful, everyone should be able to substitute the corrupted store path from cache.nixos.org.

I would also not rule out that this was an issue with builders, as there was other weird stuff happening like derivations failing in

echo "build input $pkg does not exist" >&2

@sternenseemann
Member

sternenseemann commented Jan 3, 2022

I would also not rule out that this was an issue with builders, as there was other weird stuff happening like derivations failing in

E.g. https://hydra.nixos.org/build/162562802/nixlog/2:

build input N does not exist
builder for '/nix/store/rgjx3dwfnwdhjrvr44avv0fyrmivs9ja-git-brunch-1.5.1.0.drv' failed with exit code 1

I wonder if something went wrong with the filesystem backing the nix store?
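
On a machine that holds the suspect paths, on-disk damage can be probed with nix-store --verify-path, which compares the contents of the given store paths with the hashes recorded in the Nix database. This is a sketch under assumptions: the wrapper name verify_paths and the NIX_STORE override are hypothetical. It would not catch an object that was already corrupt when the build finished, since the recorded hash would then describe the corrupt contents.

```shell
# Sketch: compare store paths against the hashes in the Nix database.
# nix-store exits non-zero if any of the given paths has changed.
# NIX_STORE is a hypothetical override for the nix-store binary.
verify_paths() {
  "${NIX_STORE:-nix-store}" --verify-path "$@"
}
```

A call such as verify_paths /nix/store/hq1wbhqdry0rib3filcxa8x243pcpg0p-hls-splice-plugin-1.0.0.6 would report whether that path still matches its registered hash on the local machine.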

@sternenseemann
Member

@veprbl Thanks for looking into this, I was able to confirm your findings and cause a rebuild of the relevant derivations by changing their derivation hash.

@sternenseemann
Member

Note that this issue is not really solved; we just worked around the worst instances. We are continuing to see new corruption-related problems in darwin builds.

@tomodachi94 tomodachi94 added 0.kind: bug Something is broken and removed 6.topic: hydra.nixos.org Issues affecting the build cache at hydra.nixos.org labels Nov 4, 2024