buildEnv: better warning and error messages #134215

Merged: 1 commit, Aug 19, 2021


ggPeti
Member

@ggPeti ggPeti commented Aug 15, 2021

Motivation for this change

Provide detailed error messages for buildEnv failures. Also, warn the user about dangling symlinks instead of dying with a confusing error message. Warning about dangling symlinks is also more in line with Nix's buildenv.cc.
Obviates #82685.
Elucidates LnL7/nix-darwin#320.

Things done

@roberth
Member

roberth commented Aug 15, 2021

Could you add a test in pkgs/build-support/buildenv and add it to pkgs/test/default.nix?

@ggPeti
Member Author

ggPeti commented Aug 15, 2021

@roberth not the scope of this PR

@roberth
Member

roberth commented Aug 15, 2021

Sure, if you don't mind it breaking in the future. Don't you already have an example? I'm not asking you to write a big test suite; just a test case.

@ggPeti
Member Author

ggPeti commented Aug 15, 2021

The change is a strict relaxation of builds; this follows logically from the refactored conditional expressions. Example: currently the zulu16 package contains a dangling symlink, share/man. Whenever you put zulu16 in a buildEnv, the build fails with this confusing error message:

> collision between `/nix/store/66zbyxkmj7bw7cd47r5p81hjyy0vhxc6-zulu16.30.15-ca-jdk-16.0.1/share/man' and `'

After the change, you get a warning instead:

> skipping dangling symlink `/nix/store/66zbyxkmj7bw7cd47r5p81hjyy0vhxc6-zulu16.30.15-ca-jdk-16.0.1/share/man' -> `zulu-16.jdk/Contents/Home/man'
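
For readers who want to check a package output for such links themselves, here is a minimal shell sketch. It is only an illustration: `is_dangling` is a hypothetical helper name, not something provided by buildEnv or builder.pl, and the demo uses a throwaway directory rather than a real store path.

```shell
#!/bin/sh
# Hypothetical helper (not part of buildEnv): succeeds when $1 is a symlink
# whose target cannot be resolved.
is_dangling() {
  # -L is true for any symlink; -e follows the link and fails if the target is missing.
  [ -L "$1" ] && [ ! -e "$1" ]
}

# Demonstration in a throwaway directory.
tmp=$(mktemp -d)
ln -s zulu-16.jdk/Contents/Home/man "$tmp/man"   # relative target that does not exist here
ln -s . "$tmp/ok"                                # target exists (the directory itself)

if is_dangling "$tmp/man"; then
  echo "dangling: $tmp/man -> $(readlink "$tmp/man")"
fi
if ! is_dangling "$tmp/ok"; then
  echo "fine: $tmp/ok"
fi
```

The two tests are deliberately separate: `-e` alone cannot tell a missing path from a broken link, and `-L` alone is true for healthy links as well.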

@ggPeti
Member Author

ggPeti commented Aug 15, 2021

What does everyone think about relaxing the handling of dangling symlinks? It could be argued that in some cases it defers failure and therefore breaks a fail-early expectation. But my guess is that it fixes more failing builds than it prolongs. In any case, I can be convinced otherwise. Either way, clearer error messages are sorely needed.

@roberth
Member

roberth commented Aug 15, 2021

Right, I assumed you had a valid use case. Just improving the error message seems like a better solution if you were able to fix your input.

@symphorien
Member

> not the scope of this PR

I wrote a test for you, and it's failing:

diff --git a/pkgs/build-support/buildenv/test.nix b/pkgs/build-support/buildenv/test.nix
new file mode 100644
index 00000000000..ea445fe7278
--- /dev/null
+++ b/pkgs/build-support/buildenv/test.nix
@@ -0,0 +1,18 @@
+{ buildEnv, runCommand }: let
+  danglingSymlink = runCommand "danglingSymlink" {} ''
+    mkdir $out
+    ln -s /does/not/exist $out/bin
+  '';
+  collidingDir = runCommand "collidingDir" {} ''
+    mkdir -p $out/bin
+    echo foo > $out/bin/foo
+  '';
+  result = buildEnv {
+    name = "regression-for-issue-82685";
+    paths = [ danglingSymlink collidingDir ];
+  };
+in
+  runCommand "check" {} ''
+    [ "$(cat ${result}/bin/foo)" = foo ]
+  ''
+
diff --git a/pkgs/test/default.nix b/pkgs/test/default.nix
index ebf732839ce..5189ff8e1fb 100644
--- a/pkgs/test/default.nix
+++ b/pkgs/test/default.nix
@@ -55,4 +55,6 @@ with pkgs;
   trivial-overriding = callPackage ../build-support/trivial-builders/test-overriding.nix {};
 
   writers = callPackage ../build-support/writers/test.nix {};
+
+  buildEnv = callPackage ../build-support/buildenv/test.nix {};
 }

Output:

$  nix-build . -A tests.buildEnv 
these derivations will be built:
  /nix/store/0nx0c906m3rbnmz751n30v470s6kxxss-builder.pl.drv
  /nix/store/2wvhvqsyc2vws0nibyb7yb6iaign7w8m-danglingSymlink.drv
  /nix/store/fb9mivym0fbr83ysncfylm8y3mbwadp9-collidingDir.drv
  /nix/store/43m12l9fz6mvdpfwg7gn3xc62r8ndffp-regression-for-issue-82685.drv
  /nix/store/ylkdjx2yi7hrd85lzmngj4nzrbb84i61-check.drv
building '/nix/store/0nx0c906m3rbnmz751n30v470s6kxxss-builder.pl.drv'...
building '/nix/store/fb9mivym0fbr83ysncfylm8y3mbwadp9-collidingDir.drv'...
building '/nix/store/2wvhvqsyc2vws0nibyb7yb6iaign7w8m-danglingSymlink.drv'...
building '/nix/store/43m12l9fz6mvdpfwg7gn3xc62r8ndffp-regression-for-issue-82685.drv'...
Use of uninitialized value in string eq at /nix/store/4j9jcksbmi1gxyxz98w7gziwvgj3fjdl-builder.pl line 133.
Use of uninitialized value $stat1 in numeric ne (!=) at /nix/store/4j9jcksbmi1gxyxz98w7gziwvgj3fjdl-builder.pl line 94.
Use of uninitialized value $stat1 in bitwise and (&) at /nix/store/4j9jcksbmi1gxyxz98w7gziwvgj3fjdl-builder.pl line 95.
different permissions in `/nix/store/v1ynhg68mnq7ki775ayzcjr4ww9fv461-danglingSymlink/bin' and `/nix/store/785a8sscyv7i42mhrp6yidkin86nvg12-collidingDir/bin': 0000 <-> 0555 at /nix/store/4j9jcksbmi1gxyxz98w7gziwvgj3fjdl-builder.pl line 95.
collision between `/nix/store/785a8sscyv7i42mhrp6yidkin86nvg12-collidingDir/bin' and `/nix/store/v1ynhg68mnq7ki775ayzcjr4ww9fv461-danglingSymlink/bin'
builder for '/nix/store/43m12l9fz6mvdpfwg7gn3xc62r8ndffp-regression-for-issue-82685.drv' failed with exit code 2
cannot build derivation '/nix/store/ylkdjx2yi7hrd85lzmngj4nzrbb84i61-check.drv': 1 dependencies couldn't be built
error: build of '/nix/store/ylkdjx2yi7hrd85lzmngj4nzrbb84i61-check.drv' failed

It really looks like #82685 so I suspect something is off.

@ofborg ofborg bot added the "8.has: package (new)" label Aug 16, 2021
@ggPeti
Member Author

ggPeti commented Aug 16, 2021

Thanks for the test @symphorien, I included it in the PR. The error was that the dangling symlink check only happened if there was a colliding file before it in the env. I fixed it so that the check and skip happen first.

This results in two changes to previous buildEnv behavior:

  1. Dangling symlinks won't get created in the built env, even if they don't collide with anything. A warning will be emitted, though. This has a remote chance of breaking some builds, but I find it hard to imagine that there should be builds that rely on the existence of broken symlinks.
  2. Collisions with dangling symlinks won't result in a build failure. This is guaranteed not to break any builds and is proven to fix some, but it might make some failing builds take longer to fail.

Are these acceptable changes? Are there any other concerns regarding this PR?

@roberth
Member

roberth commented Aug 16, 2021

> 1. but I find it hard to imagine that there should be builds that rely on the existence of broken symlinks.

@infinisil wrote this, though: NixOS/nix#4790, which is for the builtin builder used by nix-env.

A symlink can become valid on a system when it has an absolute path. I am not aware of an instance of this.

@ggPeti
Member Author

ggPeti commented Aug 16, 2021

@roberth that's indeed relevant here. The unpleasant thing about symlinks that dangle during merging is that they cannot be merged like directories. I think the greatest common divisor in this case is creating dangling symlinks with a warning, and failing on collision with a clear error message.
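
The policy described here (create a dangling symlink with a warning, fail clearly when it collides with something) can be sketched in shell. This is an illustration under stated assumptions, not the PR's implementation: the real logic lives in builder.pl, which is Perl. `link_into` is a hypothetical function, the message wording is only approximate, and directory merging is left out entirely.

```shell
#!/bin/sh
# Hypothetical sketch of the policy; NOT the code from builder.pl.
# link_into SRC DEST: link SRC at DEST inside the environment being built.
link_into() {
  src=$1
  dest=$2
  # Case 1: a dangling symlink already sits at DEST. It cannot be merged, so fail clearly.
  if [ -L "$dest" ] && [ ! -e "$dest" ]; then
    printf 'error: collision between %s and dangling symlink %s\n' \
      "$src" "$(readlink "$dest")" >&2
    return 1
  fi
  # Case 2: SRC itself dangles.
  if [ -L "$src" ] && [ ! -e "$src" ]; then
    if [ -e "$dest" ]; then
      printf 'error: collision between %s and dangling symlink %s\n' "$dest" "$src" >&2
      return 1
    fi
    printf 'warning: creating dangling symlink %s -> %s -> %s\n' \
      "$dest" "$src" "$(readlink "$src")" >&2
  fi
  # Simplification: if DEST already exists, a real builder would merge directories here.
  if [ -e "$dest" ]; then
    return 0
  fi
  ln -s "$src" "$dest"
}

# Demonstration with throwaway directories standing in for store paths.
tmp=$(mktemp -d)
mkdir -p "$tmp/pkgA" "$tmp/pkgB/bin" "$tmp/env1" "$tmp/env2"
ln -s /does/not/exist "$tmp/pkgA/bin"

# Dangling symlink on its own: created, with a warning.
link_into "$tmp/pkgA/bin" "$tmp/env1/bin"
# Dangling symlink first, directory second: rejected.
link_into "$tmp/pkgB/bin" "$tmp/env1/bin" || echo "rejected: directory after dangling symlink"
# Directory first, dangling symlink second: rejected as well.
link_into "$tmp/pkgB/bin" "$tmp/env2/bin"
link_into "$tmp/pkgA/bin" "$tmp/env2/bin" || echo "rejected: dangling symlink after directory"
```

Both orderings of the collision are exercised because the order of `paths` decides which side of the collision is seen first.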

@infinisil
Member

I think dangling symlinks should be handled the same as in normal derivations, which currently allow them just fine:

pkgs.runCommand "etc" {} ''
  ln -s /etc $out
''

And I think they should be allowed, because:

  • If a user wants to create a dangling symlink we should let them. It's up to the user at runtime to make sure it exists and take care of merging it if required. A dangling symlink is kind of the boundary where Nix gives control to the runtime for a specific path
  • Symlinks like that are only relevant at runtime, which is commonly outside of Nix's scope. There's nothing in Nix itself that prevents users from using impure library paths, or impure shebang paths, etc. which are very similar in nature to a dangling symlink
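
The point that a link which dangles at build time can resolve later is easy to demonstrate. A minimal shell sketch, using a throwaway directory in place of a real system path; the names `runtime-dir` and `link` are made up for illustration:

```shell
#!/bin/sh
tmp=$(mktemp -d)

# "Build time": the link target does not exist yet, so the link dangles.
ln -s "$tmp/runtime-dir" "$tmp/link"
if [ -e "$tmp/link" ]; then before=resolves; else before=dangles; fi

# "Runtime": something outside the build creates the target path.
mkdir "$tmp/runtime-dir"
if [ -e "$tmp/link" ]; then after=resolves; else after=dangles; fi

echo "before: $before, after: $after"
```

The link itself never changes; only the state of the filesystem around it does, which is why whether it dangles is a runtime property.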

@infinisil
Member

I'm all in favor of a warning for dangling symlinks though

@happysalada
Contributor

This PR looks good to me.

@ggPeti
Member Author

ggPeti commented Aug 16, 2021

New warning format:

warning: creating dangling symlink `/nix/store/ii8ny2vhsixkvliicklgfjkg6carbbya-dangling-first/bin' -> `/nix/store/38znrrrhivmnrl8s2hcx45xw6ynchmf2-danglingSymlink/bin' -> `/does/not/exist'
error: collision between `/nix/store/kqcpq6s6bqi82vkg08sgdhfjs9pyzwjz-collidingDir/bin' and dangling symlink `/nix/store/38znrrrhivmnrl8s2hcx45xw6ynchmf2-danglingSymlink/bin'

I have trouble with including tests for these changes, because what needs to be tested is how the build fails, which in turn triggers an evaluation failure. Could try running nix in the sandbox, but meh. So for now scratch the tests. I've tested the following scenarios manually:

  • buildEnv with dangling symlink first and colliding directory second
  • buildEnv with colliding directory first and dangling symlink second
  • buildEnv with some pathsToLink set to dangling symlink
  • buildEnv with some pathsToLink set to non-directory files
  • buildEnv without pathsToLink, top level not a directory

and all of them correctly fail with understandable error messages.
Ready to merge from my side, but open to further discussion if needed.

@ggPeti ggPeti changed the title from "buildEnv: don't die on invalid symlinks" to "buildEnv: better warning and error messages" Aug 16, 2021
@roberth
Member

roberth commented Aug 16, 2021

You might be able to pull it off by pulling the .drvs into a nixosTest, but I can't guarantee that it will work and it involves the builtins.unsafe* functions. Evaluating in a test proved problematic for nixpkgs-review and ofborg when I wrote one for trivial-builders.
So unless you feel adventurous, it's ok to skip the failure cases.

Member

@symphorien symphorien left a comment

lgtm, but I don't really know perl.

@ggPeti ggPeti changed the base branch from master to staging August 16, 2021 21:07
@brainrake brainrake self-requested a review August 17, 2021 15:21
Contributor

@brainrake brainrake left a comment

code lgtm

@roberth roberth merged commit dac87fa into NixOS:staging Aug 19, 2021
@Mindavi
Contributor

Mindavi commented Aug 19, 2021

I'm getting an error on my hydra instance (which builds staging), and I think it might be related to this:

 Below is the build log of derivation /nix/store/msyl133agbnr344j10kkm8jgvnkkiw7r-python3-3.9.6-env.drv. It was built on localhost.

Global symbol "$extraPrefix" requires explicit package name (did you forget to declare "my $extraPrefix"?) at /nix/store/bs52yich1glfdylis8pivsl5y0zzfjpp-builder.pl line 143.
Execution of /nix/store/bs52yich1glfdylis8pivsl5y0zzfjpp-builder.pl aborted due to compilation errors.

The $extraPrefix seems to be declared further down the file. I don't know much about perl, but I could imagine that being an issue here.

Reproduce with nix-build -A systemdMinimal (or something else that has python3 with packages in its closure, e.g. nss).

@vcunat
Member

vcunat commented Aug 20, 2021

Yes, this error broke a whole lot of things. It will be reverted if not fixed quickly. Example failures: https://hydra.nixos.org/eval/1697428

@Mindavi
Contributor

Mindavi commented Aug 20, 2021

It's probably as simple as moving the my $extraPrefix declaration up, but I'm not sure how correct that is. It seems to be working locally for me, though (at least it fixes the build).

@ggPeti
Member Author

ggPeti commented Aug 20, 2021

I added the extraPrefix part after the reviews, that was a mistake. Thanks for pointing the error out @Mindavi. This needs to be fixed in another PR.

ggPeti added a commit to ggPeti/nixpkgs that referenced this pull request Aug 20, 2021
@ggPeti ggPeti deleted the fix-buildenv branch August 20, 2021 12:56
roberth added a commit that referenced this pull request Aug 20, 2021
buildenv: fix regression introduced by #134215
@nixos-discourse

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/trouble-sleeping-problem-with-sleep-command/19101/7
