Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Eval cache: fix cache regressions #11086

Merged
merged 2 commits into from
Jul 18, 2024
Merged

Conversation

kognise
Copy link
Contributor

@kognise kognise commented Jul 11, 2024

Motivation

Few lines of bugfixes for the eval cache.

Context

  1. Regression in Share evaluation caches across installables #10570: the eval cache is not being persisted in nix develop because evalCaches retains references to the caches and is never freed.

  2. No longer including in this PR: Regression in LockedFlake::getFingerprint(): Use Input::getFingerprint() #10149 (see also: PathInputScheme::getFingerprint(): Don't barf on relative paths #10176): the eval cache is not being used at all for commands such as nix develop path:. because getFingerprint returns null for relative paths. Before the regression, the /nix/... path was being included in the hash — I brought this back as a fallback.

  3. Nix can sometimes try to commit the eval cache sqlite transaction without there being an active transaction, which fails and prints errors that should be non-errors.

Priorities and Process

Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

@kognise kognise requested a review from edolstra as a code owner July 11, 2024 23:46
@github-actions github-actions bot added the new-cli Relating to the "nix" command label Jul 11, 2024
@edwardwc
Copy link

edwardwc commented Jul 12, 2024

I too have noticed this issue, thank you so much for opening this PR!

auto fingerprint = flake.lockedRef.input.getFingerprint(store)
// Fingerprints are null for relative path references so we use
// the flake path as a fallback to avoid losing caching.
.value_or(flake.path.to_string());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is either safe by accident or unsafe, because this undocumented 💢 to_string() is calling methods that are for display and not guaranteed to be a functional key for the sources it represents.
However, I can imagine flake.path here to be a store path, in which case we could use it after making sure that it is one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's intentional because computing a fingerprint for path flakes requires reading the entire flake.

That seems to apply here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI this to_string() is exactly how fingerprints used to be calculated 😅

I will remove this part of the PR but it definitely feels like a footgun that relative path expression caching silently fails. I would like to make a separate PR that either:

  1. Prints a warning if nix develop, etc. is called with a relative path expression.
  2. Tries to resolves the path from CWD in the getFingerprint function.

Intuitively #2 sounds like less of a regression but I would appreciate guidance on which of the two options (or perhaps a different, third one) is the best.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not just about relative paths. We can't refer to anything mutable in the fingerprint, because this is the "always correct" kind of cache; not the kind whose entries expire.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some time spent learning about Nix's current eval caching behavior and looking at lazy trees later, I understand why my original suggestion doesn't make sense. Thanks for your patience. I also deleted these changes from the PR and it's ready for another look :)

I believe a much more advanced evaluation cache could use exclusively the things that have been read as keys, but that would be more like a research project. This would make it not just capable of caching path: inputs, but also for it to be effective at caching dev shells, which tends to suffer from small changes in unrelated parts of the source.

IMO a good caching system for local flakes is very important. Relatively small flakes can take 10+ seconds to evaluate on modern hardware and this is a big barrier to relying on flakes for dev shells. I would love to learn more about what existing solutions to this are. In the meantime I can use a Bash wrapper that naively caches nix print-dev-env outputs.

Copy link
Member

@edolstra edolstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the eval cache is not being used at all for commands such as nix develop path:.

That's intentional because computing a fingerprint for path flakes requires reading the entire flake. As part of (slowly) merging lazy trees, we're getting rid of that kind of behavior. The fact that we're currently still copying them to the Nix store should not be relied on, since that will go away.

The other parts of this PR look good.

@@ -697,6 +697,10 @@ struct CmdDevelop : Common, MixEnvironment
}
}

// Release our references to eval caches to ensure they are persisted to disk, because
// we are about to exec out of this process without running C++ destructors.
getEvalState()->evalCaches.clear();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should also do this in nix shell, nix run etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added to all commands.

i tried to come up with a good abstraction but couldn't come up with anything that didn't add a dependency from libcmd to nix, so i just went with the copypasta route. i also renamed runProgramInStore to execProgramInStore to better highlight the nature of the function.

@roberth
Copy link
Member

roberth commented Jul 12, 2024

That's intentional because computing a fingerprint for path flakes requires reading the entire flake.

I believe a much more advanced evaluation cache could use exclusively the things that have been read as keys, but that would be more like a research project. This would make it not just capable of caching path: inputs, but also for it to be effective at caching dev shells, which tends to suffer from small changes in unrelated parts of the source.

- Fix eval cache not being persisted in `nix develop` (since NixOS#10570)
- Don't attempt to commit cache transaction if there is no active transaction, which will spew errors in edge cases
- Drive-by: trivial typo fix
@kognise kognise requested review from edolstra and roberth July 12, 2024 22:43
@roberth roberth enabled auto-merge July 17, 2024 19:49
@roberth roberth merged commit 8ce4287 into NixOS:master Jul 18, 2024
11 checks passed
@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-07-17-nix-team-meeting-minutes-162/49255/1

@edolstra edolstra added the backport 2.23-maintenance Automatically creates a PR against the branch label Jul 26, 2024
Copy link

Successfully created backport PR for 2.23-maintenance:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.23-maintenance Automatically creates a PR against the branch new-cli Relating to the "nix" command
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants