Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prototype: accumulate attrset updates, perform k-way merge #11290

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

roberth
Copy link
Member

@roberth roberth commented Aug 13, 2024

This seems to be slower as of yet. Main point was showing an implementation strategy for an optimization that applies also to other semigroups, like string concatenation and lists.
I didn't originally plan to pursue this, as mentioned, so if you're interested in algorithms and performance optimization you are more than welcome to work on this.

TODO

  • remove merge pass 2 into 1:
    • pass 1: fill UpdateQueue
    • pass 2: stats (like max output size)
    • pass 3: k-way merge
  • maybe tracking sortedness, if useful and cheap? in pass 1, building the UpdateQueue
  • make it actually fast; different algorithm depending on size distribution and number of update ops?
    • currently solution is the min-heap or priority queue approach
    • divide and conquer (see also Nixpkgs which manually implements this for pkgs/by-name) binary tree-shaped sorted merge operations
      • can we take the size of attrsets into account to balance the work? 1000 + (1 + (1 + 1)) is better than ((1000+1)+1)+1 (where number refers to attrset size)
    • iterative pairwise merging is more or less what we had, the tree of updates is flattened. Probably worse.
    • perhaps some combination of solutions, depending on a heuristic
  • split callFunction into two parts, effectively - call2Thunk (create the new scope's Env etc) - eval that thunk (evaluate the body)
  • implement ExprApply::evalForUpdate - easy after the split
  • make foldl' (//) work?
  • optimize listToAttrs and other primops?

Context

Yesterday's meeting (will post notes in a minute)
@tomberek

Priorities and Process

Add 👍 to pull requests you find important.

The Nix maintainer team uses a GitHub project board to schedule and track reviews.

This seems to be slower as of yet.

TODO

- [ ] remove merge pass 2 into 1:
  - pass 1: fill UpdateQueue
  - pass 2: stats (like max output size)
  - pass 3: k-way merge
- [ ] maybe tracking sortedness is useful and cheap? in pass 1, building the UpdateQueue
- [ ] make it actually fast; different algorithm depending on size distribution and number of update ops?
  - currently solution is the min-heap or priority queue approach
  - divide and conquer (see also Nixpkgs which manually implements this for pkgs/by-name)
    binary tree-shaped sorted merge operations
    - can we take the size of attrsets into account to balance the work?
      1000 + (1 + (1 + 1)) is better than ((1000+1)+1)+1  (where number refers to attrset size)
  - iterative pairwise merging is more or less what we had, the tree of updates is flattened. Probably worse.
  - perhaps some combination of solutions, depending on a heuristic
- [ ] split callFunction into two parts, effectively
        - call2Thunk (create the new scope's Env etc)
        - eval that thunk (evaluate the body)
- [ ] implement ExprApply::evalForUpdate - easy after the split
- [ ] make foldl' (//) work?
- [ ] optimize listToAttrs and other primops?
Comment on lines +1901 to +1902
e1->evalForUpdate(state, env, q);
e2->evalForUpdate(state, env, q);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
e1->evalForUpdate(state, env, q);
e2->evalForUpdate(state, env, q);
evalForUpdate(state, env, q);

Comment on lines +2043 to +2045
eval(state, env, vTmp);
// TODO add pos and errorCtx params
state.forceAttrs(vTmp, noPos, "while evaluating an attribute set merge operand");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

evalAttrs?

return pos;
}

virtual void evalForUpdate(EvalState & state, Env & env, UpdateQueue & q) override;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only new code in this declaration. The rest is just an expansion of the MakeBinOp macro.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should make a MakeBinOpMembers macro that MakeBinOp this can both uses?

@nixos-discourse
Copy link

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2024-08-12-nix-team-meeting-minutes-168/50561/1

@roberth roberth changed the title WIP: accumulate attrset updates, use k-way merge WIP: accumulate attrset updates, perform k-way merge Aug 13, 2024
@roberth roberth changed the title WIP: accumulate attrset updates, perform k-way merge Prototype: accumulate attrset updates, perform k-way merge Aug 14, 2024
@roberth roberth added language The Nix expression language; parser, interpreter, primops, evaluation, etc performance labels Nov 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
language The Nix expression language; parser, interpreter, primops, evaluation, etc performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants