Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lib.fileset.difference: init #259065

Merged
merged 1 commit into from
Nov 1, 2023
Merged

Conversation

infinisil
Copy link
Member

@infinisil infinisil commented Oct 4, 2023

Description of changes

This is another split off from the WIP file set combinators PR. This PR adds one function:

  • lib.fileset.difference: Subtracts the second file set from the first.
    # Recursively includes all files in ./. except ones under ./a/b
    lib.fileset.difference ./. ./a/b

This work is sponsored by Antithesis

Things done

  • Documentation
  • Tests

@github-actions github-actions bot added the 6.topic: lib The Nixpkgs function library label Oct 4, 2023
@infinisil infinisil mentioned this pull request Oct 4, 2023
7 tasks
@ofborg ofborg bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10 labels Oct 4, 2023
@infinisil infinisil marked this pull request as ready for review October 11, 2023 15:36
@infinisil infinisil marked this pull request as draft October 11, 2023 15:36
@infinisil infinisil marked this pull request as ready for review October 11, 2023 16:11
difference ./. ./tests

let
# A list of Nix files
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# A list of Nix files
# A set of Nix-related files

nix/ may contain non-.nix files, which could be considered non-Nix.

fileset2._internalBaseComponents
);

# The filesetTree's need to have the same base path in order to be able to compute the difference between them.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# The filesetTree's need to have the same base path in order to be able to compute the difference between them.
# The filesetTree's need to have the same base path in order to be able to compute the difference between them.
# Technically an unrelated root in the second, negative argument could be considered not to match any files in
# the first argument, but users would be clueless as to why their filter doesn't work.

Copy link
Member Author

@infinisil infinisil Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm not sure what you mean here. If the root of the second argument is unrelated (e.g. difference /foo /bar), then no files inside /foo match /bar, meaning nothing gets subtracted, so the result is /foo. And that should be expected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of entirely unrelated trees, such as when one is an absolute path, but the other is a fetchGit lazy tree.

Rejecting difference /foo /bar seems excessive, I agree.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, it will error for different filesystem roots, there's a comment on this in the comments for _difference:

# The filesets must already be coerced and validated to be in the same filesystem root.

The public difference function passes its arguments through _coerceMany, which checks against this, before calling the internal one.

# Compute the set difference between two file sets.
# The filesets must already be coerced and validated to be in the same filesystem root.
# Type: Fileset -> Fileset -> Fileset
_difference = fileset1: fileset2:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_difference = fileset1: fileset2:
_difference = positive: negative:
Suggested change
_difference = fileset1: fileset2:
_difference = baseSet: negatingSet:
Suggested change
_difference = fileset1: fileset2:
_difference = fileset: subtrahend:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I'll also apply this to the user-facing function. I like positive/negative the most

Comment on lines 661 to 662
# The common prefix is the same as the first path, so the second path is equal or longer.
# We need to shorten the second arguments tree to the same base path as the first one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not clear to me why.
What is tree2 for?

Copy link
Member Author

@infinisil infinisil Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as the tree of the negative file set, but with the same base path as the positive file set, such that they are comparable and can be operated on.

E.g. if you have difference /foo /foo/bar, the common path would be /foo, meaning the first branch applies.

It then takes the negative file set, (base: /foo/bar, tree: "directory") in this case, and shortens the base to /foo instead, such that you get (base: /foo, tree: { bar = "directory"; }). tree2 here only contains the tree part though, so { bar = "directory"; }.

Now we have two trees, "directory" from the positive file set, and { bar = "directory"; } from the negative file set, that have the same base /foo, so they can be operated on, and it matches the base path that the result should have.

I'll put something like this in the code docs, and I'll improve the variable name, negativeTreeWithPositiveBasePath or so (maybe shorter :P)

# The result is always the left hand side, because nothing gets removed
lhs
else
# Otherwise we always have two attribute sets which we can recurse into
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Otherwise we always have two attribute sets which we can recurse into
# Otherwise we always have two attribute sets to recurse into

[c]=1
)
checkFileset 'difference ./. ./a/x'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about removing the parent? e.g.

difference ./a ./.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the test could be more systematic, by framing it as a matrix of possible arguments. Perhaps not an actual matrix in terms of code, but laid out in a way that you can see that all combinations are tested.
That way you would naturally test difference ./a ./..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did that!

@infinisil infinisil force-pushed the fileset.difference branch 2 times, most recently from 2d4b2e4 to cb9a53d Compare October 25, 2023 20:22
@infinisil
Copy link
Member Author

@roberth Should be good now!

Comment on lines 410 to 417
{
context = "first argument";
value = positive;
}
{
context = "second argument";
value = negative;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
{
context = "first argument";
value = positive;
}
{
context = "second argument";
value = negative;
}
{
context = "first argument (positive set)";
value = positive;
}
{
context = "second argument (negative set)";
value = negative;
}

Not 100% sure, but might be nice.

# because under the base path of `/foo/bar`, everything from the negative file set is included
# For `difference /foo /bar`, `negativeTreePositiveBase = null`
# because under the base path of `/foo`, nothing from the negative file set is included
negativeTreePositiveBase =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
negativeTreePositiveBase =
negativeTreeWithPositiveBase =

@infinisil
Copy link
Member Author

Done now, unless you do it before me, I'll merge once CI passes :)

@infinisil infinisil merged commit fc28c5e into NixOS:master Nov 1, 2023
6 checks passed
@infinisil infinisil deleted the fileset.difference branch November 1, 2023 18:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
6.topic: lib The Nixpkgs function library 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin 10.rebuild-linux: 1-10
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants