Make parent dir available in fileset.fileFilter #306371
Currently, fileFilter only allows filtering based on a file's base name + file type. This is a bit limiting if you want to include files based on the name of the directory they reside in. I bumped into this when using fileFilter to make a minimal set of files to make Cargo happy in a Rust workspace - specifically, I needed to pull in any `.rs` file that lived under a `bin/`, `examples/`, or `benches/` folder before Cargo would stop complaining. (Note that I am trying to avoid pulling in all Rust files - I'm working in a large Rust workspace but trying to package a single small member of the workspace).
Thanks for the PR!
Unfortunately I already explored this option in #269504, where we figured that this shouldn't be done like that because it leads to poorer performance than expected. In your example, it would have to recurse throughout all `bin` directories, even though they won't be included in the result.
A more general tracking issue for this use case is #271307, for which I think this is the best way forward: #271307 (comment)
Re: performance, it seems like passing the path unaltered (as I do in this PR) is close to zero cost for any use cases that don't do anything with
I'm a little confused... doesn't fileFilter already recurse through all directories, since it passes every file to the predicate?
After reading #269504 a little more closely you gave this example:
I see how that is very wasteful, but that is a very different situation than my example: I am already in a situation where it makes sense to use fileFilter (thus paying the performance penalty). This is the full example of how I'm using this:
fileSetForCrate = fileset.toSource {
  root = ../..;
  fileset = fileset.fileFilter (file:
    file.name == "Cargo.toml"
    || file.name == "Cargo.lock"
    || file.name == "main.rs"
    || file.name == "lib.rs"
    || file.name == "build.rs"
    # Include bin/*.rs, examples/*.rs, and benches/*.rs to make Cargo happy
    || (file.hasExt "rs" && (pkgs.lib.hasSuffix "bin" (toString file.dir)))
    || (file.hasExt "rs" && (pkgs.lib.hasSuffix "examples" (toString file.dir)))
    || (file.hasExt "rs" && (pkgs.lib.hasSuffix "benches" (toString file.dir)))
  ) ../..;
};
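For comparison, here is a rough sketch of how a similar selection might be written with combinators that already exist in `lib.fileset`, without a `dir` attribute. It is not the approach proposed in #271307, and it only works when the workspace members are known up front. The `members` list, the `rsFilesUnder` and `memberFiles` helpers, and the `src/bin`, `examples`, `benches` layout are all assumptions made for illustration.

```nix
# Sketch only: `members`, `rsFilesUnder`, `memberFiles`, and the directory
# layout are hypothetical and not taken from the PR.
{ lib }:
let
  fs = lib.fileset;
  root = ./.;

  # Hypothetical workspace members; each of these must exist on disk.
  members = [ ./crates/foo ./crates/bar ];

  # All .rs files under `member/sub`. `maybeMissing` yields an empty file set
  # when the subdirectory does not exist, so the intersection is empty too.
  rsFilesUnder = member: sub:
    fs.intersection
      (fs.maybeMissing (member + "/${sub}"))
      (fs.fileFilter (file: file.hasExt "rs") member);

  # The files Cargo needs from one member.
  memberFiles = member:
    fs.unions (
      [
        (fs.fileFilter
          (file: lib.elem file.name [ "Cargo.toml" "main.rs" "lib.rs" "build.rs" ])
          member)
      ]
      ++ map (rsFilesUnder member) [ "src/bin" "examples" "benches" ]
    );
in
fs.toSource {
  inherit root;
  fileset = fs.unions ([ ./Cargo.toml ./Cargo.lock ] ++ map memberFiles members);
}
```

The trade-off is that the directory names are spelled out as paths instead of being matched at any depth, so a `bin` directory in an unexpected place would not be picked up.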
I think I agree that your preferred solution (#271307 (comment)) is a better fit. To be clear, for my needs it will still have the same performance since I am already in a situation where I need to recurse through all files, but that API seems harder to accidentally misuse.
Performance is a bit more tricky because the fileset combinators are lazy. So e.g. while
The same will happen with the proposed
This would be problematic because it would allow filesets to depend on the absolute location of the local directory. It's a very intentional design choice to make sure this cannot happen to avoid problems relating to that. This would be good to add to the design goals of the library. So we cannot pass absolute paths because of that, but relative paths would be fine (even if they're a bit slower). Efficiency by itself is notably not a design goal of the library :)
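To illustrate the difference, here is a minimal sketch of deriving a directory string relative to the filter root with `lib.path.removePrefix`. The helpers `relDirOf` and `inDirNamed` are hypothetical names, not part of `lib.fileset`, and this is not necessarily how the library would implement it. Because the result is relative to `root`, it stays the same wherever the checkout lives on disk, whereas `toString` on the absolute path would not.

```nix
# Sketch only: `relDirOf` and `inDirNamed` are hypothetical helpers.
{ lib }:
let
  root = ./.;

  # Directory of `file` as a string relative to `root`.
  relDirOf = file: dirOf (lib.path.removePrefix root file);

  # True when the file sits directly inside a directory called `name`.
  # Comparing the last path component avoids matching e.g. "foobin" for "bin".
  inDirNamed = name: file: baseNameOf (relDirOf file) == name;
in
{
  relDir = relDirOf (root + "/src/bin/tool.rs");          # "./src/bin"
  inBin = inDirNamed "bin" (root + "/src/bin/tool.rs");   # true
  inBenches = inDirNamed "benches" (root + "/src/lib.rs"); # false
}
```

Comparing only the last component also sidesteps a pitfall of suffix matching on the full path, where `hasSuffix "bin"` would accept any directory whose name merely ends in `bin`.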
Description of changes
Currently, fileFilter only allows filtering based on a file's base name + file type.
This is a bit limiting if you want to include files based on the name of the directory they reside in.
I bumped into this when using fileFilter to make a minimal set of files to make Cargo happy in a Rust workspace - specifically, I needed to pull in any `.rs` file that lived under a `bin/`, `examples/`, or `benches/` folder before Cargo would stop complaining. (Note that I am trying to avoid pulling in all Rust files - I'm working in a large Rust workspace but trying to package a single small member of the workspace).
This change allows me to write filters such as this:
Open questions
Should `dir` be a Path (as currently implemented), or should it be converted to a stringified path relative to the filter root?
Things done
Benchmark results
There is a preexisting benchmark script. I ran `./benchmark.sh master` and got the following results:
This doesn't seem to have a large impact on performance if not using `dir`.