Consider auto-excluding files matched by gitignore files #174

Closed
Jackenmen opened this issue Sep 13, 2022 · 12 comments · Fixed by #1234

@Jackenmen
Contributor

Patterns would need to be read from:

  • .gitignore file in the same directory as the path, or in any parent directory (the latter allows tools to put a .gitignore file with * in the directory they create to auto-exclude themselves)
  • patterns read from $GIT_DIR/info/exclude (allows user-specific patterns for a specific repository)
  • patterns read from the file specified by Git's configuration variable core.excludesFile (allows global user-specific patterns, I personally use this one)
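The three sources above could be gathered roughly as follows. This is only a sketch: `candidate_ignore_files` is a hypothetical helper, `$GIT_DIR` is assumed to be `<repo_root>/.git`, and the value of `core.excludesFile` is assumed to be passed in by the caller rather than read from Git's configuration.

```python
from pathlib import Path
from typing import List, Optional


def candidate_ignore_files(
    path: Path, repo_root: Path, excludes_file: Optional[Path] = None
) -> List[Path]:
    """List the existing files whose patterns could apply to `path` (hypothetical helper)."""
    path, repo_root = path.resolve(), repo_root.resolve()
    candidates = []
    # Walk from the path's own directory up to the repository root,
    # collecting a .gitignore candidate at every level.
    directory = path.parent
    while True:
        candidates.append(directory / ".gitignore")
        if directory == repo_root or directory == directory.parent:
            break
        directory = directory.parent
    # Per-repository, user-specific patterns (assumes $GIT_DIR is <repo_root>/.git).
    candidates.append(repo_root / ".git" / "info" / "exclude")
    # Global, user-specific patterns: the value of core.excludesFile, if the caller knows it.
    if excludes_file is not None:
        candidates.append(excludes_file)
    # Only files that actually exist contribute patterns.
    return [c for c in candidates if c.is_file()]
```

Reading and matching the patterns themselves is left out; the sketch only shows where they would come from, and it hints at the performance cost mentioned below, since every linted path implies a walk up the directory tree unless results are cached.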

Reference: https://git-scm.com/docs/gitignore

This might come at a performance cost but it definitely would be a useful feature.

@charliermarsh
Member

(Working on this.)

@sgryjp
Contributor

sgryjp commented Sep 17, 2022

Hi! I think using the ignore crate instead of walkdir + gitignore could be an option. The crate is used inside very popular and performant tools such as ripgrep and fd, so we can expect both reliability and performance. It also allows us to choose the Unlicense, so there would be no license conflicts. Could you consider using it?

@messense
Contributor

+1 for this.

As for using the ignore crate, IMO it should work generally, but be aware that it's not 100% compatible with git ignore, see PyO3/maturin#885 (comment)

@Jackenmen
Contributor Author

I like that the ignore crate has all of the functionality I described in the issue description built-in - it supports .gitignore files at all directory levels of the repository, supports .git/info/exclude, and supports global gitignore!

There are many rules that influence whether a particular file or directory is skipped by this iterator. Those rules are documented here. Note that the rules assume a default configuration.
[...]

  • Second, ignore files are checked. Ignore files currently only come from git ignore files (.gitignore, .git/info/exclude and the configured global gitignore file), plain .ignore files, which have the same format as gitignore files, or explicitly added ignore files. The precedence order is: .ignore, .gitignore, .git/info/exclude, global gitignore and finally explicitly added ignore files. Note that precedence between different types of ignore files is not impacted by the directory hierarchy; any .ignore file overrides all .gitignore files. Within each precedence level, more nested ignore files have a higher precedence than less nested ignore files.
    [...]

https://docs.rs/ignore/latest/ignore/struct.WalkBuilder.html#ignore-rules

(reading of .ignore files can be disabled separately from the other functionality in case it's something you don't want)
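To make the quoted precedence rule concrete, here is a small sketch of the ordering it describes. It is not the ignore crate's implementation; `IgnoreSource`, `KIND_ORDER`, and `by_precedence` are hypothetical names used only for illustration.

```python
from typing import List, NamedTuple

# Kinds of ignore file, highest precedence first, in the order the quoted docs list them.
KIND_ORDER = [".ignore", ".gitignore", ".git/info/exclude", "global", "explicit"]


class IgnoreSource(NamedTuple):
    kind: str   # one of KIND_ORDER
    depth: int  # directory nesting level of the file (0 = repository root)


def by_precedence(sources: List[IgnoreSource]) -> List[IgnoreSource]:
    # The kind dominates regardless of directory depth; within one kind,
    # more nested files come before less nested ones.
    return sorted(sources, key=lambda s: (KIND_ORDER.index(s.kind), -s.depth))
```

Under this ordering, a root-level .ignore file sorts ahead of a deeply nested .gitignore, which matches the note that precedence between file types is not affected by the directory hierarchy.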

@charliermarsh
Member

Definitely happy to support this! I just haven't personally prioritized the work since most exclusions are doable with the current configuration settings, even if less convenient. But we should do it.

@Jackenmen
Contributor Author

Personally, I'm mostly missing a way to ignore files only on my system, in projects where I don't have a significant enough stake to change the exclusions they've set up. At best I could maybe alias ruff to ruff --extend-exclude files,folders or something like that, but that probably overrides the configuration from pyproject.toml, so it wouldn't really work universally across all projects. Meanwhile, with ignore files I can put a file/directory in the global gitignore, make sure that there's a .gitignore with * in a directory I want to ignore, or put a file/directory in .git/info/exclude.
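The ".gitignore with *" approach can be sketched in a few lines. `make_self_ignoring_dir` is a hypothetical helper, shown only to illustrate how a tool (or a user) could make a directory exclude itself without touching the project's own configuration.

```python
from pathlib import Path


def make_self_ignoring_dir(directory: Path) -> Path:
    """Create `directory` with a .gitignore that ignores everything inside it."""
    directory.mkdir(parents=True, exist_ok=True)
    # A lone "*" matches every entry in the directory, including the .gitignore itself,
    # so nothing in here shows up as untracked.
    (directory / ".gitignore").write_text("*\n")
    return directory
```

Once gitignore files are respected, a directory prepared this way would be skipped with no per-project exclusion entry.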

@jhallard

Just a slight bump on this: we've had some issues sneak in with the pyproject.toml and .gitignore lists getting out of sync, which caused us to accidentally lint a GB or so of third-party code. It would be convenient if those were synced automatically.

@charliermarsh
Member

(I still want to explore this, and do so via the ignore crate. Just need to find time.)

@charliermarsh charliermarsh self-assigned this Dec 13, 2022
@charliermarsh
Member

I'll give this a try (but my newborn just came home today so bear with me :))

@jhallard

@charliermarsh 1. Congratulations!! 🎉 2. you're a beast. We're not blocked on this by any means so take care of yourself!

@charliermarsh
Member

@jhallard - Thank you so much! I just couldn't help but share :)

@charliermarsh
Member

This is going out in the next release: #1234.

The behavior is documented in the README (and in BREAKING_CHANGES.md), and there's an opt-out setting (respect-gitignore = false).

5 participants