add --exclude-regex and --no-make-paths-absolute to exclude specific file paths #115
NOTE: tests are broken here because [!/] doesn't work the way I thought it would. The next commit will change the glob option to use a regex.
- tests pass now
- interface more familiar for windows users
I ran |
Thank you for the PR! I will look into it after work today. From a bird's-eye view, what you want to achieve does make sense, though. Regarding the preexisting fixes of linters, you probably didn't use the versions installed in the virtual environment, so my guess is they'd be too new. |
Yes, you've used "too new" linters. Those fixes I have in certain branches for version 1.6 release, while 1.5 release (the next one) will not yet include those. If you could run |
Looks like I didn't read CONTRIBUTING.md closely enough; thanks so much for debugging that for me! I'll do that and make sure it passes tests. |
6cdedfc to da8b2ba
Add pylint overrides for tests which access the private memoized master regex. I think these tests are necessary if we think memoizing the regex is a useful optimization, but this may well be premature optimization.
da8b2ba to 27302a3
Using the venv from the instructions in CONTRIBUTING.md worked immediately! |
The test failure (https://github.com/netromdk/vermin/actions/runs/3187499365/jobs/5199138152) appears to be spurious:
|
I want to highlight the final commit in this PR: 27302a3. It removes the straightforward loop over regex matches and replaces it with a memoized regex created by joining the individual I'm not sure what approach is best for this project. I think that if I can show a benchmark demonstrating a speedup with the memoized master regex in 27302a3, then we should keep it; otherwise, we can revert it to make this change a bit easier to review. |
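For readers unfamiliar with the idea, here is a minimal sketch of the two approaches being compared: looping over the individual patterns versus searching once against a single memoized alternation. The function names are hypothetical and this is not the code from 27302a3; it only illustrates the technique, assuming each pattern is wrapped in a non-capturing group before joining.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=None)
def master_regex(patterns):
    """Join individual patterns into one alternation and compile it once.

    `patterns` must be a tuple so that it can serve as a cache key.
    """
    # Wrap each pattern in a non-capturing group so that a `|` inside one
    # pattern cannot leak into its neighbours.
    return re.compile("|".join("(?:{})".format(p) for p in patterns))


def excluded_by_loop(path, patterns):
    """Straightforward version: try every pattern in turn."""
    return any(re.search(p, path) for p in patterns)


def excluded_by_master(path, patterns):
    """Memoized version: a single search against the joined regex."""
    if not patterns:
        # An empty alternation compiles to "" and would match every path.
        return False
    return master_regex(tuple(patterns)).search(path) is not None


patterns = [r"\.pyi$", r"^a/b$"]
for path in ["pkg/mod.pyi", "a/b", "pkg/mod.py"]:
    assert excluded_by_loop(path, patterns) == excluded_by_master(path, patterns)
```

The empty-list guard shows the kind of edge case that joining introduces; numbered backreferences inside individual patterns are another, since group numbers shift once patterns are combined.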
Ok, when I create a directory with 1,000,000 files and run vermin with 10 separate The optimization improved the runtime of scanning 1 million files from 3.202 seconds to 2.654 seconds, a speedup of 0.548 seconds ~= 17.1%. However, scanning file paths appears to already be well optimized, and I strongly suspect that actually executing vermin over all those files would take much more time than scanning file paths, so I would much prefer to skip the fancy regex tricks rather than have to fix bugs for weird edge cases when joining regexps.
My benchmark was (while on 27302a3):
; mkdir tmp/
; pushd tmp
; seq 1000000 | parallel -n200 -j16 touch {}
; popd
; time ./vermin.py -vvvv -t=2.7 -t=3.5- --no-make-paths-absolute --exclude-regex '1' --exclude-regex '2' --exclude-regex '3' --exclude-regex '4' --exclude-regex '5' --exclude-regex '6' --exclude-regex '7' --exclude-regex '8' --exclude-regex '9' --exclude-regex '0' tmp/
Detecting python files..
No files specified to analyze!
./vermin.py -vvvv -t=2.7 -t=3.5- --no-make-paths-absolute --exclude-regex '1' 2.38s user 0.33s system 102% cpu 2.654 total
; git checkout HEAD~1
; time ./vermin.py -vvvv -t=2.7 -t=3.5- --no-make-paths-absolute --exclude-regex '1' --exclude-regex '2' --exclude-regex '3' --exclude-regex '4' --exclude-regex '5' --exclude-regex '6' --exclude-regex '7' --exclude-regex '8' --exclude-regex '9' --exclude-regex '0' tmp/
Detecting python files..
No files specified to analyze!
./vermin.py -vvvv -t=2.7 -t=3.5- --no-make-paths-absolute --exclude-regex '1' 2.93s user 0.33s system 101% cpu 3.202 total |
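As a rough cross-check of the numbers above, the following sketch times only the regex matching on in-memory strings, using the ten single-digit patterns from the command line. It does not walk a directory or invoke vermin, the path list is scaled down, and all names are hypothetical, so its timings are not comparable to the ones reported here.

```python
import re
import time

# The ten single-digit patterns from the benchmark command line.
patterns = [str(d) for d in range(10)]
# Stand-ins for the file names created with `seq` and `touch` (scaled down).
paths = ["tmp/{}".format(n) for n in range(1, 100001)]


def surviving_loop(paths, patterns):
    """Keep paths that match none of the individually compiled patterns."""
    compiled = [re.compile(p) for p in patterns]
    return [p for p in paths if not any(r.search(p) for r in compiled)]


def surviving_joined(paths, patterns):
    """Keep paths that do not match the single joined alternation."""
    joined = re.compile("|".join("(?:{})".format(p) for p in patterns))
    return [p for p in paths if not joined.search(p)]


def timed(fn):
    """Run fn over the shared inputs and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(paths, patterns)
    return result, time.perf_counter() - start


loop_result, loop_secs = timed(surviving_loop)
joined_result, joined_secs = timed(surviving_joined)
print("loop:   {:.3f}s, {} paths kept".format(loop_secs, len(loop_result)))
print("joined: {:.3f}s, {} paths kept".format(joined_secs, len(joined_result)))

# Relative speedup computed from the two wall-clock totals reported above.
speedup = (3.202 - 2.654) / 3.202
print("reported speedup: {:.1%}".format(speedup))
```

Every generated name contains at least one digit, so the ten patterns together exclude every path, which is consistent with the "No files specified to analyze!" output in both runs.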
…terns" This reverts commit 27302a3.
I can see the need to solve this issue with
This is definitely not a nice solution, I agree! |
This is really great stuff! 💪🏻
Only a few small tweaks to be done.
The additions to vermin/config.py should also be added to the sample.vermin.ini, and be added as test cases, too, in tests/config.py, like testing parsing of exclusion regexes and make paths absolute.
It's a nice optimization. We might do that later but, as you said, if you detect 1 million files, the analysis of them will take a lot more than the optimized-away 0.5 seconds. |
Yeah, I think so. I've never had that myself so it's a little weird. Maybe it'll go away soon. |
Thanks so much!! Vermin is a really really useful tool :D
I will add this now! Please also take a look at my response to your comment about the No rush to reply! |
- also add 'yes'/'no' test cases for other boolean config flags
Thanks! And also nice that you added the extra cases with yes and no. 🤝
Oh, right. Wrt. errors like https://github.com/netromdk/vermin/actions/runs/3206864588/jobs/5242230923#step:7:45: maybe we compare the pattern strings rather than the compiled patterns? Like perhaps:
--- a/tests/config.py
+++ b/tests/config.py
@@ -345,21 +345,22 @@ exclusion_regex =
""", []],
[u"""[vermin]
exclusion_regex = \\.pyi$
-""", [re.compile(r"\.pyi$")]],
+""", [r"\.pyi$"]],
[u"""[vermin]
exclusion_regex = \\.pyi$
^a/b$
-""", [re.compile(r"\.pyi$"), re.compile(r"^a/b$")]],
+""", [r"\.pyi$", r"^a/b$"]],
[u"""[vermin]
exclusion_regex =
^a/b$
\\.pyi$
-""", [re.compile(r"\.pyi$"), re.compile(r"^a/b$")]],
+""", [r"\.pyi$", r"^a/b$"]],
])
def test_parse_exclusion_regex(self, data, expected):
config = Config.parse_data(data)
self.assertIsNotNone(config)
- self.assertEqual(config.exclusion_regex(), expected)
+ patterns = [regex.pattern for regex in config.exclusion_regex()]
+ self.assertEqual(patterns, expected) |
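The suggestion in the diff above can be sketched outside the test suite as follows. This assumes the config is read with the standard library `configparser`; `parse_exclusion_regex` is a hypothetical helper, and the sorting step is an assumption inferred from the expected order in the third test case, not a statement about how vermin normalizes the list.

```python
import configparser
import re

# Same shape as the third test case in the diff: a multi-line value.
INI = """\
[vermin]
exclusion_regex =
  ^a/b$
  \\.pyi$
"""


def parse_exclusion_regex(data):
    """Parse one pattern per non-empty line and compile each of them."""
    parser = configparser.ConfigParser()
    parser.read_string(data)
    raw = parser.get("vermin", "exclusion_regex", fallback="")
    lines = [line.strip() for line in raw.splitlines() if line.strip()]
    # Assumption: patterns are kept in sorted order.
    return [re.compile(p) for p in sorted(lines)]


compiled = parse_exclusion_regex(INI)
# Compare the source strings instead of the compiled pattern objects.
patterns = [regex.pattern for regex in compiled]
assert patterns == [r"\.pyi$", r"^a/b$"]
```

Comparing `regex.pattern` strings keeps the assertion independent of how compiled pattern objects compare for equality on a given Python version.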
fb9c699 to d8726ec
I don't know why the coverage API keeps failing all of a sudden. I'm going to test it myself and then merge when the checks succeed. |
There we go:
Have to handle paths on Windows with |
I think I fixed it in 661738f! Please let me know if you'd prefer me to use |
Thanks. That's fine but there's a problem:
https://github.com/netromdk/vermin/actions/runs/3209342269/jobs/5245969385#step:7:71 |
661738f to 1f5af7a
I tried to fix it using a combination of |
Sorry, now it seems to be getting close: https://github.com/netromdk/vermin/actions/runs/3209386859/jobs/5246055387#step:7:71
|
1f5af7a to d2025f4
Same thing, unfortunately: https://github.com/netromdk/vermin/actions/runs/3209397931/jobs/5246076076#step:7:71 |
d2025f4 to f2d46d2
The problem is that |
0ffbc5f to 4b5ba31
4b5ba31 to 8dfffa8
There's a stray 'x':
|
8dfffa8 to 31f50e3
Okay, same: https://github.com/netromdk/vermin/actions/runs/3209473950/jobs/5246216990#step:7:71 Need another approach here. The problem is something else. |
I agree! May have to call it quits for tonight, it's quite late here...thanks so much for your prompt attempts to help! I can finish this up this weekend. |
No worries. Thank you for all the hard work! :) |
hey @cosmicexplorer . should we try to get this one done? :) |
Hello @netromdk: yes absolutely, and thanks so much for continuing to follow up here! I had a very eventful couple weeks but will page back in now. Thanks again for the ping. |
Ok, I finally have a repro for the test failure on a google cloud windows vm! Will definitely be able to figure this out quickly now. |
31f50e3 to a682def
Ok, using I was worried that this would somehow mean windows users would have to add extra backslashes to their command lines, but I just tried it and there's no issue! See:
PS C:\Users\danieldmcclanahan\vermin> mkdir tmp
PS C:\Users\danieldmcclanahan\vermin> cd tmp
PS C:\Users\danieldmcclanahan\vermin> echo "print('this is code')" > wow.py
PS C:\Users\danieldmcclanahan\vermin> mkdir a
PS C:\Users\danieldmcclanahan\vermin> echo "print('this is code')" > wow.py
PS C:\Users\danieldmcclanahan\vermin> cd ../..
PS C:\Users\danieldmcclanahan\vermin> cmd /c .\vermin.py -vvv -t=2.7 -t=3.5- tmp
Detecting python files..
Analyzing 2 files using 16 processes..
...
(without any exclusions, it detects 2 files)
PS C:\Users\danieldmcclanahan\vermin> cmd /c .\vermin.py -vvv -t=2.7 -t=3.5- --no-make-paths-absolute --exclude-regex "^^tmp\\a\\[^^\\]+\.py$" tmp
Detecting python files..
Analyzing using 16 processes..
...
(this output on windows means it only sees one file)
PS C:\Users\danieldmcclanahan\vermin> cmd /c .\vermin.py -vvv -t=2.7 -t=3.5- --no-make-paths-absolute --exclude-regex "^^tmp\\.+\.py$" tmp
Detecting python files..
No files specified to analyze!
I was thinking about adding additional help examples for windows users, but since they don't have to do anything special except escape the |
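The two exclusion patterns from the transcript above can be checked directly with Python's `re` module. This sketch assumes the patterns reach Python with a single leading caret and the doubled backslashes intact, and that the two detected files are `tmp\wow.py` and `tmp\a\wow.py`; both are assumptions drawn from the transcript, not verified behaviour of the shell.

```python
import re

# Patterns as Python is assumed to receive them from the command line.
only_in_a = re.compile(r"^tmp\\a\\[^\\]+\.py$")
anything_under_tmp = re.compile(r"^tmp\\.+\.py$")

# Relative Windows-style paths, as seen with --no-make-paths-absolute.
paths = [r"tmp\wow.py", r"tmp\a\wow.py"]

# The first pattern excludes only the file directly inside tmp\a.
kept_first = [p for p in paths if not only_in_a.search(p)]
# The second pattern excludes every .py file below tmp.
kept_second = [p for p in paths if not anything_under_tmp.search(p)]

assert kept_first == [r"tmp\wow.py"]
assert kept_second == []
```

In a regex, each literal backslash in a Windows path has to be written as `\\`, which is why the patterns contain doubled backslashes even though the paths themselves do not.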
This is great! Thank you, @cosmicexplorer :)
There was some weirdness to the tests, as you know, but I've verified it for myself and your changes are merged! Thanks again! Since Python 3.11 has been released I'm going to release Vermin 1.5 very soon now. |
Problem
.pyi files are not executed by a python interpreter, so they will often have different python version constraints than python source files (indeed, this is the reason they exist). However, vermin considers .pyi files python source code, so it will fail .pyi files for things like import typing (which is the sole reason .pyi files are used) if the rest of the project must still be compatible with 2.7, for example.
While the existing --exclude option can avoid checking files given their module path, this fails when .pyi files are used, since .pyi files must have the same module path as the .py source file they provide types for. If the existing --exclude option is used to exclude a specific .pyi file, that also excludes the corresponding .py file from vermin analysis.
In order to use .pyi files with vermin in spack/spack#32919, I couldn't figure out anything more elegant than to temporarily rename all .pyi files before running vermin in our CI script: https://github.com/spack/spack/pull/32919/files#diff-200bb42ff2b106936781e2d2fd613af62e85d789323dd12eb370a1ce946fab44.
Solution
- [--exclude-regex <regex pattern>] ... to match file or directory paths which will be excluded from vermin analysis.
- --no-make-paths-absolute to allow --exclude-regex patterns to be relative to the directory vermin is invoked from.
Alternatives Considered
- # novm that would apply to the entire file, the way flake8 supports # flake8: noqa, but that would have precluded the use of --no-parse-comments, for example.
- mypy maintainer pointed to its config option which sets configuration per-module: Add ability to ignore a file or directory without modifying it python/mypy#4675 (comment). This is what --exclude already provides, but as described above it fails in the case of .pyi and .py files with the same name (the typical use case).
Result
If I want to add the first .pyi file to a project using vermin that requires 2.7 compatibility, I can simply modify the pyproject.toml to add:
without having to change any command lines.
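The collision described in the Problem section can be illustrated with a small sketch: a stub and its source file share one module path, so a module-path exclusion cannot tell them apart, while a regex over the file path can. `module_path` is a hypothetical helper written for this illustration and is not how vermin derives module paths.

```python
import re
from pathlib import PurePosixPath


def module_path(path):
    """Dotted module path for a file, ignoring its extension (illustrative)."""
    return ".".join(PurePosixPath(path).with_suffix("").parts)


files = ["pkg/mod.py", "pkg/mod.pyi"]

# Both files map to the same module path, so excluding "pkg.mod" hits both.
assert module_path(files[0]) == module_path(files[1]) == "pkg.mod"

# A path regex can single out the stub file by its extension.
exclude = re.compile(r"\.pyi$")
analyzed = [f for f in files if not exclude.search(f)]
assert analyzed == ["pkg/mod.py"]
```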