Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change the whitelist file extension in readme #182

Closed
impredicative opened this issue Dec 11, 2019 · 18 comments
Closed

Change the whitelist file extension in readme #182

impredicative opened this issue Dec 11, 2019 · 18 comments

Comments

@impredicative
Copy link

impredicative commented Dec 11, 2019

The example vulture mydir --make-whitelist > whitelist.py in the readme typically creates an invalid Python file since it lacks imports. Such a file would create a problem for multiple other static analyzers that would find it to be grossly invalid Python code. It would be better if this example were updated to to use the extension .txt instead as below:

$ vulture mydir --make-whitelist > vulture.txt
$ vulture mydir vulture.txt

The # Vulture whitelist: header can be added to the created file. In this way, the Python imports can continue to be missing in this whitelist file, and the other static analyzers won't have any issue with it.


I'm actually quite displeased with the current end-user whitelist setup. It doesn't seem well thought through at all. If I use a .py whitelist file with the appropriate imports, then vulture complains that the imports are unused. If I use a .txt whitelist file, then it's not possible to define which file each exclusion is supposed to have been imported from, making it a non-specific exclusion. If for example, I specify "my_foo_pkg.my_sub_pkg.xyzz" in a .txt file, then "my_foo_pkg.my_sub_pkg" are not checked.

@RJ722
Copy link
Contributor

RJ722 commented Dec 15, 2019

Hello @impredicative,

Thanks for the suggestions.

If I use a .txt whitelist file, then it's not possible to define which file each exclusion is supposed to have been imported from, making it a non-specific exclusion. If for example, I specify "my_foo_pkg.my_sub_pkg.xyzz" in a .txt file, then "my_foo_pkg.my_sub_pkg" are not checked.

This is actually much more problematic, even with .py extension, and part of it has to do with the fact that as Vulture parses the ast, it only uses the "names" of the items for comparison, without any information about the ancestry of that node -- https://github.com/jendrikseipp/vulture/blob/master/vulture/core.py#L50-L54

Let me give a more concrete example of this problem:

├── foo
│   ├── __init__.py
│   ├── bar
│   │   ├── __init__.py
│   │   └── ex1.py
│   └── ex2.py
└── whitelist.py
# foo/bar/ex1.py
def unused_func():
    pass
# foo/ex2.py
def unused_func();
    pass

Running vulture foo gives the following results:

foo/bar/ex1.py:1: unused function 'unused_func' (60% confidence)
foo/ex2.py:1: unused function 'unused_func' (60% confidence)

For ignoring only foo.bar.ex1.unused_func, whitelist file would look like:

# whitelist.py
from foo.bar import ex1

ex1.unused_func

But, vulture foo whitelist.py is clean - ignores both of them.

To counter this, Vulture would need to not only compare the names of the nodes, but also the associated metadata (mainly the chain of parents' name), which would be slower, but more specific.

FWIW, I'm actually in favor of the .txt (or a custom .whitelist extension).

@RJ722
Copy link
Contributor

RJ722 commented Dec 15, 2019

Also, one other fool-proof way to handle ignoring stuff is the flake8 plugin (#161), which given the interest of users, I'm now escalating from "might-do" to "will-do-in-next-week-as-soon-as-my-exams-end".

@impredicative
Copy link
Author

impredicative commented Dec 15, 2019

Also, all other major static analyzers support in-line comments and exclusions, just not vulture. There is much value in supporting an in-line exclusion comment, e.g. "# vulture-off" for all the nodes in that line only. This comment would also have to be compatible with other comments for other static analyzers that could exist in the same line.

@RJ722
Copy link
Contributor

RJ722 commented Mar 31, 2020

Hi @impredicative, now that #195 has been merged, and you can use # noqa, I think we can close this.

@impredicative
Copy link
Author

I would never use # noqa for this. No QA means to ignore all QA tools. Why would I want to ignore all QA tools for the line? It's fine that # noqa is supported, but what does it have to do with my more specific suggestion to support # vulture-off (as it doesn't harm other tools)?

@RJ722
Copy link
Contributor

RJ722 commented Mar 31, 2020

If you want to be more specific and only turn-off Vulture related issues, you can use # noqa: V10X (issue codes in README).

@impredicative
Copy link
Author

impredicative commented Mar 31, 2020

That's a vulture-specific syntax, but other tools may not read that syntax, and may still disable checking for the line entirely. It's risky for me to have the word "noqa" in a comment.

@RJ722
Copy link
Contributor

RJ722 commented Mar 31, 2020

I understand your point, but we chose to go with # noqa, as it is already pretty-standardized across many popular qa tools like flake8, pycodestyle, etc. whose standards Vulture closely mimics.

@impredicative
Copy link
Author

impredicative commented Mar 31, 2020

Just because some other tools are doing things the wrong way doesn't mean this tool should too. Please compare with tools like black and pylint, both of which have more specific syntax for ignoring lines or sections.

Just think about it. If I want to disable a tool X for a line, why does this have to mean that I should also disable all tools that are not X? Is the logic not clear?

Going with # noqa doesn't prohibit support for something better.

@jendrikseipp
Copy link
Owner

Please note that # noqa V10X is essentially the same as # vulture-off, but allows more fine-grained control by selecting the error codes to ignore.

@impredicative
Copy link
Author

impredicative commented Mar 31, 2020

@jendrikseipp I understand that perfectly, but I also know that not all QA tools will process "V10X" as vulture expects it. There is a high risk that various other QA tools will simply skip all checks for the line after seeing the word "noqa".

@impredicative
Copy link
Author

impredicative commented Mar 31, 2020

The pattern that pylint uses is # pylint: disable=some-named-err1,some-named-warn1, and this approach is perfect.

@The-Compiler
Copy link
Contributor

The-Compiler commented Mar 31, 2020

flake8 (which is where the # noqa syntax originates from, as far as I can tell) also supports # noqa: <error code>.

There is a high risk that various other QA tools will simply skip all checks for the line after seeing the word "noqa".

That would then be a bug in that tool. Note that it'd be incompatible with flake8 as well, which is probably one of the more common Python QA tools.

As another data point: It'd probably be incompatible with thousands of projects using that syntax.

@impredicative
Copy link
Author

impredicative commented Mar 31, 2020

Hi @impredicative, now that #195 has been merged, and you can use # noqa, I think we can close this.

What about the whitelist file extension? If it's going to remain .py, it must be a valid importable Python file, otherwise it should be a .txt file. I do think the example in the readme should be updated to use .txt.

@jendrikseipp
Copy link
Owner

I see your point, but still think that .py is more informative about the contents of the file than .txt. I added the following note to the README file:

Note that the resulting `whitelist.py` file will contain valid Python
syntax, but for Python to be able to *run* it, you will usually have to
make some modifications.

@impredicative
Copy link
Author

impredicative commented Apr 2, 2020

Note that the resulting whitelist.py file will contain valid Python syntax, but for Python to be able to run it, you will usually have to make some modifications.

Valid Python doesn't need modifications to run. Specifically, there are no imports in whitelist.py, and so the file cannot run. Even if it did have imports and could run, numerous other static analyzers would have a problem with it due to unused imports. Using a .txt extension addresses all of these problems. As it stands, the readme is not only a lie, it is also self-contradicting.

@impredicative
Copy link
Author

impredicative commented Apr 2, 2020

As an aside, the # noqa: <error code> pattern is a really bad idea because no one cares to remember what these error codes mean. The # noqa: <error-name-1>,<error-name-2> pattern is better if each error has a brief recognizable name, making it easier to memorize and reuse. Pylint does this better. In any case, supporting error codes doesn't prohibit support for error names.

@jendrikseipp
Copy link
Owner

I agree that adding support for speaking names to the noqa comments sounds like a good idea. However, my hope is that people will use whitelists more commonly than noqa comments since the latter add noise to the source code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants