Improve coverage of argparse module #103558
Comments
I'll work on this. |
I think that for something like argparse, "idiom coverage" is more important than line coverage or branch coverage. By "idiom coverage", I mean "Are there any reasonably common uses of argparse that aren't currently tested in the test suite?" The reason why it's currently so difficult to make changes to argparse is that many users have found ways of using the module that weren't anticipated when the code was written, and have grown to assume that these uses are supported because of the fact that the code has remained unchanged for so long. (Obligatory reference to Hyrum's Law.) Seemingly innocuous bugfixes to argparse often end up breaking these "unintended APIs", causing a lot of pain for users and making development of the module difficult. Unfortunately, "idiom coverage" is quite difficult to measure compared to line coverage or branch coverage. One possible way of working on improving this might be to look at popular PyPI projects that use argparse, and see if any of them use the module in unusual/advanced ways that aren't currently exercised in the argparse test suite. |
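As a rough illustration of what an "idiom coverage" test could look like, here is a minimal sketch that pins one usage pattern down as a regression test. The idiom chosen (subcommand dispatch through `set_defaults(func=...)`) is my own pick for illustration and is a documented pattern rather than one of the unanticipated uses discussed above; the class name and the `tool`/`greet` names are hypothetical.

```python
import argparse
import unittest


class TestSubcommandDispatchIdiom(unittest.TestCase):
    """Pins down one usage pattern so that a refactoring cannot silently break it."""

    def make_parser(self):
        # Hypothetical CLI: "tool greet NAME"
        parser = argparse.ArgumentParser(prog="tool")
        subparsers = parser.add_subparsers(dest="command")
        greet = subparsers.add_parser("greet")
        greet.add_argument("name")
        # The idiom under test: attach a handler to the namespace via set_defaults.
        greet.set_defaults(func=lambda ns: f"hello {ns.name}")
        return parser

    def test_handler_is_attached_and_callable(self):
        ns = self.make_parser().parse_args(["greet", "world"])
        self.assertEqual(ns.command, "greet")
        self.assertEqual(ns.func(ns), "hello world")
```

A test like this adds little line coverage, but it records that the pattern is relied upon.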
@AlexWaygood this is the next step :) |
From my experience working on this, "idiom coverage" without proper tools would be extremely difficult. To find gaps between the current PyPI project usage and our test suite, we need to know:
Neither is trivial to know. Do we work them out manually, using our human brains? That seems overwhelming. Their test coverage could be poor - the actual corner case is quite possibly not tested. No one knows exactly what we have tested. So if we want to do this, we need advanced tools. Line coverage and branch coverage are not a gold standard, but they are something we can rely on and, more importantly, they are feasible. We can take another step forward - stack coverage. Record all the stack states that appear (inside This method would still depend on the PyPI projects having decent tests, but by combining a lot of them, we should get pretty good coverage (compared to reading the docs and interpreting them). The only problem is - we don't have the tool yet. |
To be clear, I think working on improving line coverage and branch coverage is very useful! And I completely agree that "idiom coverage" is much harder. It's fine to work on line/branch coverage as a first step. But as the stats show, argparse's line coverage is already very high -- and from the history of bug reports relating to this module, I think "idiom coverage" is really what's lacking here. |
About the stack coverage I mentioned above: I did a really quick prototype of it. I ran our current test suite, then I tried a recent issue, #103498. The results show that this stack (along with some others) appeared in the code given in the prototype, but not in our test suite: ['parse_args:1869', 'parse_known_args:1906', '_parse_known_args:2142'] And that is exactly where the problem is - we never tested the combination of With a larger test base from other projects, we will have a much better idea of the stacks that we have never tested before; we could even label the uncovered stacks by frequency to check whether there's a common combination that we missed. Of course, this is super hacky right now and not ready to use, let alone for production, but I guess at least this shows a possibility - even for other complicated and heavily used libraries. |
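The prototype itself is not shown in this thread. The following is a minimal sketch of how such stack recording might be done, assuming `sys.setprofile` is an acceptable hook; the helper names (`record_stacks`, `baseline`, `candidate`) are hypothetical and the two workloads merely stand in for "our test suite" and "code from an outside project".

```python
import argparse
import sys

# Code objects defined in argparse.py all share this filename.
ARGPARSE_FILE = argparse.ArgumentParser.parse_args.__code__.co_filename


def _argparse_stack(frame):
    """Return the argparse-only frames as 'func:lineno' strings, outermost first."""
    entries = []
    while frame is not None:
        if frame.f_code.co_filename == ARGPARSE_FILE:
            entries.append(f"{frame.f_code.co_name}:{frame.f_lineno}")
        frame = frame.f_back
    return tuple(reversed(entries))


def record_stacks(func):
    """Run func() and return the set of argparse stacks seen at each call."""
    seen = set()

    def profiler(frame, event, arg):
        # Only Python-level calls into functions defined in argparse.py.
        if event == "call" and frame.f_code.co_filename == ARGPARSE_FILE:
            seen.add(_argparse_stack(frame))

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        func()
    finally:
        sys.setprofile(previous)
    return seen


def baseline():
    # Stand-in for the existing test suite: goes through parse_args.
    parser = argparse.ArgumentParser()
    parser.add_argument("--count", type=int)
    parser.parse_args(["--count", "1"])


def candidate():
    # Stand-in for outside code: calls parse_known_args directly.
    parser = argparse.ArgumentParser()
    parser.add_argument("--count", type=int)
    parser.parse_known_args(["--count", "1", "extra"])


# Stacks reached by the candidate workload but never by the baseline.
missing = record_stacks(candidate) - record_stacks(baseline)
```

Each element of `missing` is a tuple in the same `'function:line'` style as the stack quoted above. Real use would replace `baseline` with a run of the argparse test suite, and the results would still need human review.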
Found use-cases to cover (feel free to ping me, and I will update the list)
|
Actually, I realized that we don't even need the outside tests to find missing stacks. Our code coverage is high, which means almost all the lines are covered somehow. So we have a local calling relation - say For example, with a small prototype, I realized
    if self.exit_on_error:
        try:
            namespace, args = self._parse_known_args(args, namespace)
        except ArgumentError as err:
            self.error(str(err))
    else:
        namespace, args = self._parse_known_args(args, namespace)
Also found some false alarms :) It's not perfect and it still requires some human analysis, but I think it's at least a tool to move forward, compared to just looking at 5000+ lines of testing code and trying to figure out what's missing. |
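Both branches of the snippet above can be reached through the public API. The sketch below is only an illustration of that, not the gap the prototype reported; it assumes Python 3.9+ (where the `exit_on_error` parameter exists), and the helper names `parse_count` and `outcome` are hypothetical.

```python
import argparse
import contextlib
import io


def parse_count(argv, exit_on_error):
    """Parse argv with a parser that has a single integer option."""
    parser = argparse.ArgumentParser(prog="demo", exit_on_error=exit_on_error)
    parser.add_argument("--count", type=int)
    return parser.parse_known_args(argv)


def outcome(argv, exit_on_error):
    """Describe how parsing ended, keeping argparse's usage message off the console."""
    with contextlib.redirect_stderr(io.StringIO()):
        try:
            parse_count(argv, exit_on_error)
        except argparse.ArgumentError:
            # exit_on_error=False: the error propagates to the caller.
            return "ArgumentError"
        except SystemExit as exc:
            # exit_on_error=True: self.error() prints usage and exits.
            return f"SystemExit({exc.code})"
    return "ok"


bad_argv = ["--count", "not-a-number"]
results = {
    "exit_on_error=True": outcome(bad_argv, True),
    "exit_on_error=False": outcome(bad_argv, False),
    "valid input": outcome(["--count", "3"], True),
}
```

An invalid integer is used because it raises `ArgumentError` inside `_parse_known_args`, which is the exception the `try` branch converts into a call to `self.error`.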
In the case of
Changing how I don't know if correcting coverage as proposed here will cover these kinds of problems. |
Co-authored-by: Shantanu <[email protected]> Co-authored-by: hauntsaninja <[email protected]>
Thanks! |
We tackled the low-hanging fruit here, but AFAIK we never got round to any of the more advanced ways of analysing coverage, as floated in #103558 (comment). #103558 (comment) still has an unchecked box, as well. So maybe there's still some stuff to do here. Having said that, I don't know if @sobolevn or @gaogaotiantian are still interested in working on this (I probably won't be ;). If neither has any plans to work on this, we may as well leave this closed for now. |
I believe the major issue with |
None of the currently active core devs have deep expertise in argparse, but we still need to fix bugs in this module. The best way to start is to add more tests and increase the coverage of this module. Right now it has a pretty solid coverage of 97%.
You can get this yourself:
Here's the initial (current) state report. argparse_coverage.tar.gz
Any help is welcome :)