Start moving to pytest #1723
Conversation
…est trying to run it
Actually, I got xdist to work (see pytest-dev/pytest-xdist#72). Finished all the tests in 32 seconds across 3 cores!
Awesome, thanks! Really looking forward to completing this move and also to running the unit tests in parallel.

One thing that's useful both in the final commit message (for people reading the change later to understand it) and in the process of code review (for reviewers to understand the changes) is to write up in one place a description of everything that's changing and why. If you look at recent commits, most of them that aren't trivial changes are good examples; in particular see these by Guido, me, David, and Reid.

The way these work logistically in our GitHub workflow is that when one of us is ready to hit the "Merge pull request" button, we'll get a textarea right there for the final commit message. It'll be pre-filled with all your commit messages from the branch, and typically we'll take the relevant parts of that and also copy in relevant paragraphs from your PR message and any comments. So the message when first opening the PR is a great place to put this description, and when you've already opened a PR, a comment is a fine place to add it. Would you write one now?

I'm also just about to read through the code.
$ ./runtests.py pytest -t -k -t NestedListAssignment
$ py.test -k NestedListAssignment
"To run these, you must use pytest instead", sounds like these tests might not run if you just say ./runtests.py
. They do, right?
Maybe a clearer description here would be something like "To run these unit tests individually, you can run py.test
directly, or pass inferior arguments with -t
." (I think you mean -t
rather than -k
-- is that right or am I misreading the example?)
👍
(Going afk for a bit, will read the rest later. Thanks again!)
@@ -209,3 +206,5 @@ def parse_flags(self, program_text: str) -> List[str]:
            return m.group(1).split()
        else:
            return []

TestTypeCheck.setup_tests()
Hmm, typically the magic of pytest makes it so you don't need to make this kind of call at the top level. Is there another way we can prepare these?
Also, "setup" in the pytest context usually refers to something you do just before running some tests in order to set up some state that the test itself, or the application code it's testing, expect in order for the test to run. It's not normally something you'd want to do at toplevel like this. Is "collect" or "test collection" more what this step is about?
Nice!
GitHub is indicating in the status box at the bottom of this PR page that there are "conflicts". I think these are easy -- Git actually handles everything automatically when I locally try a rebase or merge from your branch, and the only thing it even mentions having to do anything about is that there were some changes in one file.

So you'll want to rebase eventually, but there's no rush -- actually I think it's best if you hold off on rebasing until the PR is otherwise nearly ready to merge, because if you do it now I think GitHub will get confused into thinking all the comments on specific lines are on "obsolete" versions of your changes and will hide them.
As it happens, I spent a number of hours last week becoming more familiar with the internals of pytest and with how to configure it -- two subjects which are unfortunately hard to separate from each other, because one thing pytest really is not good at is documentation -- than I'd ever really desired to be, for the sake of some non-mypy-related stuff at work. So I may be able to make some pretty specific comments and suggestions on how to manipulate pytest into doing exactly what we want here as smoothly as possible. Let's see.
BTW, I really like how you've set things up so that the various […]
Let's zoom out a bit to discuss what we'd like this all to look like in pytest terms. (Once we have a clear picture of this that we'd like, it would be a great thing to explain in the commit message! Or in fact, the parts of it that are directly about the code as it ends up, rather than the relationship to myunit, will be good for a long comment and/or docstring somewhere, perhaps around […].)

As you might already know from whatever study you've made of how pytest works, it operates basically in two phases: first it "collects" a bunch of test "items", then it runs those items. Collection is a recursive process: pytest descends the filesystem tree looking for files that appear to be test files, and invokes appropriate collectors for each of those; for a Python file, the usual collector will import it as a module and then recursively collect promising functions and classes in the module, and the usual collector on a class will recurse further. Every step of this process can be hooked into and modified.

What we want here is to hook into the point where pytest tries to collect one of our […]. So it adds an implementation of the hook called `pytest_pycollect_makeitem`. Here, I think you'd basically want that hook implementation to end up calling your […]. That will fit in better with the normal pytest flow of things, and will let you avoid calling `setup_tests()` at module toplevel.
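(A rough sketch of the shape being described -- the hook `pytest_pycollect_makeitem` and the base classes `pytest.Class`/`pytest.Item` are real pytest API, and `from_parent` is the modern pytest constructor; `MypyDataSuite`, `MypyDataCase`, and the `cases()` protocol are placeholders, not necessarily what this PR ends up with:)

```python
# conftest.py -- rough sketch, assuming each PytestSuite subclass exposes a
# class-level cases() returning objects that know their name and how to run.
import inspect

import pytest

from wherever_the_pr_defines_it import PytestSuite  # placeholder import


class MypyDataCase(pytest.Item):
    """One data-driven test case wrapped as a pytest item (hypothetical)."""

    def __init__(self, name, parent, case):
        super().__init__(name, parent)
        self.case = case

    def runtest(self):
        self.case.run()  # assumes a case object knows how to run itself


class MypyDataSuite(pytest.Class):
    """Collects a PytestSuite subclass by asking it for its cases."""

    def collect(self):
        for case in self.obj.cases():
            yield MypyDataCase.from_parent(self, name=case.name, case=case)


def pytest_pycollect_makeitem(collector, name, obj):
    # Hook into class collection: when pytest finds one of our suite
    # classes, hand back the custom collector for it instead.
    if (inspect.isclass(obj) and issubclass(obj, PytestSuite) and
            obj is not PytestSuite):
        return MypyDataSuite.from_parent(collector, name=name)
```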
Also, I think I see now why your […]. Then in the pytest world, when your […].
Haven't yet read the […].
Is this stalled? Maybe it would be easier to get this moving if it wasn't such a massive set of diffs?
@gvanrossum Sorry for the delay! I just fixed most of the issues mentioned. @gnprice Two things: […]
Thanks for the update! Reading in detail now.
def pytest_pycollect_makeitem(collector, name, obj):
    if (inspect.isclass(obj) and issubclass(obj, PytestSuite) and
            obj is not PytestSuite):
Can you avoid the `and obj is not PytestSuite` part here? A special case like this always makes me a little uneasy that there's a bug, or at least a gotcha in the API, hiding somewhere.

Ah, in fact I think what it would take to make this work cleanly without that condition is just to add a `cases` method right on the `PytestSuite` base class that just returns an empty list. Will continue in an inline comment over there for better context.
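(A minimal sketch of that suggestion, assuming `cases()` is a class-level method as in the earlier collection sketch; `PytestSuite` is the suite base class from this PR:)

```python
class PytestSuite:
    @classmethod
    def cases(cls):
        # Default: no test cases. Collecting the base class itself then
        # yields nothing, so the collection hook no longer needs the
        # "obj is not PytestSuite" special case.
        return []
```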
So here's some reading, and thoughts, on how to do this test collection using the pytest API as documented. As I said above, I think the existing core logic will do fine, and it shouldn't be too complicated to put it inside that API.

There are two relevant examples in the pytest codebase itself, for doctest and unittest (i.e. the stdlib's `unittest`). The unittest module seems to have a bunch of mess in it at this point, most of that apparently due to wanting to support […]. The doctest example is perhaps simpler -- in […].

So in brief, I think it'd go like […]
Sure! I guess this is especially relevant after my last comment. :-)

Yep, as I said in that comment just now, I'd have a method like […].

About cutting out […]: Let's take our favorite example, […]. Another way this could be done is: just as […]. This saves a layer of indirection in running the test. It also saves a parameter in the rather complicated signature of […]. Some of the other drivers don't currently have a […].

Does that help? Are there aspects of this that still don't make sense? Happy to discuss further. Looking forward to seeing this migrated!
I finally got around to trying this. The general py.test workflow looks great, thanks for working on this! There are a few details that probably need some more work, discussed below. Some of them were also discussed previously, but I'm repeating them here as I may have a slightly different take on things.

In general I'd like to get the most important details right before merging this. The previous big test runner revamp from 2015 introduced a bunch of regressions and slowed down productivity for a long while, and some of the regressions still haven't been fixed. If we merge a not-quite-fully-finished workflow, it has a tendency to become the norm and developers just get used to it, and nobody has enough incentive or time to further improve things. I've seen this happen over and over again :-( Just preserving existing functionality and important UX niceties is good enough -- no need to implement new stuff such as unified context diffs.

1) Displaying actual/expected output

I want to preserve (almost) verbatim actual/expected output (lightweight diff markers are okay) in the py.test output, even when running without […]. Specifics:
Editing this to a clean test output requires a few too many steps for my liking (also, the non-verbose output isn't very useful).

2) Parallel tests

I can't get the xdist plugin to work for running tests in parallel. This is what I get when I run […]
3) Stack trace formatting

I'd prefer the short stack trace format by default, which just shows the line where the failure happened instead of the full function body. I have this enabled on other projects I work on that use py.test, and I think the short-format stack trace is better, at least for running tests interactively. In Travis CI it may make sense to include the long stack trace format for more context, but I'm not sure about that. This is particularly relevant for mypy, as for some of the most common test failures (unexpected test case output only) the stack trace isn't very useful.

4) -i and -u options

These options automatically change test case descriptions based on actual output, and they have saved a lot of work in the past when doing large-scale changes to error messages. Not sure if these still work, but it would be really nice to still have them.
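(On point 1 above: pytest does let a custom test item control how its failures are rendered, via `repr_failure`. A rough sketch of keeping the verbatim expected/actual comparison -- the class and exception names are illustrative, not from this PR:)

```python
import pytest


class DataFailure(Exception):
    """Carries a pre-formatted expected/actual comparison (illustrative)."""


class MypyDataCase(pytest.Item):
    # Simplified variant of the earlier hypothetical item: it holds its
    # expected and actual output directly.
    def __init__(self, name, parent, expected, actual):
        super().__init__(name, parent)
        self.expected = expected
        self.actual = actual

    def runtest(self):
        if self.actual != self.expected:
            lines = ['Expected:'] + self.expected + ['Actual:'] + self.actual
            raise DataFailure('\n'.join(lines))

    def repr_failure(self, excinfo):
        # For our own failures, show only the formatted comparison,
        # not a full Python traceback.
        if isinstance(excinfo.value, DataFailure):
            return str(excinfo.value)
        return super().repr_failure(excinfo)
```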
This PR has been quiet for a few weeks -- @kirbyfan64, I assume you've become busy for a bit -- and I wouldn't want this work to go to waste. So I spent a few hours today putting together a new attempt, #1944. It takes ideas from @kirbyfan64's original PR here, with adjustments following my suggestions above, and also taking into account @JukkaL's request just above to pay close attention to putting similar information and verbosity in the output to what myunit does. I believe if we merge that, it will largely replace this PR, but there'll be plenty of follow-on work to do to complete #1673.
…over to it (#1944)

This is a step toward #1673 (switching entirely to pytest from myunit and runtests.py), using some of the ideas developed in @kirbyfan64's PR #1723.

Both `py.test` with no arguments and `py.test mypy/test/testcheck.py` work just as you'd hope, while `./runtests.py` continues to run all the tests. The output is very similar to the myunit output. It doesn't spew voluminous stack traces or other verbose data, and it continues using `assert_string_arrays_equal` to produce nicely-formatted comparisons when e.g. type-checker error messages differ. On error it even includes the filename and line number for the test case itself, which I've long wanted, and I think pytest's separator lines and coloration make the output slightly easier to read when there are multiple failures.

The `-i` option from myunit is straightforwardly ported over as `--update-data`, giving it a longer name because it feels like the kind of heavyweight and uncommon operation that deserves such a name. It'd be equally straightforward to port over `-u`, but in the presence of source control I think `--update-data` does the job on its own.

One small annoyance is that if there's a failure early in a test run, pytest doesn't print the detailed report on that failure until the whole run is over. This has often annoyed me in using pytest on other projects; useful workarounds include passing `-x` to make it stop at the first failure, `-k` to filter the set of tests to be run, or (especially with our tests where errors often go through `assert_string_arrays_equal`) `-s` to let stdout and stderr pass through immediately. For interactive use I think it'd nearly always be preferable to do what myunit does by immediately printing the detailed information, so I may come back to this later to try to get pytest to do that.

We don't yet take advantage of `xdist` to parallelize within a `py.test` run (though `xdist` works for me out of the box in initial cursory testing). For now we just stick with the `runtests.py` parallelization, so we set up a separate `py.test` command for each test module.
Ref. #1673.
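(For reference, the usual way to wire up an option like `--update-data` through pytest is the `pytest_addoption` hook in `conftest.py`; a small sketch -- the wiring shown is illustrative, not necessarily what #1944 does:)

```python
def pytest_addoption(parser):
    # Register the flag so "py.test --update-data ..." is accepted.
    parser.addoption('--update-data', action='store_true', default=False,
                     help='Rewrite the expected output in the test data '
                          'files to match the actual output.')

# A test item can then check the flag when a case fails, e.g.:
#     if item.config.getoption('--update-data'):
#         ...rewrite the test data file to match the actual output...
```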
Sorry this took so long! I was busy (ACT)/distracted (E3)/everything else...
In addition, I ended up moving over the command-line tests and semanal tests as well as the check tests, since it was pretty trivial.
Also, I noticed the comment in `runtests.py` saying that the command-line tests took longer than regular unit tests. Why is that? They only take 7s, vs. around 45s for everything else -- not a significant difference.

Unfortunately, I can't actually try this in parallel because I think I keep hitting pytest-dev/pytest#596...