-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix test reordering for indirect parameterization #9350
Conversation
@@ -22,13 +22,13 @@ def checked_order(): | |||
assert order == [ | |||
("issue_519.py", "fix1", "arg1v1"), | |||
("test_one[arg1v1-arg2v1]", "fix2", "arg2v1"), | |||
("test_two[arg1v1-arg2v1]", "fix2", "arg2v1"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this is odd, because we have increased the number of setups/teardowns for fixture 2 with this change.
On main
, we have 2 s/t for fix1
, and 4 s/t for fix2
.
Here we have the same 2 s/t for fix1
, but now we have 8 s/t for fix2
. Seems like a regression for this case. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now we have 8 s/t for fix2
Yes, but we also had 8 s/t for fix2
even before this PR. fix2
is always (and correctly) recreated, because it has function scope. It's just not obvious from the assert statement. Tested it by adding print output pre and post the yield in foo{1,2}
. But please double check...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the same holds for fixture arg2
, it also has function scope and therefore shall always be recreated.
fix1
and arg1
have module scope, and both are successfully reused even after the PR, are they?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh right my bad, missed the fact that fix2
is function scoped, so its order there doesn't matter as it will be recreated each time. 👍
Thanks @haxtibal! The PR looks great, but I left a comment on one of the tests you changed, please take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM thanks!
It would be nice if another maintainer also reviewed this (@RonnyPfannschmidt @bluetech)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @haxtibal, interesting. This does seem to give better behavior.
To give a proper review, I would need to dig into the details here (like understand the surrounding code, and figure out why the param index was even used in the first place). That might take some time, so I wouldn't want to hold the PR for too long.
One possible problem with using the param value is that some params might be large (like large numpy arrays and what not) and comparing them might be slow. See for example 80d4dd6. We will need to consider if this is a problem for this case or not.
src/_pytest/fixtures.py
Outdated
@attr.s(auto_attribs=True, eq=False) | ||
class SafeHashWrapper: | ||
obj: Any | ||
|
||
def __eq__(self, other) -> Any: | ||
try: | ||
res = self.obj == other | ||
bool(res) | ||
return res | ||
except Exception: | ||
return id(self.obj) == id(other) | ||
|
||
def __hash__(self) -> Any: | ||
if isinstance(self.obj, Hashable): | ||
return hash(self.obj) | ||
return hash(id(self.obj)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions here (I haven't checked that they all work):
@attr.s(auto_attribs=True, eq=False) | |
class SafeHashWrapper: | |
obj: Any | |
def __eq__(self, other) -> Any: | |
try: | |
res = self.obj == other | |
bool(res) | |
return res | |
except Exception: | |
return id(self.obj) == id(other) | |
def __hash__(self) -> Any: | |
if isinstance(self.obj, Hashable): | |
return hash(self.obj) | |
return hash(id(self.obj)) | |
@attr.s(auto_attribs=True, eq=False, slots=True) | |
class SafeHashWrapper: | |
obj: object | |
def __eq__(self, other: object) -> bool: | |
try: | |
res = self.obj == other | |
return bool(res) | |
except Exception: | |
return id(self.obj) == id(other) | |
def __hash__(self) -> int: | |
if isinstance(self.obj, Hashable): | |
return hash(self.obj) | |
return hash(id(self.obj)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank's, applied all except obj: object
, because it makes mypy think obj
is always hashable and therefore we have an unreachable statement in SafeHashWrapper.__hash__
.
I think the type constraint I'd like to express for obj
is "type supports identity" and "type is not NoneType". But I don't know a way how, do you? EDIT: See python/mypy#11799.
A bit more OT, if isinstance(object(), Hashable)
and isinstance(dict(), object)
are True
, doesn't isinstance(dict(), Hashable) == False
break Liskov?
testing/python/fixtures.py
Outdated
@@ -1312,6 +1312,36 @@ def test2(no_eq): | |||
result = pytester.runpytest() | |||
result.stdout.fnmatch_lines(["*4 passed*"]) | |||
|
|||
def test_reorder_indirect(self, pytester: Pytester) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A docstring with a reference to the PR (since there is no issue) would be good.
BTW, if you explicitly want to test the reordering here and not the runtime behavior, I suggest running the pytester with --collect-only --quiet
which just spits the test order, and checking that, instead of doing prints and all that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explicitly want to test the reordering here
Afaik pytest doesn't give guarantees about absolute test order, but only about certain constraints. So I intended to assert for the number of setups/teardowns to be minimal, not for a specific sequence. I think that's only possible if we actually let the fixture requests happen. They happen after collect.
But --collect-only
could still be good for kind of unit testing the reorder function, would such an additional test be useful?
Thanks for the review @nicoddemus and @bluetech
0cfd873 and #244 give some insight. Seems we switched from value to index in 2013 because unhashable params caused failure. This PR works around the unhashable problem by falling back to
Agreed. Comparing here means hashing, but that could be slow too. It's probably not too bad, but still... A way out could be to consider a user provided id, instead of calculating the hash (such a user id would also be better than the
They're special, as they're unhashable and won't compare to |
5a50852
to
dfc773e
Compare
They could :) dfc773e demonstrates how. Not yet code complete, but all tests pass. Just wanted to hear your opinion about the approach early. The benefit should be obvious from this example: @pytest.mark.parametrize("fix", [1, pytest.param({"data": 2}, id="2")], indirect=True)
def test1(fix):
pass
@pytest.mark.parametrize("fix", [pytest.param({"data": 2}, id="2"), 1], indirect=True)
def test2(fix):
pass Instead of subtle surprises triggered by implicit fallback to identity compare, a user can now explicitly control which How shall we proceed from here? EDIT: @nicoddemus Sorry, just realized you've already approved, and my experimental commit totally invalidates that approve. Now I reverted this PR to the conservative fix, and created a new branch feature/param_id_is_key for the experimental one. |
dfc773e
to
2fdbc3e
Compare
@haxtibal Using ids sounds interesting. I'm not sure if it can work, but if does, it would be an elegant solution to the problem. Your commit is hard to understand directly (because the underlying code is not the prettiest). What would be great is if you could:
|
Add a test to assert pytest-dev#8914 is fixed. The test assures that both reordering and caching work as intended, and demonstrates how one can use parameter ids to decide about equality. Adapting issue_519.checked_order is not cheating. See our discussion at pytest-dev#9350 (review)
Done, see #9420. It's now basically a combination of using parameter ids from API plus a fallback to the behavior introduced in #9350.
Did it as series of smaller commits, each with reasoning in commit message, hope that helps a bit.
All tests pass, "can work" is "yes" until proven otherwise😏 |
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering and caching work as intended, and demonstrates how one can use parameter ids to decide about equality. Adapting issue_519.checked_order is not cheating. See our discussion at pytest-dev#9350 (review)
reorder_items groups tests so that each group can share indirectly parameterized fixture instances. This optimizes for fewer fixture setup/teardown cycles. Prior to this commit, grouping considered an parameter's index, not the parameter value itself, to determine whether a parameter is "the same" and therefore can be shared. Relying on indexes is fast, however only works when there's exactly one parameter list for one fixture. If we provide multiple (possibly different) lists for the same fixture, e.g. by decorating test items, one index can no longer refer to "the same" parameter. In order to still be able to group for fixture reuse, we have to group by parameter value. Caution: The value ends up inside the key of another dict, and therefore must be hashable. This was true for indexes, but no longer is guaranteed for user provided values. A user may e.g. provide dicts or numpy arrays. The SafeHashWrapper ensures a fallback to id() in such a case. Fixes pytest-dev#8914.
2fdbc3e
to
4a314bf
Compare
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering and caching work as intended, and demonstrates how one can use parameter ids to decide about equality. Adapting issue_519.checked_order is not cheating. See our discussion at pytest-dev#9350 (review)
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering and caching work as intended, and demonstrates how one can use parameter ids to decide about equality. Adapting issue_519.checked_order is not cheating. See our discussion at pytest-dev#9350 (review)
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering and caching work as intended, and demonstrates how one can use parameter ids to decide about equality. Adapting issue_519.checked_order is not cheating. See our discussion at pytest-dev#9350 (review)
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering and caching work as intended, and demonstrates how one can use parameter ids to decide about equality. Adapting issue_519.checked_order is not cheating. See our discussion at pytest-dev#9350 (review)
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering and caching work as intended, and demonstrates how one can use parameter ids to decide about equality. Adapting issue_519.checked_order is not cheating. See our discussion at pytest-dev#9350 (review)
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering and caching work as intended, and demonstrates how one can use parameter ids to decide about equality. Adapting issue_519.checked_order is not cheating. See our discussion at pytest-dev#9350 (review)
Agreed, let's close this in favor of #9420. The latter will bother maintainers. There's potential of breaking use cases or user expectations and I'm not in a position to foresee them all. I'd help where I can, but it's your turn to signal what's next. |
reorder_items
groups tests so that each group can share indirectly parameterized fixture instances. This optimizes for fewer fixture setup/teardown cycles. Prior to this PR, grouping considered an parameter's index, not the parameter value itself, to determine whether a parameter is "the same" and therefore can be shared.Relying on indexes is fast, however only works when there's exactly one parameter list for one fixture. If we provide multiple (possibly different) lists for the same fixture, e.g. by decorating test items, one index can no longer refer to "the same" parameter. In order to still be able to group for fixture reuse, we have to group by parameter value.
Caution: The value ends up inside the key of another dict, and therefore must be hashable. This was true for indexes, but no longer is guaranteed for user provided values. A user may e.g. provide
dict
s or numpy arrays. TheSafeHashWrapper
ensures a fallback toid()
in such a case.Fixes #8914.