Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix test reordering for indirect parameterization #9350

Closed

Conversation

haxtibal
Copy link
Contributor

reorder_items groups tests so that each group can share indirectly parameterized fixture instances. This optimizes for fewer fixture setup/teardown cycles. Prior to this PR, grouping considered an parameter's index, not the parameter value itself, to determine whether a parameter is "the same" and therefore can be shared.

Relying on indexes is fast, however only works when there's exactly one parameter list for one fixture. If we provide multiple (possibly different) lists for the same fixture, e.g. by decorating test items, one index can no longer refer to "the same" parameter. In order to still be able to group for fixture reuse, we have to group by parameter value.

Caution: The value ends up inside the key of another dict, and therefore must be hashable. This was true for indexes, but no longer is guaranteed for user provided values. A user may e.g. provide dicts or numpy arrays. The SafeHashWrapper ensures a fallback to id() in such a case.

Fixes #8914.

@@ -22,13 +22,13 @@ def checked_order():
assert order == [
("issue_519.py", "fix1", "arg1v1"),
("test_one[arg1v1-arg2v1]", "fix2", "arg2v1"),
("test_two[arg1v1-arg2v1]", "fix2", "arg2v1"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is odd, because we have increased the number of setups/teardowns for fixture 2 with this change.

On main, we have 2 s/t for fix1, and 4 s/t for fix2.

Here we have the same 2 s/t for fix1, but now we have 8 s/t for fix2. Seems like a regression for this case. 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now we have 8 s/t for fix2

Yes, but we also had 8 s/t for fix2 even before this PR. fix2 is always (and correctly) recreated, because it has function scope. It's just not obvious from the assert statement. Tested it by adding print output pre and post the yield in foo{1,2}. But please double check...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the same holds for fixture arg2, it also has function scope and therefore shall always be recreated.

fix1 and arg1 have module scope, and both are successfully reused even after the PR, are they?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh right my bad, missed the fact that fix2 is function scoped, so its order there doesn't matter as it will be recreated each time. 👍

@nicoddemus
Copy link
Member

Thanks @haxtibal!

The PR looks great, but I left a comment on one of the tests you changed, please take a look?

Copy link
Member

@nicoddemus nicoddemus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks!

It would be nice if another maintainer also reviewed this (@RonnyPfannschmidt @bluetech)

Copy link
Member

@bluetech bluetech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @haxtibal, interesting. This does seem to give better behavior.

To give a proper review, I would need to dig into the details here (like understand the surrounding code, and figure out why the param index was even used in the first place). That might take some time, so I wouldn't want to hold the PR for too long.

One possible problem with using the param value is that some params might be large (like large numpy arrays and what not) and comparing them might be slow. See for example 80d4dd6. We will need to consider if this is a problem for this case or not.

Comment on lines 244 to 256
@attr.s(auto_attribs=True, eq=False)
class SafeHashWrapper:
obj: Any

def __eq__(self, other) -> Any:
try:
res = self.obj == other
bool(res)
return res
except Exception:
return id(self.obj) == id(other)

def __hash__(self) -> Any:
if isinstance(self.obj, Hashable):
return hash(self.obj)
return hash(id(self.obj))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some suggestions here (I haven't checked that they all work):

Suggested change
@attr.s(auto_attribs=True, eq=False)
class SafeHashWrapper:
obj: Any
def __eq__(self, other) -> Any:
try:
res = self.obj == other
bool(res)
return res
except Exception:
return id(self.obj) == id(other)
def __hash__(self) -> Any:
if isinstance(self.obj, Hashable):
return hash(self.obj)
return hash(id(self.obj))
@attr.s(auto_attribs=True, eq=False, slots=True)
class SafeHashWrapper:
obj: object
def __eq__(self, other: object) -> bool:
try:
res = self.obj == other
return bool(res)
except Exception:
return id(self.obj) == id(other)
def __hash__(self) -> int:
if isinstance(self.obj, Hashable):
return hash(self.obj)
return hash(id(self.obj))

Copy link
Contributor Author

@haxtibal haxtibal Dec 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank's, applied all except obj: object, because it makes mypy think obj is always hashable and therefore we have an unreachable statement in SafeHashWrapper.__hash__.

I think the type constraint I'd like to express for obj is "type supports identity" and "type is not NoneType". But I don't know a way how, do you? EDIT: See python/mypy#11799.

A bit more OT, if isinstance(object(), Hashable) and isinstance(dict(), object) are True, doesn't isinstance(dict(), Hashable) == False break Liskov?

src/_pytest/fixtures.py Outdated Show resolved Hide resolved
testing/python/fixtures.py Show resolved Hide resolved
src/_pytest/fixtures.py Outdated Show resolved Hide resolved
@@ -1312,6 +1312,36 @@ def test2(no_eq):
result = pytester.runpytest()
result.stdout.fnmatch_lines(["*4 passed*"])

def test_reorder_indirect(self, pytester: Pytester) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A docstring with a reference to the PR (since there is no issue) would be good.

BTW, if you explicitly want to test the reordering here and not the runtime behavior, I suggest running the pytester with --collect-only --quiet which just spits the test order, and checking that, instead of doing prints and all that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explicitly want to test the reordering here

Afaik pytest doesn't give guarantees about absolute test order, but only about certain constraints. So I intended to assert for the number of setups/teardowns to be minimal, not for a specific sequence. I think that's only possible if we actually let the fixture requests happen. They happen after collect.

But --collect-only could still be good for kind of unit testing the reorder function, would such an additional test be useful?

@haxtibal
Copy link
Contributor Author

haxtibal commented Dec 8, 2021

Thanks for the review @nicoddemus and @bluetech

figure out why the param index was even used in the first place

0cfd873 and #244 give some insight. Seems we switched from value to index in 2013 because unhashable params caused failure. This PR works around the unhashable problem by falling back to id() via the SafeHashWrapper.

some params might be large (like large numpy arrays and what not) and comparing them might be slow.

Agreed. Comparing here means hashing, but that could be slow too. It's probably not too bad, but still... A way out could be to consider a user provided id, instead of calculating the hash (such a user id would also be better than the id() fallback for unhashable types). Maybe even pytest.param(id="use_this_one_for_reorder") could be good enough? Give me some time, I'll evaluate that.

numpy arrays

They're special, as they're unhashable and won't compare to bool. It's a case where the SafeHashWrapper falls back to id().

@bluetech
Copy link
Member

bluetech commented Dec 8, 2021

0cfd873 and #244 give some insight.

Indeed, knowing it was originally value based and only switched to indexes for the hashability reassures me that there's not something we're missing.

@haxtibal haxtibal force-pushed the bugfix/reorder_indirect_params branch from 5a50852 to dfc773e Compare December 9, 2021 22:38
@haxtibal
Copy link
Contributor Author

haxtibal commented Dec 9, 2021

Maybe even pytest.param(id="use_this_one_for_reorder") could be good enough?

They could :) dfc773e demonstrates how. Not yet code complete, but all tests pass. Just wanted to hear your opinion about the approach early.

The benefit should be obvious from this example:

@pytest.mark.parametrize("fix", [1, pytest.param({"data": 2}, id="2")], indirect=True)
def test1(fix):
    pass

@pytest.mark.parametrize("fix", [pytest.param({"data": 2}, id="2"), 1], indirect=True)
def test2(fix):
    pass

Instead of subtle surprises triggered by implicit fallback to identity compare, a user can now explicitly control which
parameters are "the same". The id arg seems like the natural choice for that purpose. Otoh, the changes are less
trivial compared to the first commit, so I can understand if you don't like it.

How shall we proceed from here?

EDIT: @nicoddemus Sorry, just realized you've already approved, and my experimental commit totally invalidates that approve. Now I reverted this PR to the conservative fix, and created a new branch feature/param_id_is_key for the experimental one.

@haxtibal haxtibal force-pushed the bugfix/reorder_indirect_params branch from dfc773e to 2fdbc3e Compare December 10, 2021 12:13
@bluetech
Copy link
Member

@haxtibal Using ids sounds interesting. I'm not sure if it can work, but if does, it would be an elegant solution to the problem.

Your commit is hard to understand directly (because the underlying code is not the prettiest). What would be great is if you could:

  1. Open a separate PR for the id approach, based directly on main.
  2. Add comments (can be code comments or github comments) explaining what each non-trivial change is doing.

haxtibal pushed a commit to haxtibal/pytest that referenced this pull request Dec 16, 2021
Add a test to assert pytest-dev#8914 is fixed. The test assures that both reordering
and caching work as intended, and demonstrates how one can use parameter
ids to decide about equality.

Adapting issue_519.checked_order is not cheating. See our discussion at
pytest-dev#9350 (review)
@haxtibal
Copy link
Contributor Author

@bluetech

  1. Open a separate PR for the id approach, based directly on main.

Done, see #9420. It's now basically a combination of using parameter ids from API plus a fallback to the behavior introduced in #9350.

  1. Add comments (can be code comments or github comments) explaining what each non-trivial change is doing.

Did it as series of smaller commits, each with reasoning in commit message, hope that helps a bit.

if it can work

All tests pass, "can work" is "yes" until proven otherwise😏

@nicoddemus nicoddemus self-requested a review December 17, 2021 10:46
haxtibal pushed a commit to haxtibal/pytest that referenced this pull request Dec 18, 2021
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering
and caching work as intended, and demonstrates how one can use parameter
ids to decide about equality.

Adapting issue_519.checked_order is not cheating. See our discussion at
pytest-dev#9350 (review)
reorder_items groups tests so that each group can share indirectly
parameterized fixture instances. This optimizes for fewer fixture
setup/teardown cycles. Prior to this commit, grouping considered an
parameter's index, not the parameter value itself, to determine whether
a parameter is "the same" and therefore can be shared.

Relying on indexes is fast, however only works when there's exactly one
parameter list for one fixture. If we provide multiple (possibly
different) lists for the same fixture, e.g. by decorating test items,
one index can no longer refer to "the same" parameter. In order to still
be able to group for fixture reuse, we have to group by parameter value.

Caution: The value ends up inside the key of another dict, and therefore
must be hashable. This was true for indexes, but no longer is guaranteed
for user provided values. A user may e.g. provide dicts or numpy arrays.
The SafeHashWrapper ensures a fallback to id() in such a case.

Fixes pytest-dev#8914.
@haxtibal haxtibal force-pushed the bugfix/reorder_indirect_params branch from 2fdbc3e to 4a314bf Compare December 19, 2021 13:47
haxtibal pushed a commit to haxtibal/pytest that referenced this pull request Jan 23, 2022
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering
and caching work as intended, and demonstrates how one can use parameter
ids to decide about equality.

Adapting issue_519.checked_order is not cheating. See our discussion at
pytest-dev#9350 (review)
haxtibal pushed a commit to haxtibal/pytest that referenced this pull request Jan 23, 2022
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering
and caching work as intended, and demonstrates how one can use parameter
ids to decide about equality.

Adapting issue_519.checked_order is not cheating. See our discussion at
pytest-dev#9350 (review)
haxtibal pushed a commit to haxtibal/pytest that referenced this pull request Jan 23, 2022
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering
and caching work as intended, and demonstrates how one can use parameter
ids to decide about equality.

Adapting issue_519.checked_order is not cheating. See our discussion at
pytest-dev#9350 (review)
haxtibal pushed a commit to haxtibal/pytest that referenced this pull request Jan 23, 2022
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering
and caching work as intended, and demonstrates how one can use parameter
ids to decide about equality.

Adapting issue_519.checked_order is not cheating. See our discussion at
pytest-dev#9350 (review)
haxtibal pushed a commit to haxtibal/pytest that referenced this pull request Jan 26, 2022
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering
and caching work as intended, and demonstrates how one can use parameter
ids to decide about equality.

Adapting issue_519.checked_order is not cheating. See our discussion at
pytest-dev#9350 (review)
bluetech pushed a commit to haxtibal/pytest that referenced this pull request Jan 27, 2022
Add tests to assert pytest-dev#8914 is fixed. Tests assures that both reordering
and caching work as intended, and demonstrates how one can use parameter
ids to decide about equality.

Adapting issue_519.checked_order is not cheating. See our discussion at
pytest-dev#9350 (review)
@obestwalter
Copy link
Member

@haxtibal it would be a pity to loose the effort that already went into this, so trying to get the ball rolling again here. #9420 is the preferred approach, so I think this should be closed and maybe we can bring that one home then, right?

@obestwalter obestwalter added the status: needs information reporter needs to provide more information; can be closed after 2 or more weeks of inactivity label Jun 20, 2024
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided (automation) changelog entry is part of PR label Jun 20, 2024
@haxtibal
Copy link
Contributor Author

#9420 is the preferred approach, so I think this should be closed and maybe we can bring that one home then, right?

Agreed, let's close this in favor of #9420.

The latter will bother maintainers. There's potential of breaking use cases or user expectations and I'm not in a position to foresee them all. I'd help where I can, but it's your turn to signal what's next.

@haxtibal haxtibal closed this Jun 23, 2024
@obestwalter obestwalter removed the status: needs information reporter needs to provide more information; can be closed after 2 or more weeks of inactivity label Jun 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bot:chronographer:provided (automation) changelog entry is part of PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Indirect session fixture reordering not working as expected
4 participants