pytest 8.0 sorting tests with multiple parameterization is broken #12008

ShurikMen · 2024-02-19T04:14:55Z

Continuation of #11976.

The solution from #11976 did not fix the ordering of tests with multiple parametrizations.

Tests

import pytest


@pytest.mark.parametrize('proto', ['serial', 'telnet', 'ssh'], scope='class')
@pytest.mark.parametrize('unit', [1, 2, 3], scope='class')
class TestA:
    def test_one(self, proto, unit):
        pass

    def test_two(self, proto, unit):
        pass

Collecting items with pytest 7.4.4

11:07 $ pytest --co
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.11.6, pytest-7.4.4, pluggy-1.4.0 -- /home/lolik/.cache/pypoetry/virtualenvs/pytest_strain-FQAfhJsh-py3.11/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.11.6', 'Platform': 'Linux-6.7.4-arch1-1-x86_64-with-glibc2.39', 'Packages': {'pytest': '7.4.4', 'pluggy': '1.4.0'}, 'Plugins': {'metadata': '3.1.0'}, 'GIT_BRANCH': 'master'}
rootdir: /home/lolik/Projects/straing/pytest
configfile: pytest.ini
plugins: metadata-3.1.0
collected 18 items                                                                                                                                                        

<Package tests>
  <Module test_asd.py>
    <Class TestA>
      <Function test_one[1-serial]>
      <Function test_two[1-serial]>
      <Function test_one[1-telnet]>
      <Function test_two[1-telnet]>
      <Function test_one[1-ssh]>
      <Function test_two[1-ssh]>
      <Function test_one[2-serial]>
      <Function test_two[2-serial]>
      <Function test_one[2-telnet]>
      <Function test_two[2-telnet]>
      <Function test_one[2-ssh]>
      <Function test_two[2-ssh]>
      <Function test_one[3-serial]>
      <Function test_two[3-serial]>
      <Function test_one[3-telnet]>
      <Function test_two[3-telnet]>
      <Function test_one[3-ssh]>
      <Function test_two[3-ssh]>

======================================================================= 18 tests collected in 0.01s =======================================================================

Collecting items with pytest 8.0.1

11:08 $ pytest --co
=========================================================================== test session starts ===========================================================================
platform linux -- Python 3.11.6, pytest-8.0.1, pluggy-1.4.0 -- /home/lolik/.cache/pypoetry/virtualenvs/pytest_strain-FQAfhJsh-py3.11/bin/python
cachedir: .pytest_cache
metadata: {'Python': '3.11.6', 'Platform': 'Linux-6.7.4-arch1-1-x86_64-with-glibc2.39', 'Packages': {'pytest': '8.0.1', 'pluggy': '1.4.0'}, 'Plugins': {'metadata': '3.1.0'}, 'GIT_BRANCH': 'master'}
rootdir: /home/lolik/Projects/straing/pytest
configfile: pytest.ini
plugins: metadata-3.1.0
collected 18 items                                                                                                                                                        

<Dir pytest>
  <Package tests>
    <Module test_asd.py>
      <Class TestA>
        <Function test_one[1-serial]>
        <Function test_two[1-serial]>
        <Function test_one[2-serial]>
        <Function test_two[2-serial]>
        <Function test_one[2-telnet]>
        <Function test_two[2-telnet]>
        <Function test_one[1-telnet]>
        <Function test_two[1-telnet]>
        <Function test_one[3-telnet]>
        <Function test_two[3-telnet]>
        <Function test_one[3-serial]>
        <Function test_two[3-serial]>
        <Function test_one[3-ssh]>
        <Function test_two[3-ssh]>
        <Function test_one[2-ssh]>
        <Function test_two[2-ssh]>
        <Function test_one[1-ssh]>
        <Function test_two[1-ssh]>

======================================================================= 18 tests collected in 0.01s =======================================================================


bluetech · 2024-02-19T06:57:43Z

Thanks, this does look wrong at first glance, I will look into it. Bisected to 09b7873 (PR #11220).

bluetech · 2024-02-20T21:32:14Z

Minimized example:

import pytest

@pytest.mark.parametrize('proto', ['a', 'b'], scope='class')
@pytest.mark.parametrize('unit', [1, 2], scope='class')
class Test:
    def test(self, proto, unit):
        pass

The items are reordered by reorder_items() in fixtures.py (surely the most inscrutable function in all of pytest), whose aim is to minimize fixture setups and teardowns. In this respect it achieves its goal:

Setup plan in pytest 7, with 6 setups/teardowns:

      SETUP    C proto['a']
      SETUP    C unit[1]
        x.py::Test::test[1-a] (fixtures used: proto, unit)
      TEARDOWN C proto['a']
      SETUP    C proto['b']
        x.py::Test::test[1-b] (fixtures used: proto, unit)
      TEARDOWN C proto['b']
      SETUP    C proto['a']
      TEARDOWN C unit[1]
      SETUP    C unit[2]
        x.py::Test::test[2-a] (fixtures used: proto, unit)
      TEARDOWN C proto['a']
      SETUP    C proto['b']
        x.py::Test::test[2-b] (fixtures used: proto, unit)
      TEARDOWN C unit[2]
      TEARDOWN C proto['b']

Setup plan in pytest 8, with 5 setups/teardowns:

      SETUP    C proto['a']
      SETUP    C unit[1]
        x.py::Test::test[1-a] (fixtures used: proto, unit)
      TEARDOWN C unit[1]
      SETUP    C unit[2]
        x.py::Test::test[2-a] (fixtures used: proto, unit)
      TEARDOWN C proto['a']
      SETUP    C proto['b']
        x.py::Test::test[2-b] (fixtures used: proto, unit)
      TEARDOWN C unit[2]
      SETUP    C unit[1]
        x.py::Test::test[1-b] (fixtures used: proto, unit)
      TEARDOWN C unit[1]
      TEARDOWN C proto['b']

The way @pytest.mark.parametrize works behind the scenes is that it basically desugars to this:

import pytest

@pytest.fixture(params=[1, 2], scope='class')
def unit(request):
    return request.param

@pytest.fixture(params=['a', 'b'], scope='class')
def proto(request):
    return request.param

class Test:
    def test(self, unit, proto):
        pass

In this framing, it is clearer why we want to minimize the setups/teardowns -- the fixtures can do real work, not just return the param. Both pytest 7 and 8 reorder this version to have 5 setups/teardowns.
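
The two counts quoted above can be reproduced with a small model. This is only a sketch, not pytest's reorder_items(): it assumes that a class-scoped parametrized fixture is set up for the first item and again whenever its param differs from the one used by the previous item. The name count_setups and the tuple representation of items are made up for illustration.

```python
from typing import Optional, Sequence, Tuple

Item = Tuple[int, str]  # (unit param, proto param); hypothetical representation


def count_setups(order: Sequence[Item]) -> int:
    """Count fixture setups under the simplified model described above:
    a fixture is (re)created whenever its param changes between consecutive items."""
    setups = 0
    previous: Optional[Item] = None
    for item in order:
        for position in range(len(item)):
            if previous is None or previous[position] != item[position]:
                setups += 1
        previous = item
    return setups


# Run orders copied from the two setup plans above.
pytest7_order = [(1, "a"), (1, "b"), (2, "a"), (2, "b")]
pytest8_order = [(1, "a"), (2, "a"), (2, "b"), (1, "b")]

print(count_setups(pytest7_order))  # 6
print(count_setups(pytest8_order))  # 5
```

The model ignores teardown ordering and higher scopes; it is only meant to show why the pytest 8 order saves one setup in this example.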

I can't say why before 09b7873 the reordering didn't happen, and if this was intentional or accidental. Will look into it more.

RonnyPfannschmidt · 2024-02-21T11:05:04Z

@bluetech i believe the change by @sadra-barikbin fixed a bug in pytest wrt scope ordering

it dates back to #519 and the fix corrects the test scope ordering based on values and fixture values and scopes

for further validation we might want to add a variant of the example for #519 that mixes class scope in (in which case the old order is actually correct)

but the basic gist, to my current understanding, is that pytest no longer has ordering differences between sugared and de-sugared parametrize, as parametrize is now expressed in pseudo fixtures all the way

bluetech · 2024-02-22T07:57:18Z

Technical details of the change:

Before 09b7873, Metafunc generated CallSpecs with separate params (indirect parametrizations, i.e. through fixtures) and funcargs (direct parametrizations), and the desugaring of direct to indirect (i.e. funcargs to params) happened as a separate step after pytest_generate_tests: https://github.com/pytest-dev/pytest/blob/7.4.4/src/_pytest/fixtures.py#L156.

After 09b7873, Metafunc handles the desugaring itself, and there are no more funcargs.

All of this happens before reorder_items anyway, so why does it affect the item ordering? The difference is the param_index that the desugaring assigns to the (desugared) direct params. Before, the code was this:

https://github.com/pytest-dev/pytest/blob/7.4.4/src/_pytest/fixtures.py#L168-L181

The important bit is callspec.indices[argname] = len(arg2params_list) - this basically assigns a fresh param index across all callspecs. This results in the following arg keys in reorder_items:

('proto', 0, <Function test[1-a]>)
('proto', 1, <Function test[1-b]>)
('proto', 2, <Function test[2-a]>)
('proto', 3, <Function test[2-b]>)
('unit',  0, <Function test[1-a]>)
('unit',  1, <Function test[1-b]>)
('unit',  2, <Function test[2-a]>)
('unit',  3, <Function test[2-b]>)

After, there is no special handling of the param indexes of direct params; they are handled the same as parametrized fixtures. This results in the following arg keys:

('proto', 0, <Function test[1-a]>)
('proto', 0, <Function test[2-a]>)
('proto', 1, <Function test[1-b]>)
('proto', 1, <Function test[2-b]>)
('unit',  0, <Function test[1-a]>)
('unit',  0, <Function test[1-b]>)
('unit',  1, <Function test[2-a]>)
('unit',  1, <Function test[2-b]>)

Rephrasing the above in a way that may be clearer:

Before, indexes for direct params were assigned sequentially per argname after all parametrizations were exploded, so we have a table of argname and item (= callspec), and we assign the param index:

argname  item       param index
proto    test[1-a]  0
proto    test[1-b]  1
proto    test[2-a]  2
proto    test[2-b]  3
unit     test[1-a]  0
unit     test[1-b]  1
unit     test[2-a]  2
unit     test[2-b]  3

After, the param indexes are assigned as they would be for the desugaring I gave above:

import pytest

# param indexes:        0  1
@pytest.fixture(params=[1, 2], scope='class')
def unit(request):
    return request.param

# param indexes:         0    1
@pytest.fixture(params=['a', 'b'], scope='class')
def proto(request):
    return request.param
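
The two indexing schemes can also be written out as plain Python. The sketch below does not use pytest internals. It only rebuilds the (argname, param index, item) keys listed above for the minimized example, and the names CALLSPECS, indices_before and indices_after are hypothetical.

```python
from itertools import product

UNITS = [1, 2]
PROTOS = ["a", "b"]

# One call spec per combination, in collection order (unit varies slowest).
CALLSPECS = list(product(UNITS, PROTOS))


def item_name(unit, proto):
    return f"test[{unit}-{proto}]"


def indices_before():
    """Old scheme: for each argname, every call spec gets a fresh index."""
    keys = []
    for argname in ("proto", "unit"):
        for index, (unit, proto) in enumerate(CALLSPECS):
            keys.append((argname, index, item_name(unit, proto)))
    return keys


def indices_after():
    """New scheme: the index is the value's position in its own params list."""
    keys = []
    for unit, proto in CALLSPECS:
        keys.append(("proto", PROTOS.index(proto), item_name(unit, proto)))
        keys.append(("unit", UNITS.index(unit), item_name(unit, proto)))
    return sorted(keys)


for key in indices_before():
    print("before:", key)
for key in indices_after():
    print("after: ", key)
```

In the old scheme no two items share an (argname, index) pair, while in the new scheme test[1-a] and test[2-a] share ('proto', 0), which is the kind of common key the reordering can group on.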

bluetech · 2024-02-22T08:06:21Z

@ShurikMen I wonder, is the example you gave realistic or do you use indirect params maybe? I'm mainly curious why you're setting scope='class' on your parametrizes when the parameters are simple values. With function scope the ordering is as you expect.

ShurikMen · 2024-02-22T08:32:59Z

> @ShurikMen I wonder, is the example you gave realistic or do you use indirect params maybe? I'm mainly curious why you're setting scope='class' on your parametrizes when the parameters are simple values.

This is one of the simplest examples that are actually used in my projects. There are many examples with an even larger set of parameters including "mixed" scopes (class/function).

> With function scope the ordering is as you expect.

Not quite. Changing the order causes unwanted fixture calls, which are very expensive in terms of execution time.
It is important for me that tests with the same set of parameters are called sequentially.

With scope class (same optimal):

import pytest


@pytest.fixture(autouse=True, scope='class')
def some_fix(unit, proto):
    yield


@pytest.mark.parametrize('proto', ['serial', 'telnet'], scope='class')
@pytest.mark.parametrize('unit', [1, 2], scope='class')
class TestA:
    def test_one(self, unit, proto):
        pass

    def test_two(self, unit, proto):
        pass

test_asd.py::TestA::test_one[1-serial] 
      SETUP    C unit[1]
      SETUP    C proto['serial']
      SETUP    C some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-serial] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_two[1-serial] 
        test_asd.py::TestA::test_two[1-serial] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_one[1-telnet] 
      TEARDOWN C some_fix
      TEARDOWN C proto['serial']
      SETUP    C proto['telnet']
      SETUP    C some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-telnet] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_two[1-telnet] 
        test_asd.py::TestA::test_two[1-telnet] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_one[2-serial] 
      TEARDOWN C some_fix
      TEARDOWN C unit[1]
      SETUP    C unit[2]
      TEARDOWN C proto['telnet']
      SETUP    C proto['serial']
      SETUP    C some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-serial] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_two[2-serial] 
        test_asd.py::TestA::test_two[2-serial] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_one[2-telnet] 
      TEARDOWN C some_fix
      TEARDOWN C proto['serial']
      SETUP    C proto['telnet']
      SETUP    C some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-telnet] (fixtures used: proto, some_fix, unit)
test_asd.py::TestA::test_two[2-telnet] 
        test_asd.py::TestA::test_two[2-telnet] (fixtures used: proto, some_fix, unit)
      TEARDOWN C some_fix
      TEARDOWN C proto['telnet']
      TEARDOWN C unit[2]

With scope function

import pytest


@pytest.fixture(autouse=True, scope='function')
def some_fix(unit, proto):
    yield


@pytest.mark.parametrize('proto', ['serial', 'telnet'], scope='function')
@pytest.mark.parametrize('unit', [1, 2], scope='function')
class TestA:
    def test_one(self, unit, proto):
        pass

    def test_two(self, unit, proto):
        pass

test_asd.py::TestA::test_one[1-serial] 
        SETUP    F unit[1]
        SETUP    F proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['serial']
        TEARDOWN F unit[1]
test_asd.py::TestA::test_one[1-telnet] 
        SETUP    F unit[1]
        SETUP    F proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['telnet']
        TEARDOWN F unit[1]
test_asd.py::TestA::test_one[2-serial] 
        SETUP    F unit[2]
        SETUP    F proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['serial']
        TEARDOWN F unit[2]
test_asd.py::TestA::test_one[2-telnet] 
        SETUP    F unit[2]
        SETUP    F proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['telnet']
        TEARDOWN F unit[2]
test_asd.py::TestA::test_two[1-serial] 
        SETUP    F unit[1]
        SETUP    F proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[1-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['serial']
        TEARDOWN F unit[1]
test_asd.py::TestA::test_two[1-telnet] 
        SETUP    F unit[1]
        SETUP    F proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[1-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['telnet']
        TEARDOWN F unit[1]
test_asd.py::TestA::test_two[2-serial] 
        SETUP    F unit[2]
        SETUP    F proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[2-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['serial']
        TEARDOWN F unit[2]
test_asd.py::TestA::test_two[2-telnet] 
        SETUP    F unit[2]
        SETUP    F proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[2-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
        TEARDOWN F proto['telnet']
        TEARDOWN F unit[2]

Class params with function fixture

import pytest


@pytest.fixture(autouse=True, scope='function')
def some_fix(unit, proto):
    yield


@pytest.mark.parametrize('proto', ['serial', 'telnet'], scope='class')
@pytest.mark.parametrize('unit', [1, 2], scope='class')
class TestA:
    def test_one(self, unit, proto):
        pass

    def test_two(self, unit, proto):
        pass

test_asd.py::TestA::test_one[1-serial] 
      SETUP    C unit[1]
      SETUP    C proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_two[1-serial] 
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[1-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_one[1-telnet] 
      TEARDOWN C proto['serial']
      SETUP    C proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[1-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_two[1-telnet] 
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[1-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_one[2-serial] 
      TEARDOWN C unit[1]
      SETUP    C unit[2]
      TEARDOWN C proto['telnet']
      SETUP    C proto['serial']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_two[2-serial] 
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[2-serial] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_one[2-telnet] 
      TEARDOWN C proto['serial']
      SETUP    C proto['telnet']
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_one[2-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
test_asd.py::TestA::test_two[2-telnet] 
        SETUP    F some_fix (fixtures used: proto, unit)
        test_asd.py::TestA::test_two[2-telnet] (fixtures used: proto, some_fix, unit)
        TEARDOWN F some_fix
      TEARDOWN C proto['telnet']
      TEARDOWN C unit[2]

Here I experimented a lot with various combinations of parameter scopes and fixture scopes, including parametrization in fixtures (pytest.fixture(params=...)), to find the optimal way to run fixtures and tests.

RonnyPfannschmidt · 2024-02-22T09:52:01Z

Currently, when parametrize is taken into consideration, compound dependent fixtures are not.

I'd recommend having a single parameter set that passes the correct compound parameters to a single fixture, so that they are no longer considered independent parameters.
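
One possible reading of this recommendation, as a sketch only: build every (unit, proto) pair up front and hand the pairs to one class-scoped fixture, so that each item depends on a single parametrized fixture instead of two independent ones. The fixture name link and the COMBOS list are hypothetical and not taken from the projects discussed here.

```python
import pytest

UNITS = [1, 2]
PROTOS = ["serial", "telnet"]

# Every (unit, proto) pair becomes one param of a single fixture.
COMBOS = [(unit, proto) for unit in UNITS for proto in PROTOS]


@pytest.fixture(scope="class", params=COMBOS, ids=lambda combo: f"{combo[0]}-{combo[1]}")
def link(request):
    unit, proto = request.param
    # Expensive setup for this specific pair would go here.
    yield unit, proto
    # Matching teardown would go here.


class TestA:
    def test_one(self, link):
        unit, proto = link
        assert unit in UNITS and proto in PROTOS

    def test_two(self, link):
        unit, proto = link
        assert unit in UNITS and proto in PROTOS
```

The trade-off is that unit and proto can no longer be set up or torn down separately. Running pytest --setup-plan on such a file is a quick way to check the resulting order on a given pytest version.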

ShurikMen · 2024-03-06T08:25:38Z

@bluetech take a look at #12082

sadra-barikbin · 2024-03-22T14:17:44Z

Here are the three approaches:

sadra-barikbin · 2024-03-23T23:25:57Z

There seems to be an appearance-efficiency trade-off: @ShurikMen's suggestion and the v7.4.4 approach yield a better-looking order but are less efficient in setups/teardowns than v8.0.0.

The three approaches can also be compared in terms of robustness to parametrization varieties. @ShurikMen's suggestion, for example, performs better than v7.4.4 and solves the first example of #11257, but fails on the example below, which v8.0.0 solves, unless the user is careful to swap the order of the parametrizations on test1.

import pytest

# Here `test1["b",0]` and `test2[3]` would wrongly have a common fixturekey in @ShurikMen's method.
@pytest.mark.parametrize("arg2", [0, 1, 2], scope='module')
@pytest.mark.parametrize("arg1", ["a", "b"], scope='module')
def test1(arg1, arg2):
    pass

@pytest.mark.parametrize("arg2", [0, 1, 2, 3], scope='module')
def test2(arg2):
    pass

Considering robustness alone, the gain of @ShurikMen's method over v7.4.4 seems remarkable, but I'm not sure about that of v8.0.0 over @ShurikMen's.

sadra-barikbin · 2024-03-24T00:00:56Z

Aside from the comments above, I'm in favor of treating the current reordering in v8.0.0 as a bug, since it was not introduced with the intention of becoming more efficient, which is what we have been discussing here. It was either an unnoticed side effect of #11220 or of something in my large initial PR, responsible for the first example of #11257 (which @ShurikMen's method now solves as well). Sorry for the inconvenience, mates!

ShurikMen · 2024-03-24T07:44:23Z

I have updated the PR: indexing of parameters added through a marker and through fixtures is now unified. I think it's the right thing to do. Along the way, I corrected the tests related to parametrization through fixtures.

ShurikMen · 2024-05-07T03:15:18Z

@bluetech what about resolving this issue?

In #11220, an unintended change in reordering was introduced by changing the way indices were assigned to direct params. This PR reverts that change and reduces #11220 changes to just refactors. After this PR we could safely decide on the solutions discussed in #12008, i.e. #12082 or the one initially introduced in #11220.

Fixes #12008

Co-authored-by: Bruno Oliveira

bluetech added labels Feb 19, 2024: type: bug (problem that needs to be addressed), topic: parametrize (related to @pytest.mark.parametrize), type: regression (indicates a problem that was introduced in a release which was working previously), topic: fixtures (anything involving fixtures directly or indirectly)

bluetech mentioned this issue Feb 22, 2024

fixtures: remove a no longer needed sort #12019

Merged

bluetech changed the title ~~pytest 8.0.1 sorting tests with multiple parameterization is broken~~ pytest 8.0 sorting tests with multiple parameterization is broken Feb 22, 2024

ShurikMen mentioned this issue Mar 6, 2024

change param_index if param is pseudofixturedef #12082

Open

sadra-barikbin mentioned this issue Jun 27, 2024

Revert the unintended change in tests reordering from #11220 #12542

Merged

RonnyPfannschmidt closed this as completed in #12542 Aug 11, 2024
