
Performance idea: --sf/--slow-first option to improve resource utilization #657

Open
Zac-HD opened this issue May 4, 2021 · 9 comments
@Zac-HD
Member

Zac-HD commented May 4, 2021

Reading this blog post about Stripe's test runner made me think we should have a --slow-first option for xdist, and it seems that we don't yet 😅 The motivation for --slow-first is that fastest-tests-last is a great heuristic for shrinking the idle tail at the end of a test run, when some workers are finished but others are still running - a tail which can range from negligible to "several times longer than the rest of the run" (e.g. when I select mostly fast unit tests, plus a few slow integration tests which happen to run last).

IMO this should be lower priority than --last-failed, should only reorder passing tests for --failed-first, and should be incompatible with --new-first (existing flag docs). The main trick is to cache durations from the last run, and then order by the aggregate time for each loadscope (i.e. method, class, or file, depending on the distribution mode - pytest-randomly is useful prior art).
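That aggregate-by-scope step could look something like this sketch. The `scope_of` and `order_scopes` helpers are invented for illustration and only approximate how --dist=loadfile / --dist=loadscope actually group tests:

```python
# Sketch: sum cached per-test durations by scope so that whole files (or
# classes) can be ordered slowest-first. scope_of is a rough stand-in for
# xdist's real grouping logic, not its actual implementation.
from collections import defaultdict

def scope_of(nodeid, dist="loadfile"):
    parts = nodeid.split("::")
    if dist == "loadfile":
        return parts[0]                       # group by file
    return "::".join(parts[:-1]) or parts[0]  # loadscope: class, else file

def order_scopes(durations, dist="loadfile"):
    totals = defaultdict(float)
    for nodeid, seconds in durations.items():
        totals[scope_of(nodeid, dist)] += seconds
    # Slowest aggregate scope first.
    return sorted(totals, key=totals.get, reverse=True)

durations = {"a.py::t1": 1.0, "a.py::t2": 1.0, "b.py::t1": 3.0}
print(order_scopes(durations))  # ['b.py', 'a.py']
```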

@nicoddemus
Member

That is indeed interesting.

One detail however is that this doesn't need to be implemented in xdist at all, any plugin which reorders tests using pytest_collection_modifyitems would work, as xdist would then see the reordered list, and schedule tests accordingly.

@Zac-HD
Member Author

Zac-HD commented May 5, 2021

> One detail however is that this doesn't need to be implemented in xdist at all, any plugin which reorders tests using pytest_collection_modifyitems would work, as xdist would then see the reordered list, and schedule tests accordingly.

It could be implemented elsewhere, but the --slow-first ordering only improves performance if you're running tests in parallel, so I think xdist is the most sensible place for it. It could even make single-core performance worse, e.g. in combination with -x/--exit-first. For best results --slow-first also needs to know the current value of xdist's --dist argument.

For example, take a test suite with five 1s tests in file A, a single 3s test in file B, and two 3s tests in file C; and assume that we have two cores.

  • With --dist=load
    • Currently, we'll have core1 run A1 A3 A5 C1=6s and core2 run A2 A4 B C2=8s
    • --slow-first would have core1 run B C2 A4=7s and core2 run C1 A1 A2 A3 A5=7s (speedup!)
  • With --dist=loadfile
    • Currently, we'll have core1 run A=5s and core2 run B C=9s
    • --slow-first would have core1 run C=6s and core2 run A B=8s (speedup!)
  • --dist=each is of course equivalent to single-core, so no benefit from --slow-first

So on this toy model, better task ordering alone cuts the wall-clock time from 8s to 7s under --dist=load and from 9s to 8s under --dist=loadfile - a speedup of roughly 11-13%!
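The toy numbers above can be checked with a tiny greedy scheduler: each test goes to whichever worker frees up first. This is only an approximation of xdist's real load scheduling (which hands out initial chunks of tests), but it reproduces the arithmetic:

```python
# Toy simulation of the example above: greedy "next test to the first free
# worker" scheduling, comparing collection order against slowest-first.
import heapq

def makespan(durations, workers=2):
    """Assign each test, in order, to whichever worker frees up first."""
    finish = [0.0] * workers           # current finish time per worker
    heapq.heapify(finish)
    for d in durations:
        t = heapq.heappop(finish)      # earliest-free worker
        heapq.heappush(finish, t + d)
    return max(finish)

tests = [1, 1, 1, 1, 1, 3, 3, 3]       # five 1s tests (A), one 3s (B), two 3s (C)
print(makespan(tests))                        # collection order -> 8.0
print(makespan(sorted(tests, reverse=True)))  # slowest first    -> 7.0
```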

In the real world, I have twelve cores and Hypothesis' 2500 cover tests take ~70s with the slowest ten tests taking 5-15s each; the 500 nocover tests take ~35s with the slowest ten taking 5-19s each (and yes we've taken the low-hanging perf fruit). Anecdotally, it's pretty obvious towards the end that things are slowing down and a few cores are idling, and I'd expect a similar 10%-20% wall-clock improvement.

@Zac-HD
Member Author

Zac-HD commented Dec 15, 2021

@pharmpy-dev-123

> this blog post about Stripe's test runner

That link is 404 now. The original title was "Running three hours of Ruby tests in under three minutes", I believe?

@Zac-HD
Member Author

Zac-HD commented Nov 16, 2022

@klimkin

klimkin commented Dec 11, 2022

My attempt to implement sorting using the previous run's durations:
https://github.com/klimkin/pytest-slowest-first

@nicoddemus
Member

Awesome @klimkin, thanks for sharing!

@albertino87

albertino87 commented Mar 11, 2024

Is it possible to implement the sorting defined by pytest.mark.order(n) (from the pytest-order library)? It's not always possible to have the previous run's timings when running on fresh CI machines each time.

@braingram

I was looking for this feature and considered using pytest_collection_modifyitems to work around its absence. Doesn't #778 prevent custom ordering when using loadscope (which my application has to use for efficient fixture reuse)?
