Randomize Framework order in runs #8830

Open
p8 opened this issue Apr 1, 2024 · 13 comments

@p8 (Contributor) commented Apr 1, 2024

Currently the benchmarks are run in the same order every time.
Sometimes a run fails after a number of frameworks have already been benchmarked, or the run is restarted.
This gives frameworks whose names start with a more test runs than frameworks whose names start with z.
If the order were randomized, the number of runs would be distributed more evenly.
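
For illustration, a minimal sketch of the idea, assuming the runner keeps the framework tests in a list (the function below is hypothetical, not the actual toolset API):

```python
import random

def order_frameworks(frameworks, seed=None):
    """Return the frameworks in random order so no alphabetical
    prefix is favored when a run fails or is restarted."""
    rng = random.Random(seed)
    shuffled = list(frameworks)
    rng.shuffle(shuffled)
    return shuffled

# Example: alphabetical input, randomized output.
print(order_frameworks(["actix", "gemini", "uwebsockets", "zysocket"]))
```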

@fakeshadow (Contributor)

As someone maintaining a benchmark that starts with x, I feel this.

That said, IMO a fair way of handling the order is to prioritize the benchmarks with the most recent changes. As for benchmarks that haven't changed in a while, I'd guess their maintainers care less about the continuous run results.
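
A rough sketch of that ordering, assuming each benchmark lives in its own directory inside a git checkout (the helper names here are hypothetical, not part of the toolset):

```python
import subprocess

def last_commit_epoch(path):
    # Unix timestamp of the newest commit touching this directory,
    # or 0 if it has no history.
    out = subprocess.run(
        ["git", "log", "-1", "--format=%ct", "--", path],
        capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip() or 0)

def order_by_recency(framework_dirs):
    # Most recently changed benchmarks run first.
    return sorted(framework_dirs, key=last_commit_epoch, reverse=True)
```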

@joanhey (Contributor) commented Apr 1, 2024

I think it's enough to start one run with a and the next with z.
And perhaps we'll still see some differences in the results.

Between this run and the next, the servers, databases, ... change, so those changes apply to all frameworks.
They don't depend on changes within the frameworks themselves.
A mature framework needs fewer changes than a young one, and we can still benchmark locally to test small changes.
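
A sketch of the flip, assuming a marker file persisted between runs (the file name is made up for illustration):

```python
from pathlib import Path

FLAG = Path("run-order-reversed.flag")  # hypothetical marker file

def order_for_this_run(frameworks):
    ordered = sorted(frameworks)
    if FLAG.exists():
        FLAG.unlink()          # next run goes a -> z again
        return ordered[::-1]   # this run goes z -> a
    FLAG.touch()               # remember to reverse next time
    return ordered
```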

@itrofimow (Contributor) commented Apr 2, 2024

What I maintain starts with u, so I'm heavily biased here, but I would also appreciate this change being implemented.

My concern is not about failures or restarts, as they usually don't happen that often when the environment is stable, but rather about feedback latency:
I mostly use TFB as a measurement tool (and a big shout-out to the TE crew for providing that tool), and given a hypothetical performance drop in the ongoing run, I'm left with approximately a day to squeeze a potential fix into the next measurement; failing to do so leads to a feedback latency of two full weeks (every run takes approximately a week).
Moreover, any dependency bump I do costs at least a week (almost a full run) in feedback latency, and 1.5 weeks on average.

Flipping the order between runs (or FWIW randomizing it) would significantly reduce these latencies for me.

@joanhey (Contributor) commented Apr 2, 2024

The frameworks that sit in the middle have ~3 days to make changes.
For them it's the same whether the bench begins with a or in reverse order.
The problem is for the frameworks that come last in the run.
Please don't randomize; right now we roughly know when the results for our framework will appear.
But we do need to flip the order on every new run!

@p8 (Contributor, Author) commented Apr 2, 2024

Flipping the order each time makes sense to me.

@NateBrady23 (Member)

I like the idea of flipping the order. I'm just getting back from vacation and catching up on a bunch of stuff. Let's get the environment stable, and then I think this is easy to do. Will leave this open until we get it in.

@joanhey (Contributor) commented May 2, 2024

After the last full run finished, the next run did not flip the order.

@volyrique (Contributor)

That's because the tfb-startup.sh script runs tfb-shutdown.sh on startup; the latter is responsible for flipping the order. Is changing the order only after an unsuccessful run by design?

@p8 (Contributor, Author) commented May 3, 2024

I think the following run was reversed: https://tfb-status.techempower.com/results/3c2e9871-9c2a-4ff3-bc31-620f65da4e74. The “last framework” tested is incorrect though.

@NateBrady23 (Member)

> That's because the tfb-startup.sh script runs tfb-shutdown.sh on startup; the latter is responsible for flipping the order. Is changing the order only after an unsuccessful run by design?

No, I forgot that we actually run the shutdown script twice after a successful run because it's being called from the startup script as well. The design was supposed to be the exact opposite. I'll have to move it to the startup script and it will just reverse every time a run starts.
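
A Python sketch of that double toggle, just to illustrate the control flow (the real scripts are shell, and these function bodies are made up):

```python
# Old behavior: the reversal toggle lived in shutdown, but startup
# also called shutdown for cleanup, so each successful run toggled twice.
def shutdown(state):
    state["reversed"] = not state["reversed"]

def startup(state):
    shutdown(state)  # cleanup from the previous run -- toggles again

state = {"reversed": False}
shutdown(state)  # end of run N: reversed -> True
startup(state)   # start of run N+1: reversed -> False again
print(state)     # {'reversed': False} -- the two toggles cancel out
```

Moving the toggle so it happens exactly once, at startup, makes the order alternate as intended.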

@volyrique (Contributor)

@NateBrady23 It looks like now the opposite thing is happening - the order is always reversed, i.e. the implementations starting with Z run first.

@akupiec (Contributor) commented Aug 28, 2024

How about adding an option to run tests in the order of their last execution time, from fastest to slowest? 😈
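
As a sketch, assuming per-framework wall-clock durations from the previous run are available as a mapping (nothing like this exists in the toolset today):

```python
def order_by_duration(durations):
    # durations: framework name -> seconds taken in the previous run.
    # Fastest first, so quick benchmarks report results early.
    return sorted(durations, key=durations.get)
```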

@volyrique (Contributor)

I am pretty sure that approach would end up being effectively the same as running them in alphabetical order (or in random order at best), at the cost of significant implementation complexity.
