Find a way to get more frequent aligned runs #678

Open
nt1m opened this issue Jul 22, 2024 · 7 comments
Comments

@nt1m
Member

nt1m commented Jul 22, 2024

The experimental dashboard has had no aligned run from Jul 18th to Jul 22nd; the dashboard is still stuck on Jul 18th.

This is worse than last year because of the introduction of Edge to the dashboard. Now the dashboard requires 4 aligned runs as opposed to 3. Often, we get a run with Edge being aligned with FF & Chrome, or Safari being aligned, but rarely both.
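To make the alignment requirement concrete, here is a minimal sketch of the check described above: a revision counts as aligned only when all four browsers have a run for it. The record shape (a dict with `sha` and `browser_name` keys) and the sample SHAs are assumptions for illustration, not the dashboard's actual data model.

```python
from typing import Dict, Iterable, List, Set

# The four browsers the dashboard needs results from (per the comment above).
REQUIRED = frozenset({"chrome", "edge", "firefox", "safari"})


def aligned_shas(runs: Iterable[dict], required=REQUIRED) -> List[str]:
    """Return SHAs, in first-seen order, that have a run from every required browser.

    Each run is a hypothetical simplified record: {"sha": ..., "browser_name": ...}.
    """
    seen: Dict[str, Set[str]] = {}
    for run in runs:
        seen.setdefault(run["sha"], set()).add(run["browser_name"])
    return [sha for sha, browsers in seen.items() if required <= browsers]


# Made-up sample data: "aaa111" has all four browsers, "bbb222" lacks Safari,
# and "ccc333" lacks Edge.
example_runs = [
    {"sha": "aaa111", "browser_name": "chrome"},
    {"sha": "aaa111", "browser_name": "edge"},
    {"sha": "aaa111", "browser_name": "firefox"},
    {"sha": "aaa111", "browser_name": "safari"},
    {"sha": "bbb222", "browser_name": "chrome"},
    {"sha": "bbb222", "browser_name": "edge"},
    {"sha": "bbb222", "browser_name": "firefox"},
    {"sha": "ccc333", "browser_name": "chrome"},
    {"sha": "ccc333", "browser_name": "firefox"},
    {"sha": "ccc333", "browser_name": "safari"},
]

print(aligned_shas(example_runs))
```

With three required browsers instead of four (passing `required` without `"edge"`), `ccc333` would also count as aligned, which is the effect described above.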

It would be nice to solve this issue. If we could somehow schedule Edge runs in sync with Safari, that would be nice!

Related suggestion from James: web-platform-tests/wpt.fyi#3689

@web-platform-tests/admins

@jgraham
Contributor

jgraham commented Jul 22, 2024

Edge and Safari are supposed to be aligned I think. The main problem currently is that Azure keeps losing runners and we don't have any way to automatically recover. If that affects both Edge and Safari then the chances of getting an aligned run are diminished (there's also a problem with Firefox where one test sometimes causes a harness error we can't recover from; needless to say it's difficult to reproduce locally).

@foolip
Member

foolip commented Jul 23, 2024

I haven't looked at the trends recently, but the way I would approach this problem is to first treat the epochs/three_hourly branch as the target, since that's the highest frequency at which we could align without increasing the rate of Edge and Safari runs.

Then, look at the percentage of the three-hourly commits (./wpt rev-list --epoch 3h --max-count 100) that have results for each browser. Then, focus on whichever browser has the least reliable results, asking someone from that browser vendor to drive the effort if possible.

@foolip
Member

foolip commented Jul 23, 2024

I realized it's possible to answer the question on the command line, so here's the output of this pipeline:

./wpt rev-list --epoch 3h --max-count 100 | while read sha; do curl --silent "https://wpt.fyi/api/runs?label=master&label=experimental&sha=$sha" | jq '.[] | .browser_name'; done | sort | uniq -c

91 "chrome"
39 "edge"
91 "firefox"
11 "safari"

The last of the 100 commits is from June 18, so a bit over a month.
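As a rough reading of those numbers, the sketch below turns the counts into per-browser coverage and an upper bound on aligned revisions. It assumes each count corresponds to at most one run per browser per commit; the pipeline counts runs rather than distinct commits, so this is an approximation. The function names are illustrative only.

```python
# Counts copied from the command output above; TOTAL is the --max-count value.
counts = {"chrome": 91, "edge": 39, "firefox": 91, "safari": 11}
TOTAL = 100


def coverage(counts, total):
    """Percentage of sampled commits with a run, per browser.

    Assumes at most one run per browser per commit.
    """
    return {browser: 100.0 * n / total for browser, n in counts.items()}


def least_reliable(counts):
    """Browser with the fewest runs across the sampled commits."""
    return min(counts, key=counts.get)


def max_aligned(counts):
    """Upper bound on aligned commits: no more than the scarcest browser's count."""
    return min(counts.values())


for browser, pct in sorted(coverage(counts, TOTAL).items()):
    print(f"{browser}: {pct:.0f}%")
print("least reliable:", least_reliable(counts))
print("aligned commits, at most:", max_aligned(counts))
```

Under that assumption, no more than 11 of the 100 three-hourly commits could be fully aligned, and the real number is lower whenever the Edge and Safari runs land on different commits.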

@gsnedders is web-platform-tests/wpt#47181 the plan to improve the reliability of Safari runs, or just an experiment?

@dandclark is there someone who can help us look into the reliability of Edge runs on Azure Pipelines?

@stubbornella

Do we need to run all of the WPT suite of tests more frequently or could we run just the tests included in Interop 2024 at a higher rate? A bandaid solution at best, but maybe one that can help us while we work towards something longer term.

@nt1m
Member Author

nt1m commented Aug 8, 2024

@foolip Can we show the latest non-aligned run for each browser, and also link to those runs? I think it would give a more accurate view of things.
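One way to picture this suggestion is to pick the most recent run per browser regardless of which revision it ran against. The sketch below assumes each run record carries `browser_name` plus a `time_start` ISO 8601 timestamp; the `time_start` and `sha` field names and the sample data are assumptions for illustration, not a confirmed API shape.

```python
from datetime import datetime


def latest_per_browser(runs):
    """Return {browser_name: run} holding the most recently started run per browser.

    Each run is a hypothetical record with "browser_name" and an ISO 8601
    "time_start" string that includes a UTC offset.
    """
    latest = {}
    for run in runs:
        name = run["browser_name"]
        started = datetime.fromisoformat(run["time_start"])
        current = latest.get(name)
        if current is None or started > datetime.fromisoformat(current["time_start"]):
            latest[name] = run
    return latest


# Made-up sample data: the browsers' newest runs are on different revisions.
sample_runs = [
    {"browser_name": "chrome", "sha": "ccc333", "time_start": "2024-08-07T09:00:00+00:00"},
    {"browser_name": "chrome", "sha": "bbb222", "time_start": "2024-08-07T06:00:00+00:00"},
    {"browser_name": "safari", "sha": "aaa111", "time_start": "2024-08-05T03:00:00+00:00"},
    {"browser_name": "edge", "sha": "bbb222", "time_start": "2024-08-07T06:30:00+00:00"},
]

for name, run in sorted(latest_per_browser(sample_runs).items()):
    print(name, run["sha"], run["time_start"])
```

Scores built this way compare browsers on different revisions, so they would show fresher data at the cost of strict comparability.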

@foolip
Member

foolip commented Aug 8, 2024

@stubbornella @nt1m I think those are both ideas worth exploring. I believe that the scoring rewrite that @jgraham did no longer requires aligned runs, and if that's correct it's probably best to prioritize reviewing and deploying that.

However, I'll be on parental leave until February, so I'll have to defer to the rest of the group on, well, everything 😄

@jgraham
Contributor

jgraham commented Aug 10, 2024

https://github.com/jgraham/interop-results/tree/main/2024/results/revisions has interop results for all the revisions for which we have wpt results.

The main remaining issue is getting the UI work done so that you can switch scores on the dashboard. @DanielRyanSmith has done all the UI so far, but I imagine that if someone else can contribute patches, that will speed the process along a bit.
