Update end to end test sharding on CI, make Playwright tests faster, Puppeteer tests slower #50362

talldan · 2023-05-05T06:41:12Z

What?

As more end to end tests are migrated to Playwright (and new tests are added), those jobs are naturally taking longer. (around 30-35 minutes)

Similarly as there are fewer tests running using Puppeteer those are becoming quicker. (around 10-15 minutes)

This adjusts the jobs so that there's more sharding for Playwright, now tests are split into 4 sets. Puppeteer tests are now reduced to 3 sets.

By my calculations, it should make both Puppeteer and Playwright tests complete in around 20 minutes, so they'll both be quicker than Performance tests again.

talldan · 2023-05-05T07:35:10Z

I wonder why Playwright - 4 is taking so much longer.

It might be good to identify which particular tests are slow and try to improve that.

kevin940726

Makes sense to me! Thanks!

Mamaduka · 2023-06-14T05:50:18Z

Are there any blockers for this PR?

talldan · 2023-06-14T06:00:10Z

@Mamaduka I think it's ok. I'd just noticed that one of the shards was still taking a long time, so I wondered why they weren't more evenly balanced. I couldn't spot any obvious reasons.

I need to revert 14b78c3, then it should be ok to merge, at which point the required checks will have to be updated in github.

This reverts commit 14b78c3.

Mamaduka · 2023-06-14T06:05:48Z

I guess a few tests that end up in 4th shard are a bit slow. Maybe in the future, we can monitor/track slow tests and improve as needed.

@jeryj did that recently in #50681.

talldan · 2023-06-14T06:13:33Z

I guess a few tests that end up in 4th shard are a bit slow. Maybe in the future, we can monitor/track slow tests and improve as needed.

In that respect it's probably good to ensure the test files don't get too big, and break them up across multiple files when they do. That would help the sharding work better, as one shard wouldn't spend too long on any individual file.

I personally don't really like the idea of writing longer test cases as they're hard to read and reason about and difficult to refactor.

Mamaduka · 2023-06-14T06:17:35Z

I personally don't really like the idea of writing longer test cases as they're hard to read and reason about and difficult to refactor.

It depends on the case, like everything, I guess 😄 But yes, I'm also in the camp "write small and atomic tests".

talldan · 2023-06-14T06:34:08Z

Yeah, still seeing the same where Playwright 4 is taking much longer—10 minutes longer than Playwright 1.

It's noticeable that one of the tests outputs a lot of warning messages (I think it's one of the tests that runs in firefox), so it could be that this makes it run more slowly.

github-actions · 2023-06-14T06:35:33Z

Flaky tests detected in d8ef29d.
Some tests passed with failed attempts. The failures may not be related to this commit but are still reported for visibility. See the documentation for more information.

🔍 Workflow run URL: https://github.com/WordPress/gutenberg/actions/runs/5263616230
📝 Reported issues:

[Flaky Test] Pick the additional variation in the inserted Columns block #47057 in specs/editor/plugins/block-variations.test.js

kevin940726 · 2023-06-14T06:36:22Z

I think sharding splits by test files but not test cases. So unless we split a big test file into multiple smaller ones, writing multiple shorter test cases would probably just make it slower 🤔 .

If that ever becomes a bottleneck then perhaps we can figure out a quick convention so that others can follow?

kevin940726 · 2023-06-14T06:38:43Z

It's noticeable that one of the tests outputs a lot of warning messages (I think it's one of the tests that runs in firefox), so it could be that this makes it run more slowly.

I think that's the case here.

A possible follow-up would be to split firefox and webkit testing to a different shard.

talldan · 2023-06-14T06:52:23Z

So unless we split a big test file into multiple smaller ones, writing multiple shorter test cases would probably just make it slower 🤔 .

Yeah, I'm mainly advocating for breaking things down into shorter test files where it's logical (e.g. when a test has separate describe blocks with distinct setup and teardown).

The comment about shorter test cases was more about maintainability. I realise that will make things slower, but I think there's a balance to be had.

Mamaduka · 2023-06-14T06:56:28Z

I think we're onto something here. When listing the tests run by the last shard, most of the tests running in two browsers end up there.

It makes sense that tests running in multiple browsers will take longer than running in one.

You can check this by following the command(s):

# on trunk
npm run test:e2e:playwright -- --shard=2/2 --list

# on this branch
npm run test:e2e:playwright -- --shard=4/4 --list

jeryj · 2023-06-14T15:13:30Z

I personally don't really like the idea of writing longer test cases as they're hard to read and reason about and difficult to refactor.

I completely agree for unit tests. I've found for e2e tests that they're a lot easier to work with when combined in a sensible way because they run so much faster and are more effective at catching bugs when they replicate a real user flow. This also helps document the expected behavior in a way that I think is easier to understand. The VS Code Playwright tooling around it is also quite good for stepping through the e2e tests for debugging.

There's definitely a careful balance to which tests make sense to combine together to still keep it isolated enough to understand, debug, and not be flaky. I totally hear that point. Also, I don't mean to make this into a debate around e2e test length, but wanted to share from the perspective of someone who was wholeheartedly in the "keep everything isolated" camp and have shifted my philosophy around it since working with e2e tests.

talldan · 2023-06-23T03:36:25Z

I'll merge this as tests are failing in trunk (though just fixed by #51826) and will need a rebase anyway, so it seems like a good opportunity.

…Puppeteer tests slower (WordPress#50362) * Increase playwright shards to 4, decrease puppeteer to 3 * Temporarily enable list reporter * Revert "Temporarily enable list reporter" This reverts commit 14b78c3.

talldan added [Type] Task Issues or PRs that have been broken down into an individual action to take [Type] Automated Testing Testing infrastructure changes impacting the execution of end-to-end (E2E) and/or unit tests. labels May 5, 2023

talldan requested review from gziolo and kevin940726 May 5, 2023 06:41

talldan self-assigned this May 5, 2023

kevin940726 approved these changes May 5, 2023

View reviewed changes

talldan added 3 commits June 14, 2023 14:02

Increase playwright shards to 4, decrease puppeteer to 3

e6d44b3

Temporarily enable list reporter

4b5b9df

Revert "Temporarily enable list reporter"

d8ef29d

This reverts commit 14b78c3.

talldan force-pushed the update/e2e-test-sharding branch from 14b78c3 to d8ef29d Compare June 14, 2023 06:03

talldan merged commit 5fa3f81 into trunk Jun 23, 2023

talldan deleted the update/e2e-test-sharding branch June 23, 2023 03:36

github-actions bot added this to the Gutenberg 16.1 milestone Jun 23, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update end to end test sharding on CI, make Playwright tests faster, Puppeteer tests slower #50362

Update end to end test sharding on CI, make Playwright tests faster, Puppeteer tests slower #50362

talldan commented May 5, 2023 •

edited

Loading

talldan commented May 5, 2023

kevin940726 left a comment

Mamaduka commented Jun 14, 2023

talldan commented Jun 14, 2023

Mamaduka commented Jun 14, 2023

talldan commented Jun 14, 2023

Mamaduka commented Jun 14, 2023

talldan commented Jun 14, 2023

github-actions bot commented Jun 14, 2023

kevin940726 commented Jun 14, 2023

kevin940726 commented Jun 14, 2023

talldan commented Jun 14, 2023

Mamaduka commented Jun 14, 2023

jeryj commented Jun 14, 2023 •

edited

Loading

talldan commented Jun 23, 2023

Update end to end test sharding on CI, make Playwright tests faster, Puppeteer tests slower #50362

Update end to end test sharding on CI, make Playwright tests faster, Puppeteer tests slower #50362

Conversation

talldan commented May 5, 2023 • edited Loading

What?

talldan commented May 5, 2023

kevin940726 left a comment

Choose a reason for hiding this comment

Mamaduka commented Jun 14, 2023

talldan commented Jun 14, 2023

Mamaduka commented Jun 14, 2023

talldan commented Jun 14, 2023

Mamaduka commented Jun 14, 2023

talldan commented Jun 14, 2023

github-actions bot commented Jun 14, 2023

kevin940726 commented Jun 14, 2023

kevin940726 commented Jun 14, 2023

talldan commented Jun 14, 2023

Mamaduka commented Jun 14, 2023

jeryj commented Jun 14, 2023 • edited Loading

talldan commented Jun 23, 2023

talldan commented May 5, 2023 •

edited

Loading

jeryj commented Jun 14, 2023 •

edited

Loading