Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update end to end test sharding on CI, make Playwright tests faster, Puppeteer tests slower #50362

Merged
merged 3 commits into from
Jun 23, 2023

Conversation

talldan
Copy link
Contributor

@talldan talldan commented May 5, 2023

What?

As more end to end tests are migrated to Playwright (and new tests are added), those jobs are naturally taking longer. (around 30-35 minutes)

Similarly as there are fewer tests running using Puppeteer those are becoming quicker. (around 10-15 minutes)

This adjusts the jobs so that there's more sharding for Playwright, now tests are split into 4 sets. Puppeteer tests are now reduced to 3 sets.

By my calculations, it should make both Puppeteer and Playwright tests complete in around 20 minutes, so they'll both be quicker than Performance tests again.

@talldan talldan added [Type] Task Issues or PRs that have been broken down into an individual action to take [Type] Automated Testing Testing infrastructure changes impacting the execution of end-to-end (E2E) and/or unit tests. labels May 5, 2023
@talldan talldan requested review from gziolo and kevin940726 May 5, 2023 06:41
@talldan talldan self-assigned this May 5, 2023
@talldan
Copy link
Contributor Author

talldan commented May 5, 2023

I wonder why Playwright - 4 is taking so much longer.

It might be good to identify which particular tests are slow and try to improve that.

Copy link
Member

@kevin940726 kevin940726 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me! Thanks!

@Mamaduka
Copy link
Member

Are there any blockers for this PR?

@talldan
Copy link
Contributor Author

talldan commented Jun 14, 2023

@Mamaduka I think it's ok. I'd just noticed that one of the shards was still taking a long time, so I wondered why they weren't more evenly balanced. I couldn't spot any obvious reasons.

I need to revert 14b78c3, then it should be ok to merge, at which point the required checks will have to be updated in github.

@talldan talldan force-pushed the update/e2e-test-sharding branch from 14b78c3 to d8ef29d Compare June 14, 2023 06:03
@Mamaduka
Copy link
Member

I guess a few tests that end up in 4th shard are a bit slow. Maybe in the future, we can monitor/track slow tests and improve as needed.

@jeryj did that recently in #50681.

@talldan
Copy link
Contributor Author

talldan commented Jun 14, 2023

I guess a few tests that end up in 4th shard are a bit slow. Maybe in the future, we can monitor/track slow tests and improve as needed.

In that respect it's probably good to ensure the test files don't get too big, and break them up across multiple files when they do. That would help the sharding work better, as one shard wouldn't spend too long on any individual file.

I personally don't really like the idea of writing longer test cases as they're hard to read and reason about and difficult to refactor.

@Mamaduka
Copy link
Member

I personally don't really like the idea of writing longer test cases as they're hard to read and reason about and difficult to refactor.

It depends on the case, like everything, I guess 😄 But yes, I'm also in the camp "write small and atomic tests".

@talldan
Copy link
Contributor Author

talldan commented Jun 14, 2023

Yeah, still seeing the same where Playwright 4 is taking much longer—10 minutes longer than Playwright 1.

It's noticeable that one of the tests outputs a lot of warning messages (I think it's one of the tests that runs in firefox), so it could be that this makes it run more slowly.

@github-actions
Copy link

Flaky tests detected in d8ef29d.
Some tests passed with failed attempts. The failures may not be related to this commit but are still reported for visibility. See the documentation for more information.

🔍 Workflow run URL: https://github.com/WordPress/gutenberg/actions/runs/5263616230
📝 Reported issues:

@kevin940726
Copy link
Member

I think sharding splits by test files but not test cases. So unless we split a big test file into multiple smaller ones, writing multiple shorter test cases would probably just make it slower 🤔 .

If that ever becomes a bottleneck then perhaps we can figure out a quick convention so that others can follow?

@kevin940726
Copy link
Member

It's noticeable that one of the tests outputs a lot of warning messages (I think it's one of the tests that runs in firefox), so it could be that this makes it run more slowly.

I think that's the case here.

A possible follow-up would be to split firefox and webkit testing to a different shard.

@talldan
Copy link
Contributor Author

talldan commented Jun 14, 2023

So unless we split a big test file into multiple smaller ones, writing multiple shorter test cases would probably just make it slower 🤔 .

Yeah, I'm mainly advocating for breaking things down into shorter test files where it's logical (e.g. when a test has separate describe blocks with distinct setup and teardown).

The comment about shorter test cases was more about maintainability. I realise that will make things slower, but I think there's a balance to be had.

@Mamaduka
Copy link
Member

I think we're onto something here. When listing the tests run by the last shard, most of the tests running in two browsers end up there.

It makes sense that tests running in multiple browsers will take longer than running in one.

You can check this by following the command(s):

# on trunk
npm run test:e2e:playwright -- --shard=2/2 --list

# on this branch
npm run test:e2e:playwright -- --shard=4/4 --list

@jeryj
Copy link
Contributor

jeryj commented Jun 14, 2023

I personally don't really like the idea of writing longer test cases as they're hard to read and reason about and difficult to refactor.

I completely agree for unit tests. I've found for e2e tests that they're a lot easier to work with when combined in a sensible way because they run so much faster and are more effective at catching bugs when they replicate a real user flow. This also helps document the expected behavior in a way that I think is easier to understand. The VS Code Playwright tooling around it is also quite good for stepping through the e2e tests for debugging.

There's definitely a careful balance to which tests make sense to combine together to still keep it isolated enough to understand, debug, and not be flaky. I totally hear that point. Also, I don't mean to make this into a debate around e2e test length, but wanted to share from the perspective of someone who was wholeheartedly in the "keep everything isolated" camp and have shifted my philosophy around it since working with e2e tests.

@talldan
Copy link
Contributor Author

talldan commented Jun 23, 2023

I'll merge this as tests are failing in trunk (though just fixed by #51826) and will need a rebase anyway, so it seems like a good opportunity.

@talldan talldan merged commit 5fa3f81 into trunk Jun 23, 2023
@talldan talldan deleted the update/e2e-test-sharding branch June 23, 2023 03:36
@github-actions github-actions bot added this to the Gutenberg 16.1 milestone Jun 23, 2023
sethrubenstein pushed a commit to pewresearch/gutenberg that referenced this pull request Jul 13, 2023
…Puppeteer tests slower (WordPress#50362)

* Increase playwright shards to 4, decrease puppeteer to 3

* Temporarily enable list reporter

* Revert "Temporarily enable list reporter"

This reverts commit 14b78c3.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Type] Automated Testing Testing infrastructure changes impacting the execution of end-to-end (E2E) and/or unit tests. [Type] Task Issues or PRs that have been broken down into an individual action to take
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants