test_runner: automatically rerun flaky tests #3880

Merged: 4 commits, Apr 4, 2023

bayandin
Member

@bayandin bayandin commented Mar 25, 2023

Describe your changes

This PR adds a plugin that automatically reruns flaky tests (up to 3 times). Internally, it uses data from the TEST_RESULT_CONNSTR database and the pytest-rerunfailures plugin.

We consider a test flaky if it has different statuses (passed and failed) for the same revision across test runs on the main branch.
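
The rule above can be illustrated with a minimal sketch. The record shape (test, revision, status) and the names `RunRecord` and `find_flaky` are assumptions made for illustration; they are not the database schema or the code used by scripts/flaky_tests.py.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class RunRecord(NamedTuple):
    # Hypothetical record shape; the real database schema may differ.
    test: str
    revision: str
    status: str  # e.g. "passed" or "failed"


def find_flaky(runs: Iterable[RunRecord]) -> set:
    """Return tests that both passed and failed on the same revision."""
    statuses = defaultdict(set)
    for run in runs:
        # Group the observed statuses by (test, revision).
        statuses[(run.test, run.revision)].add(run.status)
    return {
        test
        for (test, _revision), seen in statuses.items()
        if {"passed", "failed"} <= seen
    }
```

Note that a test which passes on one revision and fails on another is not counted here, since that may simply reflect a real regression or fix.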

Flaky tests are fetched by the scripts/flaky_tests.py script (it can also be used standalone to learn which tests are flaky) and stored in a JSON file, which is then passed to the pytest plugin.

$ poetry run scripts/flaky_tests.py postgres://soft-base-394976.cloud.neon.tech:5432/main
connecting to the database...
fetching flaky tests...
	test_runner/performance/test_branch_creation.py::test_branch_creation_heavy_write[20]
	test_runner/regress/test_compatibility.py::test_forward_compatibility
	test_runner/regress/test_ondemand_download.py::test_ondemand_download_large_rel[real_s3]
	test_runner/regress/test_ondemand_download.py::test_ondemand_download_timetravel[real_s3]
	test_runner/regress/test_remote_storage.py::test_remote_storage_backup_and_restore[real_s3]
	test_runner/regress/test_tenant_detach.py::test_detach_while_attaching[real_s3]
	test_runner/regress/test_tenant_detach.py::test_tenant_reattach[real_s3]
	test_runner/regress/test_tenant_size.py::test_get_tenant_size_with_multiple_branches
	test_runner/regress/test_tenants.py::test_pageserver_with_empty_tenants[real_s3]
	test_runner/regress/test_wal_acceptor.py::test_wal_backup[real_s3]
saving results to flaky.json
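
One way the JSON file could be consumed on the pytest side is sketched below. It assumes the file is a flat JSON list of test node IDs and that its path arrives through a hypothetical `--flaky-tests-json` option; neither detail is confirmed by this description, and the actual plugin may differ. The `flaky(reruns=...)` marker is the one provided by pytest-rerunfailures.

```python
import json
from pathlib import Path

MAX_RERUNS = 3  # "up to 3 times", as stated in the description


def load_flaky_ids(path):
    """Read node IDs from a JSON file (assumed format: a flat list of strings)."""
    return set(json.loads(Path(path).read_text()))


def pytest_addoption(parser):
    # Hypothetical option name, used only for this sketch.
    parser.addoption("--flaky-tests-json", action="store", default=None)


def pytest_collection_modifyitems(config, items):
    path = config.getoption("--flaky-tests-json")
    if not path:
        return  # no file given: leave all tests untouched

    import pytest  # imported lazily so load_flaky_ids works without pytest

    flaky_ids = load_flaky_ids(path)
    for item in items:
        # Assumes node IDs in the file match pytest's own node IDs.
        if item.nodeid in flaky_ids:
            # Marker provided by pytest-rerunfailures.
            item.add_marker(pytest.mark.flaky(reruns=MAX_RERUNS))
```

Keeping the list in a data file means it can be regenerated for each CI run without touching the test code itself.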

Conveniently, the PR test run encountered such a flaky test, and it was run three times: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-3880/debug/4542694447/index.html#categories/f08a7e35e427fe455a88f4ee0deda892/20612f8b60a53a90/retries

Issue ticket number and link

N/A

Checklist before requesting a review

  • I have performed a self-review of my code.
  • If it is a core feature, I have added thorough tests.
  • Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
  • If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

@bayandin bayandin force-pushed the bayandin/rerun-flaky-tests branch 14 times, most recently from 7a04c65 to b058cba Compare March 27, 2023 23:27
@bayandin bayandin changed the title WIP: test_runner: rerun flaky tests test_runner: automatically rerun flaky tests Mar 27, 2023
@arssher
Contributor

arssher commented Mar 28, 2023

Hm, I'm not aware of test_wal_backup being flaky. Seems like this queries allure db runs on main? Is there a quick way to see those runs in allure?

@bayandin
Member Author

> Hm, I'm not aware of test_wal_backup being flaky.

Given that most of the flaky tests are related to real_s3, it could have been caused by the network or S3 itself.

> Seems like this queries allure db runs on main?

Yes, just so that experiments don't affect the statistics.

> Is there a quick way to see those runs in allure?

https://neon-github-public-dev.s3.amazonaws.com/reports/main/debug/4539605341/index.html#suites/82004ab4e3720b47bf78f312dabe7c55/dad923cb652c5a2c/history

@bayandin bayandin force-pushed the bayandin/rerun-flaky-tests branch 3 times, most recently from 76b1b83 to 9d45cd9 Compare March 28, 2023 11:49
@bayandin bayandin marked this pull request as ready for review March 28, 2023 13:00
@bayandin bayandin force-pushed the bayandin/rerun-flaky-tests branch 2 times, most recently from 84f8316 to 1da6a07 Compare March 30, 2023 21:04
@bayandin bayandin requested a review from vadim2404 March 30, 2023 21:07
@koivunej
Member

Re: #3880 (comment)

> > Hm, I'm not aware of test_wal_backup being flaky.

> Given that most of the flaky tests are related to real_s3, it could have been caused by the network or S3 itself.

I think we have a problem where many tests hit the 10s shutdown timeout with real_s3, even though they have a preceding wait_for_upload (or similar) with a larger timeout. There's a race that makes wait_for_upload (or similar) complete instantly, even though we will soon compact and need to upload a bunch of stuff. Context: #3697 (comment). I could just as well be wrong here; I had to context switch and have just been linking back to that comment :)

Yeah, this kind of makes me worried that we will miss issues like this, assuming it's a real issue. We would still see the retried tests in some dashboard?

@bayandin bayandin force-pushed the bayandin/rerun-flaky-tests branch from 1da6a07 to c8f5d02 Compare April 2, 2023 14:54
@bayandin
Member Author

bayandin commented Apr 2, 2023

> We would still see the retried tests in some dashboard?

Tests that were automatically rerun will be posted in a PR comment (after #3907).
Later, I'll add them to a Grafana dashboard.

@bayandin bayandin requested a review from vadim2404 April 4, 2023 10:25
@bayandin bayandin merged commit 105b8bb into main Apr 4, 2023
@bayandin bayandin deleted the bayandin/rerun-flaky-tests branch April 4, 2023 11:21
ansrivas pushed a commit to ansrivas/neon that referenced this pull request Apr 11, 2023