Random Testing on main and version branches #245

nknize · 2021-08-17T18:18:58Z

OpenSearch org needs random testing on the main and supported unreleased minor version branches (e.g., 1x). Without it, PRs bear the burden of finding corner case bugs introduced in merged features.

This just occurred in opensearch-project/OpenSearch#1073, where the SimpleFS deprecation PR failed twice w/ a corner case bug not caught in the original PR. Without random testing to catch corner cases like this, the project is at the mercy of the law of large numbers: situations like these will occur more frequently and very few PR checks will succeed.

Outstanding tasks:

I will utilize the core team channel and @channel in slack msg, which already have the fork Jenkins links to run
In Git Issues I will include the error msg stacktrace, the seed, and the build number for now

dblock · 2021-08-30T16:48:10Z

I'm a little worried that we are trying to include this in scope for 1.1, given that we're still not running any bundle tests (integ, perf, bwc) daily. Definitely can come next after those.

nknize · 2021-08-30T22:23:36Z

This is independent of any release. We don't have any random test coverage, which leaves PRs as the first (and only) line of defense for catching random seeded failures. We can't continue as a project like this. We need random testing on main, 1.x, (and 1.1 when we cut the branch) so we have adequate test coverage for "corner cases".

nknize · 2021-08-30T22:26:13Z

Case in point:

./gradlew ':qa:repository-multi-version:v7.10.0#Step3OldClusterTest' --tests "org.opensearch.upgrades.MultiVersionRepositoryAccessIT.testCreateAndRestoreSnapshot" -Dtests.seed=7303D305178F6209 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1" -Dtests.locale=hi -Dtests.timezone=Pacific/Apia -Druntime.java=15

I was able to reproduce this test failure on all three branches (including the 1.0 release branch), which means it was introduced sometime before 1.0 GA. It was only just now caught because of the version bump PR. Random testing should've caught this far earlier than a PR.

peternied · 2021-09-28T19:11:54Z

[Triage] @nknize do you have an approach of how this testing would be conducted - could this be done in a GHA or does this need support for a new kind of execution / reporting?

nknize · 2021-09-28T19:29:51Z

I was originally thinking like a temporal trigger (say every 4/6 hours or so; 4 times a day)? Perhaps that's too often? Maybe we run it twice a day? I would say we could be more conservative and set it up as an intake check (e.g., every merge)? But we don't have enough merged contributions for that to give the coverage needed, so maybe a 2/4 times a day random testing is good until we have a lot of merge activity and we switch to an intake check?

peternied · 2021-09-28T20:25:33Z

Could these tests be run in GHA? It supports cron syntax as we use it in this repository.

dblock · 2021-09-28T21:03:54Z

There's just not enough hardware in GHA IMO, and Jenkins can do cron just fine too.

peterzhuamazon · 2021-10-07T18:27:27Z

Added randomized test to gradle check. Every 12 hours we will run gradle check against the head of 1.x and main.
Failure messages will be sent to a Slack channel.
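For illustration only, a scheduled run like the one described above could look roughly like the following dry-run sketch. The actual Jenkins job is not shown in this thread, so the helper names `random_seed` and `check_command`, the checkout step, and the choice to pass an explicit seed are all assumptions. Only the branch names and the `-Dtests.seed` property come from the thread. The script prints the commands instead of running them.

```shell
# Dry-run sketch: prints the commands a scheduled job might run.
# It does not invoke git or Gradle.

# Hypothetical helper: produce a 16-digit uppercase hex seed, the same shape
# as the seed 7303D305178F6209 quoted earlier in this thread.
random_seed() {
  od -An -N8 -tx1 /dev/urandom | tr -d ' \n' | tr 'a-f' 'A-F'
}

# Hypothetical helper: build the check command for a given seed.
check_command() {
  printf './gradlew check -Dtests.seed=%s\n' "$1"
}

# Branch names come from the thread; the checkout step is an assumption.
for branch in main 1.x; do
  seed="$(random_seed)"
  echo "[$branch] git checkout $branch"
  echo "[$branch] $(check_command "$seed")"
done
```

Passing the seed explicitly is one way to make sure the value is on record next to the build, so a failure can be replayed later with the same `-Dtests.seed`.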

dblock · 2021-10-07T19:22:33Z

Where's the PR/commit/doc for this @peterzhuamazon ?

peterzhuamazon · 2021-10-27T23:27:15Z

Success with Issue creation on OpenSearch repo:
opensearch-project/OpenSearch#1454

peterzhuamazon · 2021-10-28T16:07:50Z

More actions after talking to DB:

I will utilize the core team channel and @channel in slack msg, which already have the fork Jenkins links to run
In Git Issues I will include the error msg stacktrace, the seed, and the build number for now
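As a rough sketch of the second item, the seed could be pulled out of a failing command line and combined with the build number and error text into an issue body. The helpers `extract_seed` and `issue_body`, the body layout, and the build number `42` are hypothetical placeholders. The sample line is a shortened form of the reproduce command quoted earlier in this thread.

```shell
# Hypothetical helper: pull the value of -Dtests.seed out of text on stdin.
extract_seed() {
  sed -n 's/.*-Dtests\.seed=\([0-9A-Fa-f]*\).*/\1/p'
}

# Hypothetical helper: assemble a plain-text issue body from a build number,
# a seed, and the error text. The layout is an assumption.
issue_body() {
  build_number="$1"
  seed="$2"
  error_text="$3"
  printf 'Build number: %s\nSeed: %s\nError:\n%s\n' "$build_number" "$seed" "$error_text"
}

# Sample input: shortened form of the command quoted earlier in this thread.
line="./gradlew ':qa:repository-multi-version:v7.10.0#Step3OldClusterTest' -Dtests.seed=7303D305178F6209 -Dtests.locale=hi"

seed="$(printf '%s\n' "$line" | extract_seed)"

# 42 and the error text are placeholders, not real build data.
issue_body 42 "$seed" "stack trace goes here"
```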

peternied · 2021-11-08T19:55:35Z

[Triage] Many concerns of this task will be resolved when Jenkins is public (opensearch-project/opensearch-ci#3). We will prioritize that work first; afterward this can be reopened if it comes up.

nknize added the enhancement, v1.1.0, v2.0.0, and untriaged labels and removed the untriaged label Aug 17, 2021

mch2 mentioned this issue Sep 8, 2021

Run gradle checks on 1.x opensearch-project/OpenSearch#798

Closed

peternied removed v1.1.0 v2.0.0 labels Sep 28, 2021

peterzhuamazon mentioned this issue Oct 7, 2021

[Dummy] Test gradle check opensearch-project/OpenSearch#1341

Closed

5 tasks

peterzhuamazon added the cicd label Oct 7, 2021

abhinavGupta16 assigned peterzhuamazon Oct 25, 2021

peternied closed this as completed Nov 8, 2021
