Random Testing on main and version branches #245

Closed
2 tasks
nknize opened this issue Aug 17, 2021 · 12 comments
Labels: cicd, enhancement

Comments

@nknize
Contributor

nknize commented Aug 17, 2021

The OpenSearch org needs random testing on the main and supported unreleased minor version branches (e.g., 1.x). Without it, PRs bear the burden of finding corner-case bugs introduced in merged features.

This just occurred in opensearch-project/OpenSearch#1073, where the SimpleFS deprecation PR failed twice with a corner-case bug not caught in the original PR. Without random testing to catch corner cases like this, the project is at the mercy of the law of large numbers: situations like these will occur more frequently, and very few PR checks will succeed.

Outstanding tasks:

  • I will use the core team channel and @channel in the Slack msg, which already has the fork Jenkins links to run
  • In GitHub issues I will include the error msg, the stack trace, the seed, and the build number for now
@nknize added the enhancement, v1.1.0, v2.0.0, and untriaged labels and removed the untriaged label on Aug 17, 2021
@dblock
Member

dblock commented Aug 30, 2021

I'm a little worried that we are trying to include this in scope for 1.1, given that we're still not running any bundle tests (integ, perf, bwc) daily. It can definitely come next after those.

@nknize
Contributor Author

nknize commented Aug 30, 2021

This is independent of any release. We don't have any random test coverage, which leaves PRs as the first (and only) line of defense for catching randomly seeded failures. We can't continue as a project like this. We need random testing on main, 1.x (and 1.1 when we cut the branch) so we have adequate test coverage for "corner cases".

@nknize
Contributor Author

nknize commented Aug 30, 2021

Case in point:

./gradlew ':qa:repository-multi-version:v7.10.0#Step3OldClusterTest' --tests "org.opensearch.upgrades.MultiVersionRepositoryAccessIT.testCreateAndRestoreSnapshot" -Dtests.seed=7303D305178F6209 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1" -Dtests.locale=hi -Dtests.timezone=Pacific/Apia -Druntime.java=15

I was able to reproduce this test failure on all three branches (including the 1.0 release branch), which means it was introduced sometime before 1.0 GA. It was only caught just now because of the version bump PR. Random testing should've caught this far earlier than a PR.
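
For readers less familiar with seeded random testing, the role of the -Dtests.seed value in the command above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the OpenSearch test framework: new_seed, generate_case, and run_once are hypothetical names, and the only thing borrowed from the command is the 16-hex-digit seed format. The point is that every random choice is derived from the seed, so a failing run can be replayed exactly on any branch.

```python
import random


def new_seed() -> str:
    """Return a fresh 64-bit seed as 16 uppercase hex digits (assumed format)."""
    return f"{random.SystemRandom().getrandbits(64):016X}"


def generate_case(seed_hex: str) -> list:
    """Derive a pseudo-random test input entirely from the seed (hypothetical helper)."""
    rng = random.Random(int(seed_hex, 16))  # every random choice flows from the seed
    length = rng.randint(1, 50)
    return [rng.randint(-1000, 1000) for _ in range(length)]


def run_once(seed_hex: str) -> bool:
    """Toy property under test: sorting an already sorted list changes nothing."""
    case = generate_case(seed_hex)
    return sorted(sorted(case)) == sorted(case)


if __name__ == "__main__":
    seed = new_seed()
    # Report the seed whether the run passes or fails, so a failure can be replayed.
    print(f"seed={seed} passed={run_once(seed)}")
```

Calling generate_case twice with the same seed string yields the same input, which is what makes a reported seed sufficient to reproduce a failure.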

@peternied
Member

[Triage] @nknize do you have an approach for how this testing would be conducted? Could this be done in a GHA, or does it need support for a new kind of execution/reporting?

@nknize
Contributor Author

nknize commented Sep 28, 2021

I was originally thinking of a temporal trigger (say every 4 to 6 hours or so; about 4 times a day). Perhaps that's too often? Maybe we run it twice a day? We could be more conservative and set it up as an intake check (e.g., on every merge), but we don't have enough merged contributions for that to give the coverage needed. So maybe random testing 2 to 4 times a day is good until we have a lot of merge activity, and then we switch to an intake check?

@peternied
Member

Could these tests be run in GHA? It supports cron syntax as we use it in this repository.

@dblock
Member

dblock commented Sep 28, 2021

There's just not enough hardware in GHA IMO, and Jenkins can do cron just fine too.

@peterzhuamazon
Member

Added randomized testing to gradle check: every 12 hours we will run gradle check against the head of 1.x and main.
Failure messages will be sent to the Slack channel.
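
As a rough sketch of what one scheduled run could look like (the actual Jenkins job is not shown in this thread), the snippet below plans a gradle check invocation per branch with a fresh seed. The branch list comes from the comment above and -Dtests.seed from the reproduction command earlier in the thread; plan_runs, new_seed, and the dictionary layout are hypothetical, and the commands are only built, not executed.

```python
import secrets
from typing import Dict, List

# Branches named in the comment above; the 12-hour trigger itself would live in the scheduler.
BRANCHES = ["main", "1.x"]


def new_seed() -> str:
    """Fresh 64-bit seed as 16 uppercase hex digits (assumed format)."""
    return f"{secrets.randbits(64):016X}"


def gradle_check_command(seed: str) -> List[str]:
    """Build the command line for one randomized run; nothing is executed here."""
    return ["./gradlew", "check", f"-Dtests.seed={seed}"]


def plan_runs(branches: List[str] = BRANCHES) -> List[Dict[str, object]]:
    """Return one planned run per branch, each with its own seed (hypothetical layout)."""
    runs = []
    for branch in branches:
        seed = new_seed()
        runs.append({"branch": branch, "seed": seed, "command": gradle_check_command(seed)})
    return runs


if __name__ == "__main__":
    for run in plan_runs():
        print(run["branch"], " ".join(run["command"]))
```

Recording the seed next to the branch in each planned run keeps the information needed to reproduce a failure alongside the failure report.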

@dblock
Member

dblock commented Oct 7, 2021

Where's the PR/commit/doc for this, @peterzhuamazon?

@peterzhuamazon
Member

Success with Issue creation on OpenSearch repo:
opensearch-project/OpenSearch#1454

@peterzhuamazon
Member

More actions after talking to DB:

  • I will use the core team channel and @channel in the Slack msg, which already has the fork Jenkins links to run
  • In GitHub issues I will include the error msg, the stack trace, the seed, and the build number for now
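
A minimal sketch of how such an issue could be assembled from those fields is shown below. It is an assumption for illustration only: format_failure_issue, the title wording, and the body layout are hypothetical and do not reflect the format of the issue linked above.

```python
from typing import Dict


def format_failure_issue(branch: str, build_number: int, seed: str, stacktrace: str) -> Dict[str, str]:
    """Assemble a title and body carrying the fields listed above (hypothetical layout)."""
    title = f"Randomized gradle check failed on {branch} (build {build_number})"
    body = "\n".join(
        [
            f"Branch: {branch}",
            f"Build number: {build_number}",
            f"Seed: {seed}",
            "",
            "Error message and stack trace:",
            stacktrace.strip(),
        ]
    )
    return {"title": title, "body": body}


if __name__ == "__main__":
    # Placeholder values, not taken from a real build.
    issue = format_failure_issue(
        "main", 1, "7303D305178F6209", "java.lang.AssertionError: example\n\tat Example.test(Example.java:1)"
    )
    print(issue["title"])
    print(issue["body"])
```

Keeping the seed on its own line makes it easy to copy into a -Dtests.seed argument when reproducing the failure locally.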

@peternied
Member

[Triage] Many concerns of this task will be resolved when Jenkins is public (opensearch-project/opensearch-ci#3). We will prioritize that work first; afterward, this can be reopened if it comes up.
