Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable UTs for Windows while stablizing it #1202

Merged
merged 1 commit into from
Nov 11, 2022

Conversation

cliu123
Copy link
Member

@cliu123 cliu123 commented Nov 7, 2022

Signed-off-by: Chang Liu [email protected]

Description

UTs are unstable on Windows.

Category

[Enhancement, New feature, Bug fix, Test fix, Refactoring, Maintenance, Documentation] Maintenance

Issues Resolved

#1195

Testing

UTs, ITs

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@cliu123 cliu123 requested a review from a team November 7, 2022 22:02
@codecov-commenter
Copy link

Codecov Report

Merging #1202 (ba7f8a7) into main (b451cb2) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##             main    #1202   +/-   ##
=======================================
  Coverage   71.78%   71.78%           
=======================================
  Files          88       88           
  Lines        2027     2027           
  Branches      269      269           
=======================================
  Hits         1455     1455           
  Misses        509      509           
  Partials       63       63           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@cliu123 cliu123 added the backport 2.x backport to 2.x branch label Nov 7, 2022
Copy link
Member

@peternied peternied left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't feel right. The Windows platform is shipping in 2.4, disabling these tests opens shipping a platform build without knowing if its functional. Looking at the related issue there was only reported failure that seems network related, are there other issues?

@cliu123
Copy link
Member Author

cliu123 commented Nov 8, 2022

@peternied The open issue is in the PR description. Requiring an unstable workflow significantly slows down the PRs. The flaky workflow has to be re-run on multiple PRs, which are not right experience. I'd suggest to resolve the flaky tests before enabling the workflow.

@cliu123 cliu123 requested a review from peternied November 9, 2022 21:57
@cliu123
Copy link
Member Author

cliu123 commented Nov 10, 2022

@opensearch-project/security https://github.com/orgs/opensearch-project/teams/security Are we good to temporarily disable UTs on Windows until it gets stablized as it is flaky and needs re-trying on PRs.

@cliu123
Copy link
Member Author

cliu123 commented Nov 10, 2022

This doesn't feel right. The Windows platform is shipping in 2.4, disabling these tests opens shipping a platform build without knowing if its functional. Looking at the related issue there was only reported failure that seems network related, are there other issues?

@peternied BWC tests were disabled when they failed? How is this case different?

@peternied
Copy link
Member

BWC tests were disabled when they failed?

They BWC tests in the security repository have not been disabled, they continue to run and fail, but they are not required to merge changes. If you'd like to change the branch protections in this repository not to require windows tests to pass to merge I think that is a decent work around. If you do so please create a follow up issue to reenable the protections once it has stabilized [1]

How is this case different?

Updating network timeouts or retries on the yarn client are tweaks than can be made in this repository without any outside parties - I recommend we do that. The BWC failures in security are because of a problem in several repository, it cannot be resolved by our team/maintainers directly.

[1] opensearch-project/security#2235

@cliu123
Copy link
Member Author

cliu123 commented Nov 10, 2022

Thanks for the context! Removing Windows UTs from required checks.

I recommend we do that.

@peternied Would you mind taking on this as you have the most context on the UTs on Windows platform?

@peternied
Copy link
Member

@peternied Would you mind taking on this as you have the most context on the UTs on Windoes platform?

Thanks for your confidence but I cannot take this work on

@cliu123
Copy link
Member Author

cliu123 commented Nov 10, 2022

@peternied Would you mind taking on this as you have the most context on the UTs on Windoes platform?

Thanks for your confidence but I cannot take this work on

The error could help narrow down the root cause. @peternied If no plan to fix this, then I don't see value to keep it run for now. It should be disabled until it gets stablized.

2022-11-03T04:08:57.2968807Z info Visit https://yarnpkg.com/en/docs/cli/install for documentation about this command.
2022-11-03T04:08:57.5331813Z info There appears to be trouble with your network connection. Retrying...
2022-11-03T04:10:18.3867635Z ERROR [bootstrap] failed:
2022-11-03T04:10:18.3870994Z ERROR Error: Command failed with exit code 1: C:\npm\prefix\node_modules\yarn\bin\yarn.js install --non-interactive
2022-11-03T04:10:18.3872060Z           at makeError (D:\a\security-dashboards-plugin\security-dashboards-plugin\OpenSearch-Dashboards\packages\osd-pm\dist\index.js:25150:11)
2022-11-03T04:10:18.3879284Z           at handlePromise (D:\a\security-dashboards-plugin\security-dashboards-plugin\OpenSearch-Dashboards\packages\osd-pm\dist\index.js:24085:26)
2022-11-03T04:10:18.3880256Z           at processTicksAndRejections (internal/process/task_queues.js:95:5)
2022-11-03T04:10:18.3881649Z           at async installInDir (D:\a\security-dashboards-plugin\security-dashboards-plugin\OpenSearch-Dashboards\packages\osd-pm\dist\index.js:23821:3)
2022-11-03T04:10:18.3883826Z           at async Project.installDependencies (D:\a\security-dashboards-plugin\security-dashboards-plugin\OpenSearch-Dashboards\packages\osd-pm\dist\index.js:14780:5)
2022-11-03T04:10:18.3890831Z           at async Object.run (D:\a\security-dashboards-plugin\security-dashboards-plugin\OpenSearch-Dashboards\packages\osd-pm\dist\index.js:9003:11)
2022-11-03T04:10:18.3892253Z           at async runCommand (D:\a\security-dashboards-plugin\security-dashboards-plugin\OpenSearch-Dashboards\packages\osd-pm\dist\index.js:57870:5)
2022-11-03T04:10:18.3895322Z           at async Module.run (D:\a\security-dashboards-plugin\security-dashboards-plugin\OpenSearch-Dashboards\packages\osd-pm\dist\index.js:263:3)

@peternied
Copy link
Member

There are several maintainers on this project - maybe someone else can take this work on? @RyanL1997 do you think you could look into the root causes / fixes?

@RyanL1997
Copy link
Collaborator

RyanL1997 commented Nov 11, 2022

According to the investigation I made on this issue, here is the output from my side:

  1. I couldn't constantly reproduce this error.
  2. This a connection issue on the github server. There is nothing we can do to make their connection more stable.
  3. I'm also filling a issue for bug fix and enhancement request on github community to see if they can also offer some help. (see: https://github.com/orgs/community/discussions/38837)

However, I agree with that we can temporarily disable this test case for windows-latest until we figure out a fix/enhancement.

@dblock
Copy link
Member

dblock commented Nov 11, 2022

@RyanL1997 Thanks for articulating the actual problem, and for opening the discussion thread on GH.

@peternied Thanks for insisting on that we don't disable tests because some connectivity is flaky. This feels like the right thing to do.

I think a possible way forward is to keep running these tests and retrying them and seeing if the transient connectivity issue goes away. It's annoying, but it could just be something on the network that someone is working on. If it doesn't go away, I would change the GHA to retry the entire job (I don't know how to do that), and otherwise ignore the job failures on windows only, rather than disabling the tests. @cliu123 @peternied, what do you think? would the latter be acceptable?

@seraphjiang I suggest deleting your comment, and leaving this issue alone to be discussed by the contributors and maintainers of this project.

@cliu123
Copy link
Member Author

cliu123 commented Nov 11, 2022

@dblock I've already changed the branch protection policy. Merging PR no longer requires UTs on Windows to pass now. The workflow fails on most of PRs. As a maintainer, I would rather not to keep a flaky check running and merge things with failures. This would not be the best way to earn trust or confidence on quality from the community. But this is a team decision, if that's what we want to do, I'll leave it as is.

@cliu123
Copy link
Member Author

cliu123 commented Nov 11, 2022

I've discussed with @dblock offline. He no longer has objections on this and will let the maintainers move forward with this.

@cliu123
Copy link
Member Author

cliu123 commented Nov 11, 2022

created an issue to track fix and re-enabling separately: #1211

@cliu123 cliu123 dismissed peternied’s stale review November 11, 2022 21:32

Thanks for the review! I've discussed with @davidlago and @cwperks with this PR. I've created the issue to track fix.

@cliu123 cliu123 merged commit b04271a into opensearch-project:main Nov 11, 2022
@cliu123 cliu123 deleted the disable_UTs_on_Windows branch November 11, 2022 21:33
opensearch-trigger-bot bot pushed a commit that referenced this pull request Nov 11, 2022
Signed-off-by: Chang Liu <[email protected]>
(cherry picked from commit b04271a)
cliu123 added a commit that referenced this pull request Nov 11, 2022
Signed-off-by: Chang Liu <[email protected]>
(cherry picked from commit b04271a)

Co-authored-by: Chang Liu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x backport to 2.x branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants