fix(ci): Increase test launch and shutdown times to fix CI failures #7499

Merged
mergify bot merged 2 commits into main from fix-slow-test-failures
Sep 7, 2023

Conversation

teor2345
Contributor

@teor2345 teor2345 commented Sep 5, 2023

Motivation

I've seen at least 3 CI failures due to slow Zebra startup or shutdown over the last few days:

  • rpc_conflict_test
  • non_blocking_logger

In one of these failures, the RPC server task doesn't seem to run at all, so I added extra logging to help diagnose that.

Close #7498.
(If this fix doesn't work we can re-open that ticket.)

Solution

  • Make the tests wait longer for startup and shutdown
  • Add logs to the start command for every spawned and shut down task
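
As a rough illustration of the two changes above (not the actual diff), here is a minimal std-only sketch. It assumes a hypothetical `LAUNCH_DELAY` constant and hypothetical `spawn_logged` and `wait_for_launch` helpers; none of these names come from Zebra's code, and the real tests and `start.rs` may be structured quite differently.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical launch delay for tests; not Zebra's actual constant or value.
const LAUNCH_DELAY: Duration = Duration::from_secs(2);

/// Spawns a named task, logging when it is spawned and when it shuts down.
/// Returns a receiver that gets one message once the task has started,
/// plus the join handle for the task. (Hypothetical helper.)
fn spawn_logged(
    name: &'static str,
    startup_work: Duration,
) -> (mpsc::Receiver<()>, thread::JoinHandle<()>) {
    let (ready_tx, ready_rx) = mpsc::channel();
    println!("spawning {name} task");
    let handle = thread::spawn(move || {
        // Simulated slow startup.
        thread::sleep(startup_work);
        // Ignore send errors: the waiter may have given up already.
        let _ = ready_tx.send(());
        println!("{name} task shut down");
    });
    (ready_rx, handle)
}

/// Waits up to `timeout` for the task to signal that it has started.
fn wait_for_launch(ready: &mpsc::Receiver<()>, timeout: Duration) -> bool {
    ready.recv_timeout(timeout).is_ok()
}

fn main() {
    let (ready, handle) = spawn_logged("rpc", Duration::from_millis(50));
    let launched = wait_for_launch(&ready, LAUNCH_DELAY);
    println!("launched in time: {launched}");
    handle.join().expect("task panicked");
}
```

The point of the sketch is that a longer timeout only helps when the task is slow, while the spawn and shutdown logs show whether the task ran at all.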

Review

This is causing CI failures every 1-2 days, so it's urgent.

Reviewer Checklist

  • Are the PR labels correct?
  • Does the code do what the ticket and PR say?
    • Does it change concurrent code, unsafe code, or consensus rules?
  • How do you know it works? Does it have tests?

@teor2345 teor2345 added labels Sep 5, 2023:
  • C-bug (Category: This is a bug)
  • P-High 🔥
  • I-integration-fail (Continuous integration fails, including build and test failures)
  • A-diagnostics (Area: Diagnosing issues or monitoring performance)
  • A-rpc (Area: Remote Procedure Call interfaces)
  • C-trivial (Category: A trivial change that is not worth mentioning in the CHANGELOG)
  • A-concurrency (Area: Async code, needs extra work to make it work properly.)
@teor2345 teor2345 requested a review from a team as a code owner September 5, 2023 23:26
@teor2345 teor2345 self-assigned this Sep 5, 2023
@teor2345 teor2345 requested review from upbqdn and removed request for a team September 5, 2023 23:26
mergify bot added a commit that referenced this pull request Sep 6, 2023
@mergify mergify bot merged commit 29d8e90 into main Sep 7, 2023
@mergify mergify bot deleted the fix-slow-test-failures branch September 7, 2023 00:04
@upbqdn upbqdn mentioned this pull request Sep 22, 2023
38 tasks
arya2 pushed a commit that referenced this pull request Sep 29, 2023
…7499)

* Increase launch times to help fix failures in rpc_conflict and non_blocking_logger tests

* Add extra task spawn and shutdown logs to start.rs
Development

Successfully merging this pull request may close these issues.

bug: RPC task is very slow or hangs before spawning RPC server in tests
2 participants