Keep track of background peer tasks #3253
I'm marking this as draft because the tests are failing, and it depends on another PR merging.
This looks like a good design, but I have some questions about type-safety and test coverage.
I'd like to merge the following PR before we merge this PR:
- to see coverage on this PR: Re-enable code coverage comments on PRs #3246
Do you think we should merge these test/fix PRs before the changes in this PR?
- test coverage: Add PeerSet readiness and request future cancel-safety tests #3252
- peer-set related bug fixes: Fix task handling bugs, so peers are more likely to be available #3191
33ba182 to 69b3933
c174644 to 057d162
69b3933 to d40ed78
I would like to make it easier to maintain this code, and avoid the risk of subtle bugs due to future refactors.
I'll see if I can fix it today.
d40ed78 to 52e22e2
Move the code that's part of the heartbeat task into a separate helper function.
Keep it closer to where it's actually used, and make it easier to add new fields to `Client` for the connection and heartbeat tasks.
Prepare it to be able to check for panics or errors from the background tasks.
Spawn simple timeout tasks as mock connection and heartbeat tasks.
Building a `ClientTestHarness` requires a Tokio runtime to be set up, so the calls were moved into the `async` block.
Make the code reusable for both background tasks.
Periodically poll it to check if the task has unexpectedly stopped.
The client service should stop if the connection background task has exited, because then it's not able to receive any replies.
Wrap the background tasks in `Abortable`, so that they can be aborted through the `ClientTestHarness`.
Check that stopping the background connection task is something that the `Client` instance detects and handles correctly.
Check that stopping the background heartbeat task is something that the `Client` instance detects and handles correctly.
Will be used later to create background tasks that panic.
Use a mock background connection task that panics immediately, and check that the `Client` handles it gracefully.
Use a mock background heartbeat task that panics immediately, and check that the `Client` handles it gracefully.
The previously linked issue was a broad plan to improve Zebra's shutdown behavior, while the new issue is more specific, and can be scheduled sooner.
Co-authored-by: teor <[email protected]>
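The commit notes above mention wrapping the mock background tasks in `Abortable` so the test harness can stop them. As a rough illustration of that idea only, here is a std-only sketch. `AbortableTask`, `AbortHandle`, `Aborted`, `abortable`, and `poll_once` are hand-rolled, hypothetical stand-ins that loosely mirror the names in the `futures` crate; this is not the PR's code and not the real `Abortable` implementation (which also wakes the task when it is aborted).

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Returned when the wrapped task was aborted (hypothetical stand-in).
#[derive(Debug, PartialEq)]
struct Aborted;

/// Test-side handle used to stop a mock background task.
struct AbortHandle(Arc<AtomicBool>);

impl AbortHandle {
    fn abort(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

/// Simplified abortable wrapper: it checks the abort flag on every poll.
/// Assumption: unlike the real `Abortable`, it does not wake the task on abort.
struct AbortableTask<F> {
    task: Pin<Box<F>>,
    aborted: Arc<AtomicBool>,
}

/// Wrap `task`, returning the wrapped task and a handle that can abort it.
fn abortable<F: Future>(task: F) -> (AbortableTask<F>, AbortHandle) {
    let flag = Arc::new(AtomicBool::new(false));
    let wrapped = AbortableTask { task: Box::pin(task), aborted: flag.clone() };
    (wrapped, AbortHandle(flag))
}

impl<F: Future> Future for AbortableTask<F> {
    type Output = Result<F::Output, Aborted>;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // An aborted task reports `Aborted` instead of making further progress.
        if self.aborted.load(Ordering::SeqCst) {
            return Poll::Ready(Err(Aborted));
        }
        self.task.as_mut().poll(cx).map(Ok)
    }
}

/// Waker that does nothing, so a future can be polled by hand in a test.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll a future exactly once, outside of any async runtime.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

In a test, the harness would keep the `AbortHandle`, call `abort()`, and then check that the code owning the task notices that it has stopped.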
52e22e2 to 235219b
Looks great, thanks for the panic fixes!
Motivation
When setting up a `Client` service for a peer connection, some background tasks are spawned. These tasks can `panic`, and if the spawned task's `JoinHandle` isn't polled, Zebra might miss the error.
This closes #3199, but currently depends on #3241 being merged first.
Solution
Store a `JoinHandle` for each task spawned for the peer connection inside the `Client` type, and poll them inside `poll_ready`.
Review
@teor2345
Reviewer Checklist
Follow Up Work
`JoinHandle::abort` method instead?
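As a rough illustration of the approach described under Solution (one handle per background task stored in the client, all of them polled in `poll_ready`), here is a std-only sketch. It is not the PR's code: `TaskHandle` is a hypothetical boxed future standing in for a Tokio `JoinHandle`, the `String` error stands in for a join error, and `ClientError`, `Client::new`, and `poll_once` are names invented for this example.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for a task's `JoinHandle`: resolves when the task
/// exits, with `Err` carrying a panic or error message.
type TaskHandle = Pin<Box<dyn Future<Output = Result<(), String>>>>;

#[derive(Debug, PartialEq)]
enum ClientError {
    ConnectionTaskExited,
    HeartbeatTaskExited,
    TaskFailed(String),
}

struct Client {
    connection_task: TaskHandle,
    heartbeat_task: TaskHandle,
}

impl Client {
    fn new(
        connection_task: impl Future<Output = Result<(), String>> + 'static,
        heartbeat_task: impl Future<Output = Result<(), String>> + 'static,
    ) -> Self {
        Client {
            connection_task: Box::pin(connection_task),
            heartbeat_task: Box::pin(heartbeat_task),
        }
    }

    /// Report readiness only while both background tasks are still running.
    /// A real client should record the failure rather than poll a finished
    /// handle again; this sketch assumes `poll_ready` is not called after an error.
    fn poll_ready(&mut self, cx: &mut Context<'_>) -> Poll<Result<(), ClientError>> {
        if let Poll::Ready(exit) = self.connection_task.as_mut().poll(cx) {
            // Without the connection task, no replies can be received.
            let error = exit
                .err()
                .map_or(ClientError::ConnectionTaskExited, ClientError::TaskFailed);
            return Poll::Ready(Err(error));
        }
        if let Poll::Ready(exit) = self.heartbeat_task.as_mut().poll(cx) {
            let error = exit
                .err()
                .map_or(ClientError::HeartbeatTaskExited, ClientError::TaskFailed);
            return Poll::Ready(Err(error));
        }
        Poll::Ready(Ok(()))
    }
}

/// Waker that does nothing, so `poll_ready` can be driven by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Call `poll_ready` once, outside of any async runtime.
fn poll_once(client: &mut Client) -> Poll<Result<(), ClientError>> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    client.poll_ready(&mut cx)
}
```

Polling the handles inside `poll_ready` means a stopped or panicked task is noticed the next time a caller checks whether the service is ready, instead of going unobserved.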