-
Notifications
You must be signed in to change notification settings - Fork 111
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(hang): Stop blocking some Zebra futures for up to a minute using a CPU busy-loop, Credit: Ziggurat Team (#6763), james_katz (#7000) #7103
Conversation
@mpguerra is it ok if we get this merged before the release? It would finish off all our current network bugs, and get Zebra ready for re-testing. |
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #7103 +/- ##
==========================================
- Coverage 77.38% 77.35% -0.03%
==========================================
Files 310 310
Lines 41795 41796 +1
==========================================
- Hits 32343 32332 -11
- Misses 9452 9464 +12 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I was going to check this! |
04c49f5
to
45dbb42
Compare
I got the elapsed time calculation wrong, the tests should be fixed now. |
* Update license description in README for MIT-only crates * Draft changelog with trivial issues * Remove trivial issues * Update changelog entries as of commit 2a31972 and PR #7103 * Update mainnet and testnet checkpoints as of 2023-06-30 * chore: Release * Estimate release height for Zebra v1.0.1 Block height 2139118 at 2023-06-30 01:55:38 UTC Release is likely to be 2023-07-01 2139118 + 1152 * 3 = 2142574 Then round up to the nearest 1000.
Motivation
Since PR #6235 was merged in March 2023, Zebra's progress logging future has been busy-waiting in a loop for at least 55 seconds out of every minute.
Since this loop doesn't call into the
tokio
executor, it never yields, so it blocks all other futures running on that thread.This is a likely cause for these bugs:
Close #6763
Close #7000
It might also possibly be causing these bugs, or making them worse:
Specifications
https://docs.rs/tokio/latest/tokio/#cpu-bound-tasks-and-blocking-code
Complex Code or Requirements
tokio
uses cooperative multitasking with internal yields, but onlytokio
APIs can yield totokio
:https://docs.rs/tokio/latest/tokio/task/index.html#cooperative-scheduling
Solution
Sleep before continuing the loop.
Use monotonic
Instant
times, so that changing the computer's clock doesn't cause progress updates to stop.Review
This seems like it might be a release blocker, given the number of bugs it impacts.
Reviewer Checklist
Follow Up Work
We should ask users to re-test and see if any of these bugs are still happening.