Process abort in state tests on some macOS versions with Rust 1.70 #6812
Comments
@mpguerra this is technically not a release blocker because it's on macOS. But crashes can be a sign of concurrency bugs or memory corruption.
We could split this ticket into:
Because it's only tests that use the state that are failing on macOS.
Hey team! Please add your planning poker estimate with Zenhub @arya2 @oxarbitrage @teor2345 @upbqdn
@mpguerra I think this ticket should be split into two tickets: re-enabling builds, and re-enabling tests. They have different priorities and different estimates.
I just checked building on my macOS machine (Ventura 13.4.1) and the build works fine with Then, to my surprise, when I run I think it would be worth a try to just enable what we disabled in the CI in a draft PR and see what happens there.
Sure! If it all works in CI with the latest Rust compiler, then we don't have to split the tickets or do anything complicated. But I couldn't reproduce the CI bug on my local machine when I opened the ticket. So I wouldn't be surprised if it fails just in CI. (I've also added branch protection rules to the ticket.)
Build and test also works locally on my macOS Ventura 13.6 M1 machine, with Rust 1.72.0. Let's see how CI goes.
Looks like that worked! macOS seems to be the longest job though. Maybe that will go down once it builds on the main branch and its cache gets used. If it doesn't, we can open a devops ticket for larger macOS runners:
fixed by #7843
It's down to 1 hour in the latest build, so I think we're fine here:
Tasks
Motivation
Zebra's state tests are crashing on macOS across multiple PRs:
https://github.com/ZcashFoundation/zebra/actions/runs/5148228655/jobs/9269653348#step:15:3285
https://github.com/ZcashFoundation/zebra/actions/runs/5151593461/jobs/9276906631?pr=6810#step:15:3274
Investigation
This appears to be an environment issue, because it wasn't failing on the main branch when that code was originally merged: https://github.com/ZcashFoundation/zebra/actions/runs/5144319422/jobs/9260464558
Runners started being updated to new software versions on June 2:
actions/runner-images#7660
The macOS and image versions on failing and successful runners are the same:
https://github.com/ZcashFoundation/zebra/actions/runs/5144319422/jobs/9260464558#step:1:4
https://github.com/ZcashFoundation/zebra/actions/runs/5151593461/jobs/9276906631?pr=6810#step:1:4
But the Rust versions are different:
https://github.com/ZcashFoundation/zebra/actions/runs/5144319422/jobs/9260464558#step:5:18
https://github.com/ZcashFoundation/zebra/actions/runs/5151593461/jobs/9276906631?pr=6810#step:5:34
Teor can't reproduce this bug on their local machines with Rust 1.70, so it might be processor or OS-version dependent:
Diagnosis
We might need to make Zebra compatible with Rust 1.70 and later on macOS 12 and earlier.
We could disable these tests until we do, because macOS is not a supported Zebra platform.
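One possible way to disable the affected tests only on macOS is sketched below. This is an illustration under stated assumptions, not how Zebra actually handled it: the test name, the helper function, and the ignore message are all hypothetical. It relies on Rust's standard `cfg_attr` and `ignore` attributes, so the test still compiles everywhere but is skipped by default when the target OS is macOS.

```rust
/// Hypothetical helper (not part of Zebra): a runtime check that a test
/// harness could use to return early on an affected platform.
pub fn state_tests_enabled(os: &str) -> bool {
    // Assumption: only macOS is affected, as reported in this ticket.
    os != "macos"
}

/// Placeholder standing in for a real state test.
/// When compiling for macOS, the attribute below expands to `#[ignore = "..."]`,
/// so `cargo test` skips it unless it is run with `-- --ignored`.
#[test]
#[cfg_attr(
    target_os = "macos",
    ignore = "aborts on some macOS versions with Rust 1.70"
)]
fn placeholder_state_test() {
    // A real test would exercise the state service here.
    assert_eq!(2 + 2, 4);
}
```

The attribute form keeps the decision at compile time and shows the test as "ignored" in the test summary, which makes it easier to find and re-enable later. The runtime helper is an alternative for cases where the check depends on something the compiler cannot see, such as `std::env::consts::OS` combined with an OS version lookup.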
Complex Code or Requirements
Usually this happens in the state due to RocksDB or state service shutdown or concurrency bugs.
Testing
Our existing tests seem to reliably detect this bug.