Intermittent test panics in non_finalized_state::tests::prop::finalized_equals_pushed_genesis #6498

Closed
teor2345 opened this issue Apr 13, 2023 · 9 comments · Fixed by #6552
Labels
A-concurrency Area: Async code, needs extra work to make it work properly. A-devops Area: Pipelines, CI/CD and Dockerfiles A-state Area: State / database changes C-bug Category: This is a bug C-testing Category: These are tests I-cost Zebra infrastructure costs I-integration-fail Continuous integration fails, including build and test failures I-panic Zebra panics with an internal error message S-needs-investigation Status: Needs further investigation S-needs-triage Status: A bug report needs triage

Comments

@teor2345
Contributor

Motivation

We're seeing some failures in the merge queue due to an unrelated test bug:

---- service::non_finalized_state::tests::prop::finalized_equals_pushed_genesis stdout ----
The application panicked (crashed).
Message: only called while blocks is populated
Location: zebra-state/src/service/non_finalized_state/chain.rs:921

https://github.com/ZcashFoundation/zebra/actions/runs/4672510901/jobs/8274774308#step:14:3305

https://github.com/ZcashFoundation/zebra/actions/runs/4684761663/jobs/8301221723?pr=6496#step:14:2893

Analysis

This could be a test bug, or it could be revealing a possible panic in production.

We should check the code that's panicking to decide how important this issue is.

@mpguerra mpguerra added this to Zebra Apr 13, 2023
@github-project-automation github-project-automation bot moved this to 🆕 New in Zebra Apr 13, 2023
@teor2345 teor2345 added C-bug Category: This is a bug P-Medium ⚡ I-panic Zebra panics with an internal error message I-integration-fail Continuous integration fails, including build and test failures C-testing Category: These are tests A-state Area: State / database changes A-concurrency Area: Async code, needs extra work to make it work properly. A-devops Area: Pipelines, CI/CD and Dockerfiles S-needs-triage Status: A bug report needs triage labels Apr 13, 2023
@teor2345
Contributor Author

I think we're going to need a full backtrace with debug info to diagnose this issue. In CI, I can't see the calling functions, so I don't know what code is actually causing this test failure.

@teor2345 teor2345 added the S-needs-investigation Status: Needs further investigation label Apr 13, 2023
@teor2345 teor2345 added the I-cost Zebra infrastructure costs label Apr 16, 2023
@teor2345
Contributor Author

Failed PR #6515:
https://github.com/ZcashFoundation/zebra/actions/runs/4696422367/jobs/8327044365?pr=6515#step:3:13194

@mpguerra this is another test that fails regularly. It should be fixed so we can reliably merge PRs.

@teor2345
Contributor Author

teor2345 commented Apr 16, 2023

We might be able to diagnose this issue by changing the panic to a test error, and then committing the proptest seed to our git repository.

If it's a reproducible error that only happens with some test data, it will happen all the time once we commit the seed. If it's a timing issue, then it won't happen reliably with the seed. (But we'll still have learnt something.)
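As a rough illustration of the first step, here is a minimal sketch of turning a panicking accessor into a fallible one, so a test can report an error instead of crashing. The `Chain` struct, `tip_height`, `try_tip_height`, and `check_tip` are hypothetical stand-ins, not the actual code in zebra-state; only the panic message is taken from the failure above. For the second step, proptest by default records failing seeds in a `proptest-regressions` directory next to the test source, and those files can be committed to git.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a chain of blocks keyed by height.
struct Chain {
    blocks: BTreeMap<u32, &'static str>,
}

impl Chain {
    /// Panicking accessor, in the style of the failing code:
    /// it aborts the whole test when `blocks` is empty.
    #[allow(dead_code)]
    fn tip_height(&self) -> u32 {
        *self
            .blocks
            .keys()
            .next_back()
            .expect("only called while blocks is populated")
    }

    /// Fallible variant: returns an error for an empty chain,
    /// so the caller decides how to report it.
    fn try_tip_height(&self) -> Result<u32, String> {
        self.blocks
            .keys()
            .next_back()
            .copied()
            .ok_or_else(|| "only called while blocks is populated".to_string())
    }
}

/// Test-style check that propagates a failure instead of panicking.
fn check_tip(chain: &Chain, expected: u32) -> Result<(), String> {
    let tip = chain.try_tip_height()?;
    if tip == expected {
        Ok(())
    } else {
        Err(format!("tip {tip} != expected {expected}"))
    }
}
```

In a property test, the `Err` value would be surfaced as a test failure together with the generated input, which is what makes the failing case visible.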

@mpguerra
Contributor

Let's devote some cycles this sprint to figuring out what's going on here.

@teor2345
Contributor Author

It's possible that the test doesn't wait long enough for the blocks to commit to the state. Or the blocks are actually invalid, so they get rejected, but we're not checking for block commit errors.

@arya2
Contributor

arya2 commented Apr 20, 2023

Message: Test failed: only called while blocks is populated; minimal failing input: (chain, end_count, network, empty_tree) = (alloc::vec::Vec<zebra_state::request::PreparedBlock><zebra_state::request::PreparedBlock>, len=104, 103, Mainnet, HistoryTree(None))

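This looks like a test bug: there won't be any blocks in the partial chain if end_count > chain.len() - 2. The minimal failing input above has len=104 and end_count=103, which meets that condition (103 > 102).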
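The boundary condition can be sketched as a pair of small helpers. This is only an illustration under stated assumptions: `partial_chain_is_empty` and `clamp_end_count` are hypothetical names, and the clamping guard is one possible way to keep the generated input in range, not necessarily the change made in the fixing PR.

```rust
/// Returns true when no blocks would be left in the partial chain,
/// i.e. when `end_count > chain_len - 2`.
/// Written as `end_count + 2 > chain_len` to avoid unsigned underflow
/// for very short chains.
fn partial_chain_is_empty(chain_len: usize, end_count: usize) -> bool {
    end_count.saturating_add(2) > chain_len
}

/// Hypothetical guard: clamp `end_count` to the largest value that
/// does not meet the empty-partial-chain condition.
/// Returns `None` when the chain has fewer than 2 blocks.
fn clamp_end_count(chain_len: usize, end_count: usize) -> Option<usize> {
    let max_end_count = chain_len.checked_sub(2)?;
    Some(end_count.min(max_end_count))
}
```

With the reported input, `partial_chain_is_empty(104, 103)` is true, and `clamp_end_count(104, 103)` yields `Some(102)`, the largest value that avoids the condition.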

@mpguerra
Contributor

@arya2 can you please add a size for this issue?

3 participants