release-22.1: roachtest: wait for ranges to replicate before filling disk #78538
fe64189 to c22a047
Thanks for opening a backport. Please check the backport criteria before merging:
If some of the basic criteria cannot be satisfied, ensure that the exceptional criteria are satisfied within.
Add a brief release justification to the body of your PR to justify this backport. Some other things to consider:
Currently, the `disk-full` roachtest creates a cluster and immediately places a ballast file on one node, which causes it to crash. If this node is the only replica for a range containing a system table, then when the node crashes due to a full disk, certain system queries may not complete. This results in the test being unable to make forward progress, as the one dead node prevents a system query from completing, and this query prevents the node from being restarted.

Wait for all ranges to have at least two replicas before placing the ballast file on the one node.

Touches #78337, #78270.

Release note: None.
Improve on #78456 by waiting for 3x replication, rather than 2x.

Release note: None.
c22a047 to 554629c
@tbg - included the additional commit.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal, @nicktrav, and @tbg)
TFTR!
Backport 1/1 commits from #78456 on behalf of @nicktrav.
/cc @cockroachdb/release
Currently, the `disk-full` roachtest creates a cluster and immediately
places a ballast file on one node, which causes it to crash. If this
node is the only replica for a range containing a system table, then when
the node crashes due to a full disk, certain system queries may not complete.
This results in the test being unable to make forward progress, as the
one dead node prevents a system query from completing, and this query
prevents the node from being restarted.
Wait for all ranges to have at least two replicas before placing the
ballast file on the one node.
Closes #78337.
Release note: None.
Release justification: Test only.