roachtest: wait for ranges to replicate before filling disk
Currently, the `disk-full` roachtest creates a cluster and immediately
places a ballast file on one node, which causes it to crash. If this
node holds the only replica of a range containing a system table,
certain system queries may not complete once the node has crashed due to
the full disk. The test is then unable to make forward progress: the one
dead node prevents a system query from completing, and this query
prevents the node from being restarted.

Wait for all ranges to have at least two replicas before placing the
ballast file on the one node.

Touches cockroachdb#78337, cockroachdb#78270.

Release note: None.
nicktrav committed Mar 24, 2022
1 parent 88f84f5 commit 9492efd
Showing 1 changed file with 24 additions and 0 deletions.
24 changes: 24 additions & 0 deletions pkg/cmd/roachtest/tests/disk_full.go
@@ -38,6 +38,30 @@ func registerDiskFull(r registry.Registry) {
c.Put(ctx, t.Cockroach(), "./cockroach", c.Range(1, c.Spec().NodeCount))
c.Start(ctx, t.L(), option.DefaultStartOpts(), install.MakeClusterSettings(), c.Range(1, nodes))

// Node 1 will soon be killed, when the ballast file fills up its disk. To
// ensure that the ranges containing system tables are available on other
// nodes, we wait here for at least two replicas of each range. Without
// this, it's possible that we end up deadlocked on a system query that
// requires a range on node 1, but node 1 will not restart until the query
// completes.
t.Status("awaiting replication")
{
	db := c.Conn(ctx, t.L(), 1)
	for {
		var fullReplicated bool
		if err := db.QueryRow(
			"SELECT min(array_length(replicas, 1)) >= 2 FROM crdb_internal.ranges",
		).Scan(&fullReplicated); err != nil {
			t.Fatal(err)
		}
		if fullReplicated {
			break
		}
		time.Sleep(time.Second)
	}
	_ = db.Close()
}

t.Status("running workload")
m := c.NewMonitor(ctx, c.Range(1, nodes))
m.Go(func(ctx context.Context) error {