roachtest: expand WAL failover test assertions #121367

Merged 1 commit on Apr 1, 2024

@jbowens jbowens commented Mar 29, 2024

Update the WAL failover disk stall roachtest to assert that the stalled store does fail over to the secondary and that SQL tail latencies remain bounded.

Epic: none
Release note: none

@jbowens jbowens added the backport-24.1.x Flags PRs that need to be backported to 24.1. label Mar 29, 2024
@jbowens jbowens requested a review from a team March 29, 2024 16:30
@jbowens jbowens requested a review from a team as a code owner March 29, 2024 16:30
@jbowens jbowens requested review from sumeerbhola, DarrylWong and renatolabs and removed request for a team March 29, 2024 16:30

@sumeerbhola sumeerbhola left a comment

:lgtm:

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @DarrylWong, @jbowens, and @renatolabs)


pkg/cmd/roachtest/tests/disk_stall.go line 184 at r1 (raw file):

	if durInFailover < 60*time.Second {
		t.Errorf("expected s1 to spend at least 60s writing to secondary, but spent %s", durInFailover)
	}

obvious question: do both assertions fail if failover is not configured?

@jbowens jbowens left a comment

TFTR!

bors r=sumeerbhola

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DarrylWong, @renatolabs, and @sumeerbhola)


pkg/cmd/roachtest/tests/disk_stall.go line 184 at r1 (raw file):

Previously, sumeerbhola wrote…

obvious question: do both assertions fail if failover is not configured?

Yeah, they do:

19:45:05 test_runner.go:1055: [w0] --- FAIL: disk-stalled/wal-failover/among-stores (3976.90s)
(disk_stall.go:174).runDiskStalledWALFailover: unexpectedly high p99.99 latency 1.013578081s at 2024-04-01T18:49:00Z
(disk_stall.go:174).runDiskStalledWALFailover: unexpectedly high p99.99 latency 1.037223403s at 2024-04-01T18:59:00Z
(disk_stall.go:174).runDiskStalledWALFailover: unexpectedly high p99.99 latency 1.012550625s at 2024-04-01T19:09:00Z
(disk_stall.go:174).runDiskStalledWALFailover: unexpectedly high p99.99 latency 1.013074966s at 2024-04-01T19:19:00Z
(disk_stall.go:174).runDiskStalledWALFailover: unexpectedly high p99.99 latency 1.014138031s at 2024-04-01T19:29:00Z
(disk_stall.go:183).runDiskStalledWALFailover: expected s1 to spend at least 60s writing to secondary, but spent 0s
(cluster.go:2344).Run: context canceled
test artifacts and logs in: artifacts/disk-stalled/wal-failover/among-stores/run_1
--- FAIL: disk-stalled/wal-failover/among-stores (3976.90s)
(disk_stall.go:174).runDiskStalledWALFailover: unexpectedly high p99.99 latency 1.013578081s at 2024-04-01T18:49:00Z
(disk_stall.go:174).runDiskStalledWALFailover: unexpectedly high p99.99 latency 1.037223403s at 2024-04-01T18:59:00Z
(disk_stall.go:174).runDiskStalledWALFailover: unexpectedly high p99.99 latency 1.012550625s at 2024-04-01T19:09:00Z
(disk_stall.go:174).runDiskStalledWALFailover: unexpectedly high p99.99 latency 1.013074966s at 2024-04-01T19:19:00Z
(disk_stall.go:174).runDiskStalledWALFailover: unexpectedly high p99.99 latency 1.014138031s at 2024-04-01T19:29:00Z
(disk_stall.go:183).runDiskStalledWALFailover: expected s1 to spend at least 60s writing to secondary, but spent 0s

@craig craig bot merged commit c43f54c into cockroachdb:master Apr 1, 2024
21 of 22 checks passed