Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest: don't reuse clusters that call dmsetup #108115

Merged
merged 1 commit into from
Aug 7, 2023

Conversation

andrewbaptist
Copy link
Collaborator

Certain tests need to modify the blockdevice and they are prone to failures during setup that the device is still busy. Ideally we would figure out what is still holding onto the dish handle, but it is safer to simply not reuse clusters that perform this by adding spec.ReuseNone()

Fixes: #107865
Epic: none

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@andrewbaptist andrewbaptist marked this pull request as ready for review August 3, 2023 18:41
@andrewbaptist andrewbaptist requested a review from a team as a code owner August 3, 2023 18:41
@andrewbaptist andrewbaptist requested review from herkolategan and smg260 and removed request for a team August 3, 2023 18:41
Copy link
Member

@srosenberg srosenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andrewbaptist, @erikgrinaker, @herkolategan, and @smg260)


pkg/cmd/roachtest/tests/failover.go line 83 at r1 (raw file):

				Benchmark:           true,
				Timeout:             60 * time.Minute,
				Cluster:             r.MakeClusterSpec(10, spec.CPU(2), spec.PreferLocalSSD(false), spec.ReuseNone()), // uses disk stalls

Nit: remove everything but the cluster size, since they are applied below; otherwise, it's redundant.

Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

mount/unmount

nit: unmount/mount works just fine, it's dmsetup that fails. The filesystem isn't in use, the device is.

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andrewbaptist, @herkolategan, @smg260, and @srosenberg)


pkg/cmd/roachtest/tests/failover.go line 83 at r1 (raw file):

Previously, srosenberg (Stan Rosenberg) wrote…

Nit: remove everything but the cluster size, since they are applied below; otherwise, it's redundant.

This is a different test than the ones below that use clusterOpts. This one doesn't need conditional options.


pkg/cmd/roachtest/tests/failover.go line 133 at r1 (raw file):

				// SSDs. See #97968.
				clusterOpts = append(clusterOpts, spec.PreferLocalSSD(false))
				clusterOpts = append(clusterOpts, spec.ReuseNone())

nit: let's add a comment for why we set this.

Certain tests need to modify the blockdevice and they are prone to
failures during setup that the device is still busy. Ideally we would
figure out what is still holding onto the dish handle, but it is safer
to simply not reuse clusters that perform this by adding
`spec.ReuseNone()`

Fixes: cockroachdb#107865
Epic: none

Release note: None
@andrewbaptist andrewbaptist force-pushed the 2023-08-03-noreuse-disk branch from ed7f423 to 704e6e9 Compare August 4, 2023 18:39
Copy link
Collaborator Author

@andrewbaptist andrewbaptist left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @herkolategan, @smg260, and @srosenberg)


pkg/cmd/roachtest/tests/failover.go line 83 at r1 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

This is a different test than the ones below that use clusterOpts. This one doesn't need conditional options.

Yep - as Erik mentioned, this test also needs it.


pkg/cmd/roachtest/tests/failover.go line 133 at r1 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

nit: let's add a comment for why we set this.

Added

@andrewbaptist andrewbaptist changed the title roachtest: don't reuse clusters that mount/unmount roachtest: don't reuse clusters that call dmsetup Aug 4, 2023
@andrewbaptist
Copy link
Collaborator Author

bors r=erikgrinaker, srosenberg

@craig
Copy link
Contributor

craig bot commented Aug 4, 2023

Build failed (retrying...):

craig bot pushed a commit that referenced this pull request Aug 4, 2023
108115: roachtest: don't reuse clusters that call dmsetup r=erikgrinaker,srosenberg a=andrewbaptist

Certain tests need to modify the blockdevice and they are prone to failures during setup that the device is still busy. Ideally we would figure out what is still holding onto the dish handle, but it is safer to simply not reuse clusters that perform this by adding `spec.ReuseNone()`

Fixes: #107865
Epic: none

Release note: None

108165: sql: fix insight integration test for contention r=koorosh a=koorosh

This change lowers the latency threshold for which statement is considered slow to make sure it detects contention.
Previous value was set to 100ms which is default. New value is set to 30ms to be the same as overridden
value in other tests.

Release note: None

Resolves: #108020

108192: ccl/sqlproxyccl: deflake TestConnectionMigration r=darinpp a=jaylim-crl

Fixes #106885.

This test flake seems extremely rare, and it's unclear why it occurred in the
first place. The past 1000 runs (all of what TC has) have been successful.
Regardless, this commit attempts at deflaking TestConnectionMigration. Given
that some subtests transfer connections through `transferConnWithRetries`, it
is possible that the transfer was retried, causing the metric to be
incremented.

Release note: None

Epic: none

Co-authored-by: Andrew Baptist <[email protected]>
Co-authored-by: Andrii Vorobiov <[email protected]>
Co-authored-by: Jay <[email protected]>
@craig
Copy link
Contributor

craig bot commented Aug 4, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Aug 4, 2023

Build failed:

@erikgrinaker
Copy link
Contributor

bors retry

@craig
Copy link
Contributor

craig bot commented Aug 7, 2023

Build succeeded:

@craig craig bot merged commit 3e7991d into cockroachdb:master Aug 7, 2023
@erikgrinaker
Copy link
Contributor

blathers backport 23.1

@blathers-crl
Copy link

blathers-crl bot commented Aug 7, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 704e6e9 to blathers/backport-release-23.1-108115: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 23.1 failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

roachtest: failover/chaos/read-only failed
4 participants