Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest: use persistent disks for disk-stall tests #99747

Merged
merged 1 commit into from
Mar 28, 2023

Conversation

nicktrav
Copy link
Collaborator

@nicktrav nicktrav commented Mar 28, 2023

Currently, the disk-stall tests use local SSDs. When run on GCE VMs, a higher test flake rate is observed due to known issues with fsync latency for local SSDs.

Switch the test to use persistent disks instead.

Touches: #99372.

Release note: None.

Epic: CRDB-20293

Currently, the `disk-stall` tests use local SSDs. When run on GCE VMs, a
higher test flake rate is observed due to known issues with fsync
latency for local SSDs.

Switch the test to use persistent disks instead.

Touches: cockroachdb#99372.

Release note: None.
@nicktrav nicktrav requested review from jbowens and a team March 28, 2023 00:43
@nicktrav nicktrav requested a review from a team as a code owner March 28, 2023 00:43
@nicktrav nicktrav requested review from herkolategan and renatolabs and removed request for a team March 28, 2023 00:43
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

cc #97968

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @herkolategan and @renatolabs)

@nicktrav
Copy link
Collaborator Author

TFTR!

bors r=jbowens

@nicktrav nicktrav added backport-22.1.x backport-23.1.x Flags PRs that need to be backported to 23.1 labels Mar 28, 2023
@craig
Copy link
Contributor

craig bot commented Mar 28, 2023

Build succeeded:

@craig craig bot merged commit 0745cd4 into cockroachdb:master Mar 28, 2023
nicktrav added a commit to nicktrav/cockroach that referenced this pull request Mar 28, 2023
Backups were inadvertently disabled in cockroachdb#99747 (to simplify testing). As
there is no good reason to run without them, re-enable the backups
during the test.

Release note: None.
@nicktrav nicktrav deleted the nickt.disk-stall-pds branch March 29, 2023 18:16
nicktrav added a commit to nicktrav/cockroach that referenced this pull request Mar 29, 2023
The disk-stalled roachtests were updated in cockroachdb#99747 to use local SSDs.
This change broke the `failover/*/disk-stall` tests, which look for
`/dev/sdb` on GCE (the used for GCE Persistent Disks), but the tests
still create clusters with local SSDs (the roachtest default).

Fix cockroachdb#99902.
Fix cockroachdb#99926.
Fix cockroachdb#99930.

Touches cockroachdb#97968.

Release note: None.
nicktrav added a commit to nicktrav/cockroach that referenced this pull request Mar 29, 2023
The disk-stalled roachtests were updated in cockroachdb#99747 to use PDs in favor
of local SSDs. This change broke the `failover/*/disk-stall` tests,
which look for `/dev/sdb` on GCE (the used for GCE Persistent Disks),
but the tests still create clusters with local SSDs (the roachtest
default).

Fix cockroachdb#99902.
Fix cockroachdb#99926.
Fix cockroachdb#99930.

Touches cockroachdb#97968.

Release note: None.
nicktrav added a commit to nicktrav/cockroach that referenced this pull request Mar 29, 2023
The disk-stalled roachtests were updated in cockroachdb#99747 to use PDs in favor
of local SSDs. This change broke the `failover/*/disk-stall` tests,
which look for `/dev/sdb` on GCE (the device name used for GCE
Persistent Disks), but the tests still create clusters with local SSDs
(the roachtest default).

Fix cockroachdb#99902.
Fix cockroachdb#99926.
Fix cockroachdb#99930.

Touches cockroachdb#97968.

Release note: None.
craig bot pushed a commit that referenced this pull request Mar 30, 2023
99871: roachtest: add back backups to disk-stalled tests r=jbowens a=nicktrav

Backups were inadvertently disabled in #99747 (to simplify testing). As there is no good reason to run without them, re-enable the backups during the test.

Release note: None.

Epic: CRDB-20293

99888: sql: mark index as GCed if table has been GCed in legacy gc path r=chengxiong-ruan a=chengxiong-ruan

Previously, if a table is GCed before an index is GCed by a TRUNCATE TABLE gc job, the TRUNCATE TABLE gc job can be stuck in running status because the table descriptor is missing. This is problematic because these jobs will never succeed and doing nothing. This commit marks the indexes as GCed if the descriptor cannot be found assuming that the table has been GCed. Also, table GC should have GCed all the indexes.

Note that this only affect the legacy GC path.

Epic: None

Release note (sql change): This commit fixes a bug where TRUNCATE TABLE gc job can be stuck in running status if table descriptor has been GCed before the truncated indexes are GCed. The bug was only a problem before `DelRange` is not available.

Co-authored-by: Nick Travers <[email protected]>
Co-authored-by: Chengxiong Ruan <[email protected]>
blathers-crl bot pushed a commit that referenced this pull request Mar 30, 2023
Backups were inadvertently disabled in #99747 (to simplify testing). As
there is no good reason to run without them, re-enable the backups
during the test.

Release note: None.
craig bot pushed a commit that referenced this pull request Mar 30, 2023
98899: feat: allow starting docker container via env variable r=rickystewart a=btkostner

Fixes #87043 by allowing you to specify args via the `COCKROACH_ARGS` env value instead of a command. This is required to be able to use the official cockroach image with GitHub actions via a service. More details in the issue.

Note, do to weirdness with merging commands and env values, I decided that setting the `COCKROACH_ARGS` would ignore any command given. This should reduce issues with people trying to use both ways of specifying args and instead force them to pick one.

Release note (general change): Allow setting docker command args via the `COCKROACH_ARGS` environment variable.

99607: sql: block DROP TENANT based on a session var r=stevendanna a=knz

Fixes #97972.
Epic: CRDB-23559

In clusters where we will promote tenant management operations, we would like to ensure there is one extra step needed for administrators to drop a tenant (and thus irremedially lose data). Given that `sql_safe_updates` is not set automatically when users open their SQL session using their own client, we need another mechanism.

This change introduces the new (hidden) session var, `disable_drop_tenant`. When set, tenant deletion fails with the following error message:

```
[email protected]:26257/movr> drop tenant foo;
ERROR: rejected (via sql_safe_updates or disable_drop_tenant): DROP TENANT causes irreversible data loss
SQLSTATE: 01000
```

(The session var `sql_safe_updates` is _also_ included as a blocker in the mechanism so that folk using `cockroach sql` get double protection).

The default value of this session var is `false` in single-tenant clusters, for compatibility with CC Serverless. It will be set to `true` via a config profile (#98466) when suitable.

Release note: None

99690: ui: drop index with space r=maryliag a=maryliag

Previously, if the index had a space on its name,
it would fail to drop.
This commit adds quotes so it can be executed.

Fixes #97988

Schema Insights
https://www.loom.com/share/04363b7f83484b5da19c760eb8d0de21

Table Details page
https://www.loom.com/share/1519b897a14440ddb066fb2ab03feb2d

Release note (bug fix): Index recommendation to DROP an index that have a space on its name can now be properly executed.

99750: sql: remove no longer used channel in createStatsNode r=yuzefovich a=yuzefovich

This hasn't been used as of fe6377c. Also mark `create_stats.go` as owned by SQL Queries.

Epic: None

Release note: None

99962: ui: add checks for values r=maryliag a=maryliag

Fixes #99655
Fixes #99538
Fixes #99539

Add checks to usages that could cause
`Cannot read properties of undefined`.

Release note: None

99963: roachtest: use local SSDs for disk-stall failover tests r=andrewbaptist a=nicktrav

The disk-stalled roachtests were updated in #99747 to use local SSDs. This change broke the `failover/*/disk-stall` tests, which look for `/dev/sdb` on GCE (the used for GCE Persistent Disks), but the tests still create clusters with local SSDs (the roachtest default).

Fix #99902.
Fix #99926.
Fix #99930.

Touches #97968.

Release note: None.

Co-authored-by: Blake Kostner <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: maryliag <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Nick Travers <[email protected]>
blathers-crl bot pushed a commit that referenced this pull request Mar 30, 2023
The disk-stalled roachtests were updated in #99747 to use PDs in favor
of local SSDs. This change broke the `failover/*/disk-stall` tests,
which look for `/dev/sdb` on GCE (the device name used for GCE
Persistent Disks), but the tests still create clusters with local SSDs
(the roachtest default).

Fix #99902.
Fix #99926.
Fix #99930.

Touches #97968.

Release note: None.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.1.x Flags PRs that need to be backported to 23.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants