roach{prod,test}: add first-class support for disk snapshots #103757
Conversation
Pure code movement. We'll make use of it outside this file in subsequent commits. Release note: None
These tests have been stable for a few months now. Reduce to a weekly cadence. Release note: None
Force-pushed from 0a8ba7a to f13b01d
Nice! Doubtlessly will result in significant time (and cost) savings. Just minor observations/questions from me.
Reviewed 2 of 21 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @srosenberg, and @sumeerbhola)
pkg/cmd/roachtest/cluster/cluster_interface.go
line 144 at r1 (raw file):
```go
// prefix. These snapshots can later be retrieved, deleted or applied to
// already instantiated clusters.
CreateSnapshot(ctx context.Context, snapshotPrefix string) error
```
Seems like also returning a vm.VolumeSnapshot would make sense here.
pkg/cmd/roachtest/tests/admission_control_index_backfill.go
line 51 at r1 (raw file):
```go
// TODO(irfansharif): Search by taking in the other parts of the
// snapshot fingerprint, i.e. the node count, the version, etc.
Name: snapshotPrefix,
```
Nit: Name would be clearer as NamePrefix.
pkg/cmd/roachtest/tests/admission_control_index_backfill.go
line 59 at r1 (raw file):
```go
t.L().Printf("no existing snapshots found for %s (%s), doing pre-work", t.Name(), snapshotPrefix)
// TODO(irfansharif): Add validation that we're some released
// version, probably the predecessor one. Also ensure that any
```
This is where we'd do the "fixture" work to get the disk in a state to snapshot, right?
Doesn't it make sense to only c.ApplySnapshots(..) when not in this branch (as opposed to always, below)?
pkg/cmd/roachtest/tests/prune_dangling_snapshots_and_disks.go
line 28 at r1 (raw file):
```go
// (GCE) let you store volume snapshots in buckets with a pre-configured TTL. So
// we use this nightly roachtest as a poor man's cron job.
func registerPruneDanglingSnapshotsAndDisks(r registry.Registry) {
```
A nice idea that ultimately should belong elsewhere as part of the framework - perhaps in the test_runner, during clean up?
pkg/cmd/roachtest/tests/prune_dangling_snapshots_and_disks.go
line 51 at r1 (raw file):
```go
for _, snapshot := range snapshots {
	if err := c.DeleteSnapshots(ctx, snapshot); err != nil {
```
Why loop if the 2nd arg is variadic? A slice of err could be returned.
drive-by review -- admission_control*.go files look good.
Reviewed 21 of 21 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @srosenberg)
pkg/cmd/roachtest/cluster/cluster_interface.go
line 143 at r1 (raw file):
```go
// CreateSnapshot creates volume snapshots of the cluster using the given
// prefix. These snapshots can later be retrieved, deleted or applied to
// already instantiated clusters.
```
Maybe worth documenting the noop case for the local setup. The assumption this interface makes is that a roachtest that needed to do CreateSnapshot will then proceed to use the already-populated disk, and not rely on the rest of these snapshot methods actually doing something.
pkg/cmd/roachtest/cluster/cluster_interface.go
line 154 at r1 (raw file):
```go
// concerned - all already-attached volumes are detached and deleted to make
// room for new snapshot-derived volumes. The new volumes are created using
// the same specs (size, disk type, etc.) as the original cluster.
```
is this assumed to be one volume per node? Some of the nodes may not be running CockroachDB -- does that matter?
I see the implementation assumes equal nodes in the cluster. Worth adding to the code comment.
Force-pushed from b1c58e7 to 5dbd1bf
TFTR!
@smg260, PTAL -- there are a few more non-admission_control*.go changes (around the tpce.go harness, and exposing top-level subcommands in roachprod) in addition to addressing your comments below.
@sumeerbhola, @bananabrick I fleshed out admission_control_index_backfill.go into a fully fledged test. You could take another look; it's entirely optional for both y'all.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @smg260, @srosenberg, and @sumeerbhola)
pkg/cmd/roachtest/cluster/cluster_interface.go
line 143 at r1 (raw file):
Previously, sumeerbhola wrote…
Maybe worth documenting the noop case for the local setup. The assumption this interface makes is that a roachtest that needed to do CreateSnapshot will then proceed to use the already-populated disk, and not rely on the rest of these snapshot methods actually doing something.
Done.
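For illustration, the local-mode caveat might read something like the following on the interface (a paraphrase of the discussion above, not the committed comment text):

```go
// CreateSnapshot creates volume snapshots of the cluster using the given
// prefix. These snapshots can later be retrieved, deleted or applied to
// already instantiated clusters.
//
// NB (paraphrased sketch): on local clusters these methods can be no-ops;
// a test that calls CreateSnapshot is expected to keep using the
// already-populated local disk rather than rely on the remaining snapshot
// methods doing anything.
CreateSnapshot(ctx context.Context, snapshotPrefix string) error
```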
pkg/cmd/roachtest/cluster/cluster_interface.go
line 144 at r1 (raw file):
Previously, smg260 (Miral Gadani) wrote…
Seems like also returning a vm.VolumeSnapshot would make sense here.
Done, great suggestion!
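For reference, a minimal sketch of what the revised signature could look like with that change, assuming the created snapshots are returned alongside the error (not quoted from the actual diff):

```go
type Cluster interface {
	// CreateSnapshot creates volume snapshots of the cluster using the
	// given prefix and returns the created snapshots, so callers can hand
	// them to ApplySnapshots or DeleteSnapshots later without re-listing.
	// (Sketch of the suggested change; the merged signature may differ.)
	CreateSnapshot(ctx context.Context, snapshotPrefix string) ([]vm.VolumeSnapshot, error)
}
```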
pkg/cmd/roachtest/cluster/cluster_interface.go
line 154 at r1 (raw file):
Previously, sumeerbhola wrote…
is this assumed to be one volume per node? Some of the nodes may not be running CockroachDB -- does that matter?
I see the implementation assumes equal nodes in the cluster. Worth adding to the code comment.
It's assuming one volume per node but that can be changed, so I added a TODO. Ditto for different kinds of volumes. It doesn't matter whether CRDB is running or not - it falls back to using "unknown" as the fingerprint version. But that's not desirable -- I'll leave it to test authors to fiddle with it and improve safeguards. Also, CRDB can't be running when snapshotting, else we'd risk taking inconsistent disk snapshots cluster-wide and cause badness. So the CRDB version is retrieved by looking at the binary directly.
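Judging from the example snapshot name in the commit message (ac-index-backfill-0001-vunknown-1-n2-standard-8), the fingerprint appears to be folded into the snapshot name itself; a hypothetical sketch of that composition (the actual code may assemble it differently):

```go
// Hypothetical sketch only; the real naming code may differ.
// e.g. "ac-index-backfill-0001-vunknown-1-n2-standard-8": prefix, cluster
// suffix, crdb version ("unknown" when it can't be read from the binary),
// node index, and machine type.
func snapshotName(prefix string, clusterSuffix, nodeIdx int, crdbVersion, machineType string) string {
	return fmt.Sprintf("%s-%04d-v%s-%d-%s", prefix, clusterSuffix, crdbVersion, nodeIdx, machineType)
}
```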
pkg/cmd/roachtest/tests/admission_control_index_backfill.go
line 51 at r1 (raw file):
Previously, smg260 (Miral Gadani) wrote…
Nit: Name would be clearer as NamePrefix.
Done.
pkg/cmd/roachtest/tests/admission_control_index_backfill.go
line 59 at r1 (raw file):
Previously, smg260 (Miral Gadani) wrote…
This is where we'd do the "fixture" work to get the disk in a state to snapshot, right?
Doesn't it make sense to only c.ApplySnapshots(..) when not in this branch (as opposed to always, below)?
Done. I'd written it that way originally to always exercise the apply-snapshot slow path, to increase robustness, but it just looks awkward.
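To make the change concrete, a minimal sketch of what the resulting flow might look like (method and field names are taken from the interface quoted elsewhere in this PR; skipping ApplySnapshots on the freshly-created path is an assumption based on this thread, not the exact merged code):

```go
snapshots, err := c.ListSnapshots(ctx, vm.VolumeSnapshotListOpts{
	NamePrefix: snapshotPrefix,
})
if err != nil {
	t.Fatal(err)
}
if len(snapshots) == 0 {
	t.L().Printf("no existing snapshots found for %s (%s), doing pre-work", t.Name(), snapshotPrefix)
	// Long-running fixture generation (importing the dataset, etc.) goes
	// here, with CockroachDB stopped before snapshotting so the disk
	// snapshots are consistent.
	if err := c.CreateSnapshot(ctx, snapshotPrefix); err != nil {
		t.Fatal(err)
	}
} else {
	// Restore the already-populated disks from the pre-existing snapshots.
	if err := c.ApplySnapshots(ctx, snapshots); err != nil {
		t.Fatal(err)
	}
}
```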
pkg/cmd/roachtest/tests/prune_dangling_snapshots_and_disks.go
line 28 at r1 (raw file):
Previously, smg260 (Miral Gadani) wrote…
A nice idea that ultimately should belong elsewhere as part of the framework - perhaps in the test_runner, during clean up?
Unless you insist strongly otherwise, I feel like this is better. These things add up to some non-trivial latency and the test runner clean up is invoked too frequently for this checking. Also it's a bit nice to be able to run this manually through just:
roachtest run prune-dangling --cockroach ./cockroach --cluster irfansharif-prune
And get specific logging for it.
pkg/cmd/roachtest/tests/prune_dangling_snapshots_and_disks.go
line 51 at r1 (raw file):
Previously, smg260 (Miral Gadani) wrote…
Why loop if the 2nd arg is variadic? A slice of
err
could be returned.
It's just to get per-pruning logging below, which I've found useful. A slice of errors is unconventional and difficult to work with - you've got to figure out which indexes have succeeded and which haven't.
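For illustration, the loop presumably looks roughly like the following sketch (the log message mirrors the prune-dangling output shown in the commit message; the exact field used for the snapshot name is an assumption):

```go
for _, snapshot := range snapshots {
	// Deleting one snapshot per iteration (rather than passing the whole
	// slice to the variadic call) lets each successful prune get its own
	// log line.
	if err := c.DeleteSnapshots(ctx, snapshot); err != nil {
		t.Fatal(err)
	}
	t.L().Printf("pruned old snapshot %s", snapshot.Name) // Name field assumed
}
```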
Force-pushed from 5dbd1bf to d6074df
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @srosenberg, and @sumeerbhola)
pkg/cmd/roachtest/tests/prune_dangling_snapshots_and_disks.go
line 28 at r1 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Unless you insist strongly otherwise, I feel like this is better. These things add up to some non-trivial latency and the test runner clean up is invoked too frequently for this checking. Also it's a bit nice to be able to run this manually through just:
roachtest run prune-dangling --cockroach ./cockroach --cluster irfansharif-prune
And get specific logging for it.
Fair.
Not a strong opinion; I was alluding to a clean up phase at the end of running all tests and it was more of a general comment of how we could improve roachtest - definitely not for this PR.
Force-pushed from d6074df to 5ddb79b
Spoke to @smg260 offline -- running the nightly roachtest suite manually showed that our TC agents need updated GCE permissions to be able to run these gcloud snapshot/volume related operations. So checking it in as skipped for now. Internal thread: https://cockroachlabs.atlassian.net/browse/DEVINF-768.
Long-lived disk snapshots can drastically reduce testing time for scale tests. Tests, whether run by hand or through CI, need only run the long-running fixture-generating code (importing some dataset, generating it organically through workload, etc.) when snapshot fingerprints change; the fingerprints incorporate the major crdb version that generated them.

Here's an example run that freshly generates disk snapshots:

```
=== RUN   admission-control/index-backfill
no existing snapshots found for admission-control/index-backfill (ac-index-backfill), doing pre-work
created volume snapshot ac-index-backfill-0001-vunknown-1-n2-standard-8 for volume irfansharif-snapshot-0001-1 on irfansharif-snapshot-0001-1/n1
using 1 newly created snapshot(s) with prefix "ac-index-backfill"
detached and deleted volume irfansharif-snapshot-0001-1 from irfansharif-snapshot-0001
created volume irfansharif-snapshot-0001-1
attached volume irfansharif-snapshot-0001-1 to irfansharif-snapshot-0001
mounted irfansharif-snapshot-0001-1 to irfansharif-snapshot-0001
--- PASS: admission-control/index-backfill (79.14s)
```

Here's a subsequent run that makes use of the aforementioned disk snapshots:

```
=== RUN   admission-control/index-backfill
using 1 pre-existing snapshot(s) with prefix "ac-index-backfill"
detached and deleted volume irfansharif-snapshot-0001-1 from irfansharif-snapshot-0001
created volume irfansharif-snapshot-0001-1
attached volume irfansharif-snapshot-0001-1 to irfansharif-snapshot-0001
mounted irfansharif-snapshot-0001-1 to irfansharif-snapshot-0001
--- PASS: admission-control/index-backfill (43.47s)
```

We add the following APIs to the roachtest.Cluster interface, for tests to interact with disk snapshots. admission-control/index-backfill is an example test making use of these APIs.

```go
type Cluster interface {
	// ...

	// CreateSnapshot creates volume snapshots of the cluster using
	// the given prefix. These snapshots can later be retrieved,
	// deleted or applied to already instantiated clusters.
	CreateSnapshot(ctx context.Context, snapshotPrefix string) error

	// ListSnapshots lists the individual volume snapshots that
	// satisfy the search criteria.
	ListSnapshots(
		ctx context.Context, vslo vm.VolumeSnapshotListOpts,
	) ([]vm.VolumeSnapshot, error)

	// DeleteSnapshots permanently deletes the given snapshots.
	DeleteSnapshots(
		ctx context.Context, snapshots ...vm.VolumeSnapshot,
	) error

	// ApplySnapshots applies the given volume snapshots to the
	// underlying cluster. This is a destructive operation as far as
	// existing state is concerned - all already-attached volumes are
	// detached and deleted to make room for new snapshot-derived
	// volumes. The new volumes are created using the same specs
	// (size, disk type, etc.) as the original cluster.
	ApplySnapshots(
		ctx context.Context, snapshots []vm.VolumeSnapshot,
	) error
}
```

These Cluster APIs are in turn powered by the following additions to the vm.Provider interface, implemented by each cloud provider. GCE is the fully spec-ed out one for now.

```go
type Provider interface {
	// ...

	// CreateVolume creates a new volume using the given options.
	CreateVolume(l *logger.Logger, vco VolumeCreateOpts) (Volume, error)

	// ListVolumes lists all volumes already attached to the given VM.
	ListVolumes(l *logger.Logger, vm *VM) ([]Volume, error)

	// DeleteVolume detaches and deletes the given volume from the
	// given VM.
	DeleteVolume(l *logger.Logger, volume Volume, vm *VM) error

	// AttachVolume attaches the given volume to the given VM.
	AttachVolume(l *logger.Logger, volume Volume, vm *VM) (string, error)

	// CreateVolumeSnapshot creates a snapshot of the given volume,
	// using the given options.
	CreateVolumeSnapshot(
		l *logger.Logger, volume Volume, vsco VolumeSnapshotCreateOpts,
	) (VolumeSnapshot, error)

	// ListVolumeSnapshots lists the individual volume snapshots that
	// satisfy the search criteria.
	ListVolumeSnapshots(
		l *logger.Logger, vslo VolumeSnapshotListOpts,
	) ([]VolumeSnapshot, error)

	// DeleteVolumeSnapshot permanently deletes the given snapshot.
	DeleteVolumeSnapshot(l *logger.Logger, snapshot VolumeSnapshot) error
}
```

Since these snapshots necessarily outlive the tests, and we don't want them dangling perpetually, we introduce a prune-dangling roachtest that acts as a poor man's cron job, sifting through expired snapshots (>30 days) and deleting them. For GCE at least it's not obvious to me how to create these snapshots in cloud buckets with a TTL built in, hence this hack. It looks like this (with a change to the TTL):

```
=== RUN   prune-dangling
pruned old snapshot ac-index-backfill-0001-vunknown-1-n2-standard-8
--- PASS: prune-dangling (8.59s)
```

---

We also expose some of these APIs through the roachprod binary directly.

```
$ roachprod snapshot --help
snapshot enables creating/listing/deleting/applying cluster snapshots

Usage:
  roachprod snapshot [command]

Available Commands:
  create      snapshot a named cluster, using the given snapshot name and description
  list        list all snapshots for the given cloud provider, optionally filtering by the given name
  delete      delete all snapshots for the given cloud provider optionally filtering by the given name
  apply       apply the named snapshots from the given cloud provider to the named cluster
```

---

About admission-control/index-backfill: it's a fully featured test that uses the TPC-C 100k dataset and runs a foreground load for 20k customers. It takes >4 hrs to import this dataset; with disk snapshots this step is skipped entirely and takes a few minutes. The actual test is trivial: we run the foreground load for 1 hr and run a large index backfill concurrently. Before cockroachdb#98308 this resulted in wild performance oscillations. It's still a bit wild after flow control, but less so.

We slightly extend the tpc-e harness to make this happen, adding a few smarts: exposing a 'during' helper to run backfills concurrently with foreground load, integration with --skip-init, estimated setup times, prometheus, and disk snapshots of course.

Release note: None
Force-pushed from 5ddb79b to 23110d7
For posterity: OK to merge as is. @irfansharif to revisit once required permissions for TC agents are granted.
Reviewed 28 of 28 files at r4, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @srosenberg, and @sumeerbhola)
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @irfansharif, @srosenberg, and @sumeerbhola)
Thanks! bors r+
Build succeeded:
Support for disk snapshots was recently added in [1]. This change adds minor fixes that were necessary to run in a different environment from the one previously tested. Specifically, several commands were missing `--project`, which resulted in (silent) failures. Error propagation from `c.Parallel` was signaling an internal error instead of passing along the command's error. (Granted, the existing mechanism for error propagation is to be revisited in [2].)

[1] cockroachdb#103757
[2] cockroachdb#104312

Epic: none

Release note: None
104573: roachprod: minor fixes for snapshots r=srosenberg a=srosenberg

Support for disk snapshots was recently added in [1]. This change adds minor fixes that were necessary to run in a different environment from the one previously tested. Specifically, several commands were missing `--project`, which resulted in (silent) failures. Error propagation from `c.Parallel` was signaling an internal error instead of passing along the command's error. (Granted, the existing mechanism for error propagation is to be revisited in [2].)

[1] #103757
[2] #104312

Epic: none

Release note: None

Co-authored-by: Stan Rosenberg <[email protected]>
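To illustrate the nature of the fix (an assumed sketch, not the actual diff): roachprod shells out to gcloud for these snapshot/volume operations, and each such invocation needs the GCE project passed explicitly, roughly along these lines:

```go
// Sketch (assumed, not the actual code): thread the configured GCE project
// into every gcloud invocation; omitting --project made some commands run
// against the wrong (default) project and fail silently.
func listSnapshotsArgs(project string) []string {
	return []string{
		"compute", "snapshots", "list",
		"--project", project, // previously omitted
		"--format", "json",
	}
}
// ... later: exec.Command("gcloud", listSnapshotsArgs(project)...)
```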
Part of #89978. Pre-cursor to #83826. Part of #98703.
Long-lived disk snapshots can drastically reduce testing time for scale tests. Tests, whether run by hand or through CI, need only run the long-running fixture-generating code (importing some dataset, generating it organically through workload, etc.) when snapshot fingerprints change; the fingerprints incorporate the major crdb version that generated them.
Here's an example run that freshly generates disk snapshots:
Here's a subsequent run that makes use of the aforementioned disk snapshots:
We add the following APIs to the roachtest.Cluster interface, for tests to interact with disk snapshots. admission-control/index-backfill is an example test making use of these APIs.
These Cluster APIs are in turn powered by the following additions to the vm.Provider interface, implemented by each cloud provider. GCE is the fully spec-ed out one for now.
Since these snapshots necessarily outlive the tests, and we don't want them dangling perpetually, we introduce a prune-dangling roachtest that acts as a poor man's cron job, sifting through expired snapshots (>30 days) and deleting them. For GCE at least it's not obvious to me how to create these snapshots in cloud buckets with a TTL built in, hence this hack. It looks like this (with a change to the TTL):
We also expose some of these APIs through the roachprod binary directly.
About admission-control/index-backfill: it's a fully featured test that uses the TPC-C 100k dataset and runs a foreground load for 20k customers. It takes >4 hrs to import this dataset; with disk snapshots this step is skipped entirely and takes a few minutes. The actual test is trivial: we run the foreground load for 1 hr and run a large index backfill concurrently. Before #98308 this resulted in wild performance oscillations. It's still a bit wild after flow control, but less so.
We slightly extend the tpc-e harness to make this happen, adding a few smarts: exposing a 'during' helper to run backfills concurrently with foreground load, integration with --skip-init, estimated setup times, prometheus, and disk snapshots of course.
Release note: None