roach{prod,test}: add first-class support for disk snapshots #103757

Merged
merged 3 commits into cockroachdb:master from 230507.roachtest-snapshots on May 26, 2023

Conversation

irfansharif
Contributor

@irfansharif irfansharif commented May 23, 2023

Part of #89978. Precursor to #83826. Part of #98703.

Long-lived disk snapshots can drastically reduce testing time for scale tests. Tests, whether run by hand or through CI, need only re-run the long-running fixture-generating code (importing some dataset, generating it organically through workload, etc.) when snapshot fingerprints change; the fingerprints incorporate the major crdb version that generated them.

Here's an example run that freshly generates disk snapshots:

=== RUN   admission-control/index-backfill
no existing snapshots found for admission-control/index-backfill (ac-index-backfill), doing pre-work
created volume snapshot ac-index-backfill-0001-vunknown-1-n2-standard-8 for volume irfansharif-snapshot-0001-1 on irfansharif-snapshot-0001-1/n1
using 1 newly created snapshot(s) with prefix "ac-index-backfill"
detached and deleted volume irfansharif-snapshot-0001-1 from irfansharif-snapshot-0001
created volume irfansharif-snapshot-0001-1
attached volume irfansharif-snapshot-0001-1 to irfansharif-snapshot-0001
mounted irfansharif-snapshot-0001-1 to irfansharif-snapshot-0001
--- PASS: admission-control/index-backfill (79.14s)

Here's a subsequent run that makes use of the aforementioned disk snapshots:

=== RUN   admission-control/index-backfill
using 1 pre-existing snapshot(s) with prefix "ac-index-backfill"
detached and deleted volume irfansharif-snapshot-0001-1 from irfansharif-snapshot-0001
created volume irfansharif-snapshot-0001-1
attached volume irfansharif-snapshot-0001-1 to irfansharif-snapshot-0001
mounted irfansharif-snapshot-0001-1 to irfansharif-snapshot-0001
--- PASS: admission-control/index-backfill (43.47s)
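
The snapshot name in the logs above ("ac-index-backfill-0001-vunknown-1-n2-standard-8") appears to encode the fingerprint pieces directly: the prefix, a zero-padded node ordinal, the crdb version ("unknown" when it can't be determined), a count, and the machine type. A minimal sketch of composing such a name, inferred purely from the example output rather than from the implementation:

  import "fmt"

  // snapshotName is a hypothetical helper; the format is inferred from the
  // example name "ac-index-backfill-0001-vunknown-1-n2-standard-8" above.
  func snapshotName(prefix string, nodeIdx int, crdbVersion string, nodeCount int, machineType string) string {
      if crdbVersion == "" {
          crdbVersion = "unknown" // fallback when the crdb version can't be read off the binary
      }
      return fmt.Sprintf("%s-%04d-v%s-%d-%s", prefix, nodeIdx, crdbVersion, nodeCount, machineType)
  }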

We add the following APIs to the roachtest.Cluster interface, for tests to interact with disk snapshots. admission-control/index-backfill is an example test making use of these APIs.

  type Cluster interface {
      // ...

      // CreateSnapshot creates volume snapshots of the cluster using
      // the given prefix. These snapshots can later be retrieved,
      // deleted or applied to already instantiated clusters.
      CreateSnapshot(ctx context.Context, snapshotPrefix string) error

      // ListSnapshots lists the individual volume snapshots that
      // satisfy the search criteria.
      ListSnapshots(
        ctx context.Context, vslo vm.VolumeSnapshotListOpts,
      ) ([]vm.VolumeSnapshot, error)

      // DeleteSnapshots permanently deletes the given snapshots.
      DeleteSnapshots(
        ctx context.Context, snapshots ...vm.VolumeSnapshot,
      ) error

      // ApplySnapshots applies the given volume snapshots to the
      // underlying cluster. This is a destructive operation as far as
      // existing state is concerned - all already-attached volumes are
      // detached and deleted to make room for new snapshot-derived
      // volumes. The new volumes are created using the same specs
      // (size, disk type, etc.) as the original cluster.
      ApplySnapshots(
        ctx context.Context, snapshots []vm.VolumeSnapshot,
      ) error
  }
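
As a rough sketch of how a test like admission-control/index-backfill can drive these APIs (the option fields, helper names, and error handling below are assumptions for illustration, not the test's actual code):

  snapshots, err := c.ListSnapshots(ctx, vm.VolumeSnapshotListOpts{NamePrefix: snapshotPrefix})
  if err != nil {
      t.Fatal(err)
  }
  if len(snapshots) == 0 {
      // No snapshots yet: do the expensive fixture work (import the dataset,
      // run workload, etc.), stop CRDB so on-disk state is consistent, then
      // snapshot every node's data volume.
      generateFixtures(ctx, t, c) // hypothetical helper
      if err := c.CreateSnapshot(ctx, snapshotPrefix); err != nil {
          t.Fatal(err)
      }
      if snapshots, err = c.ListSnapshots(ctx, vm.VolumeSnapshotListOpts{NamePrefix: snapshotPrefix}); err != nil {
          t.Fatal(err)
      }
  }
  // Swap the cluster's volumes for ones derived from the snapshots.
  if err := c.ApplySnapshots(ctx, snapshots); err != nil {
      t.Fatal(err)
  }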

These Cluster APIs are in turn powered by the following additions to the vm.Provider interface, implemented by each cloud provider. GCE is the only fully spec-ed out implementation for now.

  type Provider interface {
      // ...

      // CreateVolume creates a new volume using the given options.
      CreateVolume(l *logger.Logger, vco VolumeCreateOpts) (Volume, error)

      // ListVolumes lists all volumes already attached to the given VM.
      ListVolumes(l *logger.Logger, vm *VM) ([]Volume, error)

      // DeleteVolume detaches and deletes the given volume from the
      // given VM.
      DeleteVolume(l *logger.Logger, volume Volume, vm *VM) error

      // AttachVolume attaches the given volume to the given VM.
      AttachVolume(l *logger.Logger, volume Volume, vm *VM) (string, error)

      // CreateVolumeSnapshot creates a snapshot of the given volume,
      // using the given options.
      CreateVolumeSnapshot(
        l *logger.Logger, volume Volume, vsco VolumeSnapshotCreateOpts,
      ) (VolumeSnapshot, error)

      // ListVolumeSnapshots lists the individual volume snapshots that
      // satisfy the search criteria.
      ListVolumeSnapshots(
        l *logger.Logger, vslo VolumeSnapshotListOpts,
      ) ([]VolumeSnapshot, error)

      // DeleteVolumeSnapshot permanently deletes the given snapshot.
      DeleteVolumeSnapshot(l *logger.Logger, snapshot VolumeSnapshot) error
  }
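
To give a feel for how the Cluster-level ApplySnapshots can be composed from these primitives, here's a rough per-node sketch. It is not the actual implementation: the SourceSnapshotID field on VolumeCreateOpts and the snapshot's ID field are hypothetical, and it assumes one data volume and one snapshot per node (as the current implementation does).

  for i, node := range clusterVMs { // one snapshot per node, applied in order
      vols, err := provider.ListVolumes(l, &node)
      if err != nil {
          return err
      }
      for _, vol := range vols {
          // Destructive: detach and delete the existing volume.
          if err := provider.DeleteVolume(l, vol, &node); err != nil {
              return err
          }
      }
      // Recreate the volume from the snapshot, using the same size/disk type
      // as the original cluster.
      vol, err := provider.CreateVolume(l, vm.VolumeCreateOpts{
          SourceSnapshotID: snapshots[i].ID, // hypothetical field names
      })
      if err != nil {
          return err
      }
      if _, err := provider.AttachVolume(l, vol, &node); err != nil {
          return err
      }
      // The freshly attached device still has to be mounted on the VM
      // (the "mounted ... to ..." log lines above).
  }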

Since these snapshots necessarily outlive the tests, and we don't want them dangling perpetually, we introduce a prune-dangling roachtest that acts as a poor man's cron job, sifting through expired snapshots (>30 days old) and deleting them. For GCE at least it's not obvious to me how to create these snapshots in cloud buckets with a TTL built in, hence this hack. It looks like this (with the TTL lowered):

=== RUN   prune-dangling
pruned old snapshot ac-index-backfill-0001-vunknown-1-n2-standard-8
--- PASS: prune-dangling (8.59s)
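
The pruning logic itself is little more than a list-filter-delete loop. A minimal sketch under stated assumptions -- the CreatedBefore option and the snapshot's Name field are assumed for illustration, not taken from the actual code:

  const ttl = 30 * 24 * time.Hour // matches the >30 days policy above
  snapshots, err := c.ListSnapshots(ctx, vm.VolumeSnapshotListOpts{
      CreatedBefore: time.Now().Add(-ttl), // hypothetical option
  })
  if err != nil {
      t.Fatal(err)
  }
  for _, snapshot := range snapshots {
      // Deleted one at a time to get per-snapshot logging (see the review
      // discussion further below).
      if err := c.DeleteSnapshots(ctx, snapshot); err != nil {
          t.Fatal(err)
      }
      t.L().Printf("pruned old snapshot %s", snapshot.Name) // Name assumed
  }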

We also expose some of these APIs directly through the roachprod binary.

$ roachprod snapshot --help
  snapshot enables creating/listing/deleting/applying cluster snapshots

  Usage:
    roachprod snapshot [command]

  Available Commands:
    create      snapshot a named cluster, using the given snapshot name and description
    list        list all snapshots for the given cloud provider, optionally filtering by the given name
    delete      delete all snapshots for the given cloud provider optionally filtering by the given name
    apply       apply the named snapshots from the given cloud provider to the named cluster

About admission-control/index-backfill: it's a fully featured test that uses the TPC-E 100k-customer dataset and runs a foreground load for 20k customers. It takes >4hrs to import this dataset; with disk snapshots the import is skipped entirely and setup takes only a few minutes. The actual test is trivial: we run the foreground load for 1hr and run a large index backfill concurrently. Before #98308, this resulted in wild performance oscillations. It's still a bit wild after flow control, but less so.

We slightly extend the tpc-e harness to make this happen, adding a few smarts: a 'during' helper to run backfills concurrently with foreground load, integration with --skip-init, estimated setup times, prometheus, and of course disk snapshots.
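
The 'during' pattern boils down to running the backfill concurrently with the foreground load. A minimal sketch of the shape, assuming hypothetical runTPCE and runIndexBackfill helpers and using golang.org/x/sync/errgroup (the real tpce.go helper may look quite different):

  g, gCtx := errgroup.WithContext(ctx)
  g.Go(func() error {
      return runTPCE(gCtx, 1*time.Hour) // hypothetical: the 1hr foreground load
  })
  g.Go(func() error {
      return runIndexBackfill(gCtx) // hypothetical: kicked off while the load runs
  })
  if err := g.Wait(); err != nil {
      t.Fatal(err)
  }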

Release note: None

@irfansharif irfansharif requested review from sumeerbhola and a team May 23, 2023 04:19
@irfansharif irfansharif requested a review from a team as a code owner May 23, 2023 04:19
@irfansharif irfansharif requested review from srosenberg and smg260 and removed request for a team May 23, 2023 04:19
@cockroach-teamcity
Member

This change is Reviewable

Pure code movement. We'll make use of it outside this file in subsequent
commits.

Release note: None
These tests have been stable for a few months now. Reduce to a weekly
cadence.

Release note: None
@irfansharif irfansharif force-pushed the 230507.roachtest-snapshots branch 4 times, most recently from 0a8ba7a to f13b01d Compare May 23, 2023 13:23
Contributor

@smg260 smg260 left a comment

Nice! Doubtlessly will result in significant time (and cost) savings. Just minor observations/questions from me.

Reviewed 2 of 21 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @srosenberg, and @sumeerbhola)


pkg/cmd/roachtest/cluster/cluster_interface.go line 144 at r1 (raw file):

	// prefix. These snapshots can later be retrieved, deleted or applied to
	// already instantiated clusters.
	CreateSnapshot(ctx context.Context, snapshotPrefix string) error

Seems like also returning a vm.VolumeSnapshot would make sense here.


pkg/cmd/roachtest/tests/admission_control_index_backfill.go line 51 at r1 (raw file):

				// TODO(irfansharif): Search by taking in the other parts of the
				// snapshot fingerprint, i.e. the node count, the version, etc.
				Name: snapshotPrefix,

Nit: Name would be clearer as NamePrefix


pkg/cmd/roachtest/tests/admission_control_index_backfill.go line 59 at r1 (raw file):

				t.L().Printf("no existing snapshots found for %s (%s), doing pre-work", t.Name(), snapshotPrefix)
				// TODO(irfansharif): Add validation that we're some released
				// version, probably the predecessor one. Also ensure that any

This is where we'd do the "fixture" work to get the disk in a state to snapshot, right?

Doesn't it make sense to only call c.ApplySnapshots(..) when not in this branch (as opposed to always, below)?


pkg/cmd/roachtest/tests/prune_dangling_snapshots_and_disks.go line 28 at r1 (raw file):

// (GCE) let you store volume snapshots in buckets with a pre-configured TTL. So
// we use this nightly roachtest as a poor man's cron job.
func registerPruneDanglingSnapshotsAndDisks(r registry.Registry) {

A nice idea that ultimately should belong elsewhere as part of the framework - perhaps in the test_runner, during clean up?


pkg/cmd/roachtest/tests/prune_dangling_snapshots_and_disks.go line 51 at r1 (raw file):

			for _, snapshot := range snapshots {
				if err := c.DeleteSnapshots(ctx, snapshot); err != nil {

Why loop if the 2nd arg is variadic? A slice of err could be returned.

Collaborator

@sumeerbhola sumeerbhola left a comment

drive-by review -- admission_control*.go files look good.

Reviewed 21 of 21 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @srosenberg)


pkg/cmd/roachtest/cluster/cluster_interface.go line 143 at r1 (raw file):

	// CreateSnapshot creates volume snapshots of the cluster using the given
	// prefix. These snapshots can later be retrieved, deleted or applied to
	// already instantiated clusters.

Maybe worth documenting the noop case for the local setup. The assumption this interface makes is that the roachtest that needed to do CreateSnapshot will then proceed to use the already-populated disk and not rely on the rest of these snapshot methods actually doing something.


pkg/cmd/roachtest/cluster/cluster_interface.go line 154 at r1 (raw file):

	// concerned - all already-attached volumes are detached and deleted to make
	// room for new snapshot-derived volumes. The new volumes are created using
	// the same specs (size, disk type, etc.) as the original cluster.

is this assumed to be one volume per node? Some of the nodes may not be running CockroachDB -- does that matter?

I see the implementation assumes equal nodes in the cluster. Worth adding to the code comment.

@irfansharif irfansharif force-pushed the 230507.roachtest-snapshots branch 2 times, most recently from b1c58e7 to 5dbd1bf Compare May 25, 2023 20:35
Contributor Author

@irfansharif irfansharif left a comment

TFTR!

@smg260, PTAL -- there are a few more non-admission_control*.go changes (around the tpce.go harness, and exposing top-level subcommands in roachprod) in addition to addressing your comments below.

@sumeerbhola, @bananabrick I fleshed out admission_control_index_backfill.go to be fully fledged. You could take another look. It's entirely optional for both y'all.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @smg260, @srosenberg, and @sumeerbhola)


pkg/cmd/roachtest/cluster/cluster_interface.go line 143 at r1 (raw file):

Previously, sumeerbhola wrote…

Maybe worth documenting the noop case for the local setup. The assumption this interface makes is that the roachtest that needed to do CreateSnapshot will then proceed to use the already-populated disk and not rely on the rest of these snapshot methods actually doing something.

Done.


pkg/cmd/roachtest/cluster/cluster_interface.go line 144 at r1 (raw file):

Previously, smg260 (Miral Gadani) wrote…

Seems like also returning a vm.VolumeSnapshot would make sense here.

Done, great suggestion!


pkg/cmd/roachtest/cluster/cluster_interface.go line 154 at r1 (raw file):

Previously, sumeerbhola wrote…

is this assumed to be one volume per node? Some of the nodes may not be running CockroachDB -- does that matter?

I see the implementation assumes equal nodes in the cluster. Worth adding to the code comment.

It's assuming one volume per node but can be changed, so I added a TODO. Ditto for different kinds of volumes. It doesn't matter whether CRDB is running or not - it falls back to using "unknown" as the fingerprint version. But that's not desirable -- and I'll leave it to test authors to fiddle with it and improve safeguards. Also, CRDB can't be running while snapshots are taken, else we'd risk taking inconsistent disk snapshots cluster-wide and causing badness. So the CRDB version is retrieved by looking at the binary directly.


pkg/cmd/roachtest/tests/admission_control_index_backfill.go line 51 at r1 (raw file):

Previously, smg260 (Miral Gadani) wrote…

Nit: Name would be clearer as NamePrefix

Done.


pkg/cmd/roachtest/tests/admission_control_index_backfill.go line 59 at r1 (raw file):

Previously, smg260 (Miral Gadani) wrote…

This is where we'd do the "fixture" work to get the disk in a state to snapshot, right?

Doesn't it make sense to only call c.ApplySnapshots(..) when not in this branch (as opposed to always, below)?

Done. I'd written it that way originally to always exercise the apply-snapshot slow path, to increase robustness, but it just looks awkward.


pkg/cmd/roachtest/tests/prune_dangling_snapshots_and_disks.go line 28 at r1 (raw file):

Previously, smg260 (Miral Gadani) wrote…

A nice idea that ultimately should belong elsewhere as part of the framework - perhaps in the test_runner, during clean up?

Unless you insist strongly otherwise, I feel like this is better. These things add up to some non-trivial latency, and the test runner cleanup is invoked too frequently for this kind of check. Also it's a bit nice to be able to run this manually through just:

roachtest run prune-dangling --cockroach ./cockroach --cluster irfansharif-prune

And get specific logging for it.


pkg/cmd/roachtest/tests/prune_dangling_snapshots_and_disks.go line 51 at r1 (raw file):

Previously, smg260 (Miral Gadani) wrote…

Why loop if the 2nd arg is variadic? A slice of err could be returned.

It's just to get per-snapshot logging below, which I've found useful. A slice of errors is unconventional and difficult to work with - you've got to figure out which indexes succeeded and which didn't.

Contributor

@smg260 smg260 left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @srosenberg, and @sumeerbhola)


pkg/cmd/roachtest/tests/prune_dangling_snapshots_and_disks.go line 28 at r1 (raw file):

Previously, irfansharif (irfan sharif) wrote…

Unless you insist strongly otherwise, I feel like this is better. These things add up to some non-trivial latency, and the test runner cleanup is invoked too frequently for this kind of check. Also it's a bit nice to be able to run this manually through just:

roachtest run prune-dangling --cockroach ./cockroach --cluster irfansharif-prune

And get specific logging for it.

Fair.

Not a strong opinion; I was alluding to a cleanup phase at the end of running all tests, and it was more of a general comment about how we could improve roachtest - definitely not for this PR.

@irfansharif irfansharif force-pushed the 230507.roachtest-snapshots branch from d6074df to 5ddb79b Compare May 26, 2023 20:30
@irfansharif
Contributor Author

irfansharif commented May 26, 2023

Spoke to @smg260 offline -- running the nightly roachtest suite manually showed that our TC agents need updated GCE permissions to be able to run these gcloud snapshot/volume-related operations. So checking this in as skipped for now. Internal thread: https://cockroachlabs.atlassian.net/browse/DEVINF-768.

@irfansharif irfansharif force-pushed the 230507.roachtest-snapshots branch from 5ddb79b to 23110d7 Compare May 26, 2023 20:35
Contributor

@smg260 smg260 left a comment

For posterity: OK to merge as is. @irfansharif to revisit once required permissions for TC agents are granted.

Reviewed 28 of 28 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @srosenberg, and @sumeerbhola)

Contributor

@smg260 smg260 left a comment

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @irfansharif, @srosenberg, and @sumeerbhola)

@irfansharif
Contributor Author

Thanks!

bors r+

@craig
Contributor

craig bot commented May 26, 2023

Build succeeded:

@craig craig bot merged commit bd04744 into cockroachdb:master May 26, 2023
@irfansharif irfansharif deleted the 230507.roachtest-snapshots branch May 26, 2023 21:21
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jun 8, 2023
Support for disk snapshots was recently added in [1]. This
change adds minor fixes that were necessary to run in a
different environment than was previously tested.
Specifically, several commands were missing `--project`,
which resulted in (silent) failures. Error propagation
from `c.Parallel` was signaling an internal error instead
of passing on the command's error. (Granted, the existing
mechanism for error propagation is to be revisited in [2].)

[1] cockroachdb#103757
[2] cockroachdb#104312

Epic: none

Release note: None
craig bot pushed a commit that referenced this pull request Jun 9, 2023
104573: roachprod: minor fixes for snapshots r=srosenberg a=srosenberg

Support for disk snapshots was recently added in [1]. This change adds minor fixes that were necessary to run in a different environment than was previously tested. Specifically, several commands were missing `--project`, which resulted in (silent) failures. Error propagation from `c.Parallel` was signaling an internal error instead of passing on the command's error. (Granted, the existing mechanism for error propagation is to be revisited in [2].)

[1] #103757
[2] #104312

Epic: none

Release note: None

Co-authored-by: Stan Rosenberg <[email protected]>