-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
backupccl: introduce new restore roachtest framework #94143
Conversation
7eb4438
to
82eca9e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @herkolategan, @msbutler, and @rhu713)
pkg/cmd/roachtest/tests/restore.go
line 415 at r1 (raw file):
func registerRestore(r registry.Registry) { // TODO(msbutler): delete old tests
Nit: seems a bit vague; e.g., I wouldn't necessarily know what the "old tests" are.
pkg/cmd/roachtest/tests/restore.go
line 716 at r1 (raw file):
m.Go(hc.Runner) // TODO(msbutler): export to prom/grafana
Meanwhile, you could start node_exporter
on each node (see prometheus.Init(...)
); this way we'll get system metrics in [1]; crdb metrics are already scraped. Note, this only works for GCE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, mostly small comments, the only meaningful one is about releases, not sure how this works when we introduce new releases. maybe you're leaving that for later which is fine.
77a4979
to
72101b1
Compare
72101b1
to
8dd0b44
Compare
This patch introduces a new framework for writing restore roachtests that minimizes code reuse and leverages our new backup fixture organization. The framework makes it easy to write a new test using a variety of knobs like: - hardware: cloud provider, disk volume, # of nodes, # of cpus - backup fixture: workload, workload scale The patch is the first in an ongoing effort to redo our roachtests, and introduces 3 new roachtests: - restore/tpce/400GB: the default configuration: 4 nodes, 8vcpus, 1000 GB EBS, restore a tpce backup fixture (25,000 customers, around 400 GB). - restore/tpce/400GB/gce: same config as above, run on gce. - restore/tpce/8TB/nodes=10: the big one! Notice that this patch also introduces a new naming convention for restore tests. The default test is named `restore/tpce/400GB` and only contains the basic workload. Each other test name will contain the workload and any specs which deviate from the default config. For example `restore/tpce/400GB/gce` only switches the cloud provider and holds all other variables constant; thus only the workload and 'gce' are needed in the name. Future patches will add more tests that use this framework. Informs cockroachdb#92699 Release note: None enforce naming convention
8dd0b44
to
4056275
Compare
TFTR!! bors r=lidorcarmel |
Build succeeded: |
In [1], new restore tests have been added to load fixtures from `s3://cockroach-fixtures`. Since the regional bucket is located in us-east-1, and roachprod provisions in us-east-2, egress during nightly and weekly runs amounted to non-trivial cost. This change replaces the bucket with the regional `s3://cockroach-fixtures-us-east-2`, which was replicated from us-east-1. As a precaution, other roachtest which depend on `s3://cockroach-fixtures` but do not currently run in AWS, are now guarded by a conditional skip. A new issue (cockroachdb#105968) has been created to address the problem of localizing fixtures under a given cloud provider. [1] cockroachdb#94143 Epic: none Informs: cockroachdb#105968 Release note: None
105969: roachtest: load AWS fixtures from us-east-2 to avoid egress r=herkolategan a=srosenberg In [1], new restore tests have been added to load fixtures from `s3://cockroach-fixtures`. Since the regional bucket is located in us-east-1, and roachprod provisions in us-east-2, egress during nightly and weekly runs amounted to non-trivial cost. This change replaces the bucket with the regional `s3://cockroach-fixtures-us-east-2`, which was replicated from us-east-1. As a precaution, other roachtest which depend on `s3://cockroach-fixtures` but do not currently run in AWS, are now guarded by a conditional skip. A new issue (#105968) has been created to address the problem of localizing fixtures under a given cloud provider. [1] #94143 Epic: none Informs: #105968 Release note: None Co-authored-by: Stan Rosenberg <[email protected]>
In [1], new restore tests have been added to load fixtures from `s3://cockroach-fixtures`. Since the regional bucket is located in us-east-1, and roachprod provisions in us-east-2, egress during nightly and weekly runs amounted to non-trivial cost. This change replaces the bucket with the regional `s3://cockroach-fixtures-us-east-2`, which was replicated from us-east-1. As a precaution, other roachtest which depend on `s3://cockroach-fixtures` but do not currently run in AWS, are now guarded by a conditional skip. A new issue (#105968) has been created to address the problem of localizing fixtures under a given cloud provider. [1] #94143 Epic: none Informs: #105968 Release note: None
In [1], new restore tests have been added to load fixtures from `s3://cockroach-fixtures`. Since the regional bucket is located in us-east-1, and roachprod provisions in us-east-2, egress during nightly and weekly runs amounted to non-trivial cost. This change replaces the bucket with the regional `s3://cockroach-fixtures-us-east-2`, which was replicated from us-east-1. As a precaution, other roachtest which depend on `s3://cockroach-fixtures` but do not currently run in AWS, are now guarded by a conditional skip. A new issue (cockroachdb#105968) has been created to address the problem of localizing fixtures under a given cloud provider. [1] cockroachdb#94143 Epic: none Informs: cockroachdb#105968 Release note: None
This patch introduces a new framework for writing restore roachtests that
minimizes code reuse and leverages our new backup fixture organization. The
framework makes it easy to write a new test using a variety of knobs like:
The patch is the first in an ongoing effort to redo our roachtests, and
introduces 3 new roachtests:
restore a tpce backup fixture (25,000 customers, around 400 GB).
Notice that this patch also introduces a new naming convention for restore
tests. The default test is named
restore/tpce/400GB
and only contains thebasic workload. Each other test name will contain the workload and any specs
which deviate from the default config. For example
restore/tpce/400GB/gce
only switches the cloud provider and holds all other variables constant; thus
only the workload and 'gce' are needed in the name.
Future patches will add more tests that use this framework.
Informs #92699
Release note: None