logictest,backupccl: enable 3node-backup config in a nightly job #77130

Closed
adityamaru opened this issue Feb 28, 2022 · 2 comments
Labels: A-disaster-recovery, C-enhancement, stability-period-v22.2, T-disaster-recovery

Comments

Contributor

adityamaru commented Feb 28, 2022

In #74174 we added a `3node-backup` configuration to logictests. It can randomly run a cluster backup followed by a cluster restore between lines of an existing logictest, after which the test proceeds as before. This infrastructure is currently disabled because `COCKROACH_LOGIC_TEST_BACKUP_RESTORE_PROBABILITY` is set to 0.

The motivation for this issue is to actually enable this configuration. Since this is presumably going to be slow to run on regular CI, we can instead write a special nightly job that invokes this configuration on existing logictests with varying probability. We are also unlikely to be able to run all logic tests every night, so it would be nice to have some memoization that runs a different subset of the tests each night and eventually reaches complete coverage.
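
One possible way to get rotating coverage without storing any state between runs is to hash each test name into a fixed number of shards and run one shard per night. This is a sketch of that alternative, not the memoization scheme mentioned above and not what any nightly script actually does; `shardOf`, `testsForNight`, the shard count, and the file names are all hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"time"
)

// shardOf deterministically maps a test name to a shard in [0, numShards).
func shardOf(name string, numShards int) int {
	h := fnv.New32a()
	_, _ = h.Write([]byte(name)) // hash.Hash writes never return an error
	return int(h.Sum32() % uint32(numShards))
}

// testsForNight returns the tests whose shard matches night % numShards.
// Over any numShards consecutive nights, every test is selected exactly once.
func testsForNight(all []string, numShards, night int) []string {
	want := night % numShards
	var out []string
	for _, name := range all {
		if shardOf(name, numShards) == want {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	// Hypothetical logic test file names, for illustration only.
	all := []string{"alter_table", "backup_restore", "partitioning", "zone_config"}
	const numShards = 3
	night := int(time.Now().Unix() / 86400) // days since the Unix epoch
	fmt.Println(testsForNight(all, numShards, night))
}
```

The trade-off is that shards can be uneven in size and runtime, whereas a memoized scheme could balance nights using recorded durations.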

Epic CRDB-31689
Jira issue: CRDB-13426

@adityamaru adityamaru added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-disaster-recovery labels Feb 28, 2022
@adityamaru adityamaru self-assigned this Feb 28, 2022

blathers-crl bot commented Feb 28, 2022

cc @cockroachdb/bulk-io

@adityamaru adityamaru added the stability-period Intended to be worked on during a stability period. Use with the Milestone field to specify version. label Aug 3, 2022
adityamaru added a commit to adityamaru/cockroach that referenced this issue Sep 13, 2022
…ogic tests

This change introduces a nightly script that runs the logictests
in logictestccl with `COCKROACH_LOGIC_TEST_BACKUP_RESTORE_PROBABILITY`
set to a non-zero value. This environment variable controls the probability of
running a backup+restore between lines of a logic test. For example, 1.0 means that
every line of the logic test will be run after a cluster backup and restore.

The motivation for this change is to increase the coverage of backup+restore
testing by piggybacking on our large logictest corpus. To begin with we only
run tests in logictestccl with config local, but this will be gradually expanded
as we work through the failure modes.

Informs: cockroachdb#77130

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Sep 19, 2022
adityamaru added a commit to adityamaru/cockroach that referenced this issue Sep 20, 2022
adityamaru added a commit to adityamaru/cockroach that referenced this issue Sep 20, 2022
adityamaru added a commit to adityamaru/cockroach that referenced this issue Sep 21, 2022
craig bot pushed a commit that referenced this issue Sep 22, 2022
87918: nightlies,logictestccl: probabilistically run backup+restore during logic tests r=stevendanna a=adityamaru

This change introduces a nightly script that runs the logictests
in logictestccl with `COCKROACH_LOGIC_TEST_BACKUP_RESTORE_PROBABILITY`
set to a non-zero value. This environment variable controls the probability of
running a backup+restore between lines of a logic test. For example, 1.0 means that
every line of the logic test will be run after a cluster backup and restore.

The motivation for this change is to increase the coverage of backup+restore
testing by piggybacking on our large logictest corpus. To begin with we only
run tests in logictestccl with config local, but this will be gradually expanded
as we work through the failure modes.

Informs: #77130

Release note: None

Co-authored-by: adityamaru <[email protected]>
@exalate-issue-sync exalate-issue-sync bot added stability-period-v22.2 and removed stability-period Intended to be worked on during a stability period. Use with the Milestone field to specify version. labels Mar 30, 2023
@adityamaru adityamaru self-assigned this Sep 28, 2023
@adityamaru
Contributor Author

I'm going to take another stab at this during the stability period and see how we can maximize stability.
