Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest: prevent aws roachtest panic #100286

Merged
merged 1 commit into from
Mar 31, 2023

Conversation

msbutler
Copy link
Collaborator

After #99723 merged as a bandaid for #98783, the aws roachtest nightly began to panic because of a different roachtest papercut #96655. Specifically, because roachtest filters which tests run on which cloud within the evaluation of the test closure, tests meant to run on gce will still get registered in an AWS run. During the registration of the gce test
restore/tpce/400GB/gce/nodes=4/cpus=8/lowmem on aws, the aws test harness panics because the aws roachprod implementation does not have a low memory cpu configuration. This patch prevents this panic and should be reverted once the pr #99402 merges.

Epic: None

Release note: None

After cockroachdb#99723 merged as a bandaid for cockroachdb#98783, the aws roachtest nightly began to
panic because of a different roachtest papercut cockroachdb#96655. Specifically, because
roachtest filters which tests run on which cloud within the evaluation of the
test closure, tests meant to run on gce will still get registered in an AWS
run. During the registration of the gce test
`restore/tpce/400GB/gce/nodes=4/cpus=8/lowmem` _on aws_, the aws test harness
panics because the aws roachprod implementation does not have a low memory cpu
configuration. This patch prevents this panic and should be reverted once
the pr cockroachdb#99402 merges.

Epic: None

Release note: None
@msbutler msbutler added the T-testeng TestEng Team label Mar 31, 2023
@msbutler msbutler requested review from rail and srosenberg March 31, 2023 11:40
@msbutler msbutler self-assigned this Mar 31, 2023
@msbutler msbutler requested a review from a team as a code owner March 31, 2023 11:40
@msbutler msbutler requested review from smg260 and removed request for a team March 31, 2023 11:40
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@msbutler msbutler added the backport-23.1.x Flags PRs that need to be backported to 23.1 label Mar 31, 2023
@rail
Copy link
Member

rail commented Mar 31, 2023

Let’s rerun the nightly after this one is merged.

@msbutler
Copy link
Collaborator Author

TFTR!

bors r=rail

@craig
Copy link
Contributor

craig bot commented Mar 31, 2023

Build succeeded:

@craig craig bot merged commit dcac33e into cockroachdb:master Mar 31, 2023
@msbutler msbutler deleted the butler-aws-panic branch March 31, 2023 16:20
@msbutler
Copy link
Collaborator Author

@@ -430,7 +430,7 @@ func (hw hardwareSpecs) makeClusterSpecs(r registry.Registry) spec.ClusterSpec {
}
s := r.MakeClusterSpec(hw.nodes, clusterOpts...)

if s.Cloud == "aws" && s.VolumeSize != 0 {
if backupCloud == spec.AWS && s.Cloud == spec.AWS && s.VolumeSize != 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second conjunct is a tautology because s.Cloud is empty in this case. Incidentally, none of the existing roachtests set s.Cloud to anything other than gce. 🤷

Copy link
Collaborator Author

@msbutler msbutler Apr 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srosenberg I don't believe that's the case, at least on master and 23.1 anymore. In an ad hoc test, I added a print statement in this branch to confirm we take it. We've also confirmed in later 23.1 restore roachtest failures, that we properly removed the d in the aws machine type specification. (see here).

Also, if s.Cloud were not set to aws, I don't think any aws roachtests would work? We seem to rely on this variable to set the right machine type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nm, you're right. I missed the fact that Cloud is passed (r.Cloud below) from the testRegistryImply, which in turns gets initialized before tests are registered,

spec.MakeClusterSpec(r.cloud, r.instanceType, nodeCount, finalOpts...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.1.x Flags PRs that need to be backported to 23.1 T-testeng TestEng Team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants