roachprod: stage arm64 binary #103243

Merged
merged 1 commit into from
May 20, 2023

Conversation

srosenberg
Member

Add --arch to override binary's architecture and refactor.
As of this change, roachprod stage is able to stage both
amd64 and arm64 on linux and darwin.
In conjunction with the previous change [1], roachprod
now uses arm64-based AMI for graviton2/graviton3 machines.

Below is an example of how to create a VM with graviton3,

roachprod create -n1 --clouds aws --aws-machine-type m7g.2xlarge --local-ssd=false $CRL_USERNAME-test
roachprod stage --arch arm64 $CRL_USERNAME-test release v23.1.0-rc.2
roachprod start $CRL_USERNAME-test

[1] #103236

Epic: none
Release note: None

@srosenberg srosenberg requested a review from a team as a code owner May 13, 2023 00:20
@srosenberg srosenberg requested review from herkolategan and smg260 and removed request for a team May 13, 2023 00:20
@cockroach-teamcity
Member

This change is Reviewable

@srosenberg srosenberg force-pushed the sr/roachprod_arm64_staging branch from 91babf6 to 740cb04 Compare May 13, 2023 00:29
@srosenberg srosenberg requested a review from rail May 13, 2023 00:47
@srosenberg srosenberg force-pushed the sr/roachprod_arm64_staging branch from 740cb04 to a7a4e6f Compare May 13, 2023 02:21
Member

@rail rail left a comment

Just one question inline.

pkg/cmd/roachprod/flags.go
Collaborator

@herkolategan herkolategan left a comment

Reviewed 9 of 10 files at r1, 7 of 7 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rail and @smg260)

@srosenberg srosenberg force-pushed the sr/roachprod_arm64_staging branch from a7a4e6f to 0f6ab46 Compare May 17, 2023 15:22
Contributor

@smg260 smg260 left a comment

Reviewed 1 of 7 files at r2, 11 of 11 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rail and @srosenberg)


pkg/roachprod/roachprod.go line 519 at r3 (raw file):

	os := "linux"
	arch := "amd64"

this logic looks a little clunky - is it deliberate?
isLocal() as the first condition seems cleaner, with parameters always taking precedence

os := ...
arch := ...

if c.IsLocal() {

}

if stageOs != "" { .. override }
if stageArch != "" {.. override }

Member Author

@srosenberg srosenberg left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rail and @smg260)


pkg/roachprod/roachprod.go line 519 at r3 (raw file):

Previously, smg260 (Miral Gadani) wrote…

this logic looks a little clunky - is it deliberate?
isLocal() as the first condition seems cleaner, with parameters always taking precedence

os := ...
arch := ...

if c.IsLocal() {

}

if stageOs != "" { .. override }
if stageArch != "" {.. override }

Good catch! Technically, you should be allowed to stage an arch that's different from your local; e.g., running (emulated) amd64 on apple silicon. Either way, IsLocal should logically come first, and I'll add a warning if someone is attempting to stage an os/arch which differs from local.
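
The ordering discussed above can be sketched roughly as follows. This is only an illustration, not the code in roachprod.go: `resolveStageTarget` and its signature are hypothetical, and the `linux`/`amd64` defaults are taken from the snippet quoted in the review. Local clusters start from the host's OS/arch, explicit overrides always win, and a mismatch on a local cluster yields a warning rather than an error.

```go
package main

import (
	"fmt"
	"runtime"
)

// resolveStageTarget is a hypothetical helper (not roachprod's actual code).
// Precedence: built-in defaults < local host values < explicit overrides.
// The returned warning is non-empty when a local cluster would receive a
// binary that does not match the host's OS/arch.
func resolveStageTarget(isLocal bool, stageOS, stageArch string) (os, arch, warning string) {
	// Defaults for remote (cloud) clusters.
	os, arch = "linux", "amd64"
	// Local clusters default to whatever the host is running.
	if isLocal {
		os, arch = runtime.GOOS, runtime.GOARCH
	}
	// Explicit parameters always take precedence.
	if stageOS != "" {
		os = stageOS
	}
	if stageArch != "" {
		arch = stageArch
	}
	// Staging a different target locally is allowed (e.g. emulation), but flagged.
	if isLocal && (os != runtime.GOOS || arch != runtime.GOARCH) {
		warning = fmt.Sprintf("staging %s/%s on a local %s/%s host",
			os, arch, runtime.GOOS, runtime.GOARCH)
	}
	return os, arch, warning
}

func main() {
	os, arch, warning := resolveStageTarget(false, "", "arm64")
	fmt.Println(os, arch, warning)
}
```

Returning the warning as a string keeps the helper free of logging dependencies; the caller decides how to surface it.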

Contributor

@renatolabs renatolabs left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rail, @smg260, and @srosenberg)


pkg/roachprod/roachprod.go line 519 at r3 (raw file):

Previously, srosenberg (Stan Rosenberg) wrote…

Good catch! Technically, you should be allowed to stage an arch that's different from your local; e.g., running (emulated) amd64 on apple silicon. Either way, IsLocal should logically come first, and I'll add a warning if someone is attempting to stage an os/arch which differs from local.

Could these defaults also be coming from the command line (i.e., StringVar(&stageArch, "arch", defaultArch, ...), where defaultArch = "amd64")? Then: 1) we don't need to check for command line overrides; 2) --help is more explicit about the default values used; and 3) we don't need to support empty stageArch values (see comment on archInfoForOS).
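
A minimal sketch of that suggestion, assuming the standard library `flag` package as a stand-in for cobra/pflag (which roachprod actually uses). `parseStageFlags` and the help strings are hypothetical; `defaultArch = "amd64"` comes from the comment above and the `linux` default from the snippet under review.

```go
package main

import (
	"flag"
	"fmt"
)

// defaultArch mirrors the default proposed in the review comment.
const defaultArch = "amd64"

// parseStageFlags is a hypothetical stand-in built on the stdlib flag package.
// Because the defaults live in the flag definitions, the caller never sees an
// empty value and the generated help text shows the defaults.
func parseStageFlags(args []string) (os, arch string, err error) {
	fs := flag.NewFlagSet("stage", flag.ContinueOnError)
	fs.StringVar(&os, "os", "linux", "operating system of the staged binary")
	fs.StringVar(&arch, "arch", defaultArch, "architecture of the staged binary")
	err = fs.Parse(args)
	return os, arch, err
}

func main() {
	os, arch, err := parseStageFlags([]string{"--arch", "arm64"})
	fmt.Println(os, arch, err)
}
```

One trade-off of this approach: with non-empty defaults, the code can no longer tell "flag omitted" apart from "flag set to the default", which matters if local clusters should fall back to the host's architecture.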


pkg/roachprod/install/staging.go line 98 at r3 (raw file):

			return darwin_arm64_ArchInfo, nil
		}
		return darwin_x86_64_ArchInfo, nil

One less desirable property of this approach is that a typo (or even a genuinely unsupported arch) will silently use x86_64 (e.g., --arch amd63)

@srosenberg srosenberg force-pushed the sr/roachprod_arm64_staging branch from 0f6ab46 to f6aac57 Compare May 18, 2023 03:27
Member Author

@srosenberg srosenberg left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @herkolategan, @rail, @renatolabs, and @smg260)


pkg/roachprod/roachprod.go line 519 at r3 (raw file):

Could these defaults also be coming from the command line?

Sorta. The multi-value validation story for cobra is not great. Technically, you can define a custom pflag.Value interface, but that seems like overkill. Instead, I borrowed the approach from the roachtest CLI, which installs a custom validation function by using the PersistentPreRun hook.

That takes care of CLI validation, but recall that roachprod is also used as an API. archInfoForOS is used internally for staging, so I added a validation bit there too. It's still not bullet-proof but maybe good enough, considering we currently don't validate a bunch of other args; i.e., to be revisited.


pkg/roachprod/install/staging.go line 98 at r3 (raw file):

Previously, renatolabs (Renato Costa) wrote…

One less desirable property of this approach is that a typo (or even a genuinely unsupported arch) will silently use x86_64 (e.g., --arch amd63)

Yep, added an explicit precondition to validate supported values. I don't foresee other architectures we'd support in the near future, but you never know. If a new one does pop up, then we can refactor it to an enum and use a custom pflag.Value (with shell autocompletion). For now, that seems like overkill, would you agree?
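
For illustration only, a precondition of this kind might look like the sketch below. `supportedTargets` and `lookupTarget` are hypothetical names with placeholder values; they are not the actual archInfoForOS implementation or its ArchInfo tables. The point is that an unknown value such as `amd63` produces an error instead of silently falling back to x86_64.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// supportedTargets is a hypothetical table keyed by "os/arch". The values are
// illustrative labels, not roachprod's real ArchInfo structs.
var supportedTargets = map[string]string{
	"linux/amd64":  "linux-amd64",
	"linux/arm64":  "linux-arm64",
	"darwin/amd64": "darwin-amd64",
	"darwin/arm64": "darwin-arm64",
}

// lookupTarget returns an error for any unknown combination instead of
// silently falling back to a default.
func lookupTarget(os, arch string) (string, error) {
	key := os + "/" + arch
	if info, ok := supportedTargets[key]; ok {
		return info, nil
	}
	// Sort the keys so the error message is stable across runs.
	known := make([]string, 0, len(supportedTargets))
	for k := range supportedTargets {
		known = append(known, k)
	}
	sort.Strings(known)
	return "", fmt.Errorf("unsupported os/arch %q; supported: %s",
		key, strings.Join(known, ", "))
}

func main() {
	for _, arch := range []string{"arm64", "amd63"} {
		info, err := lookupTarget("darwin", arch)
		fmt.Println(info, err)
	}
}
```

Because the check lives in the lookup itself, it covers both CLI callers and code that uses roachprod as a library.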

@srosenberg srosenberg force-pushed the sr/roachprod_arm64_staging branch from f6aac57 to 1e98906 Compare May 18, 2023 04:15
Contributor

@smg260 smg260 left a comment

:lgtm:

Reviewed 1 of 7 files at r2, 5 of 5 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @rail and @renatolabs)

@srosenberg srosenberg force-pushed the sr/roachprod_arm64_staging branch from 1e98906 to 1afbc37 Compare May 19, 2023 03:42
@srosenberg
Member Author

After a final check, I discovered that the binary staging logic drifted from its source of truth (pkg/release). I've updated and tested all --os and --arch combinations. Additionally, I've added fips under --arch; it's a leaky abstraction, but ok for now. We'll clean it up in subsequent iterations.

PTAL!

Add `--arch` to override binary's architecture and refactor.
As of this change, `roachprod stage` is able to stage both
amd64 and arm64 on linux and darwin, as well as FIPS-enabled
binary built for amd64.
In conjunction with the previous change [1], roachprod
now uses arm64-based AMI for graviton2/graviton3 machines.

Below is an example of how to create a VM with graviton3,
```
roachprod create -n1 --clouds aws --aws-machine-type m7g.2xlarge --local-ssd=false $CRL_USERNAME-test
roachprod stage --arch arm64 $CRL_USERNAME-test release v23.1.0-rc.2
roachprod start $CRL_USERNAME-test
```

[1] cockroachdb#103236

Epic: none
Release note: None
@srosenberg srosenberg force-pushed the sr/roachprod_arm64_staging branch from 1afbc37 to 153ac47 Compare May 19, 2023 14:43
@srosenberg
Member Author

TFTR! Next PR adds a few small improvements, so merging this one since it seems functionally correct.

bors r=rail,herkolategan,smg260

@craig
Contributor

craig bot commented May 19, 2023

Build failed:

@srosenberg
Member Author

bors retry

@craig
Contributor

craig bot commented May 20, 2023

Build succeeded:

@craig craig bot merged commit a226099 into cockroachdb:master May 20, 2023
srosenberg added a commit to srosenberg/cockroach that referenced this pull request May 24, 2023
Previously, all roachtests used (cloud) machine types
with the AMD64 (cpu) architecture. Recently [1], new
CI infrastructure was added to run a clone of all the
nightly roachtests, configured with FIPS; i.e., same
AMD64 machine types, different AMI and crdb binary,
patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any
roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled
via the two CLI args: `metamorphic-arm64-probability`
and `metamorphic-fips-probability`. The former denotes
the probability (over the uniform distribution) of a new
cluster provisioned using ARM64 VMs. The latter denotes
the probability of a new AMD64 cluster provisioned
with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's
effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilities don't have to add up to 1.
E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64
clusters are chosen ~40% of the time, whereas of the
remaining ~60% AMD64 clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.
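
The arithmetic above can be sketched as follows. This is a hypothetical illustration, not the roachtest implementation: `chooseArch` and `effectiveFIPSShare` are made-up names, and the use of two independent uniform draws is an assumption.

```go
package main

import (
	"fmt"
	"math/rand"
)

// chooseArch is a hypothetical sketch of the selection: first decide ARM64
// with probability arm64Prob; otherwise (an AMD64 cluster) decide FIPS with
// probability fipsProb. draw must return uniform values in [0, 1).
func chooseArch(draw func() float64, arm64Prob, fipsProb float64) string {
	if draw() < arm64Prob {
		return "arm64"
	}
	if draw() < fipsProb {
		return "fips"
	}
	return "amd64"
}

// effectiveFIPSShare is the overall fraction of clusters expected to use
// FIPS: only the non-ARM64 remainder is eligible.
func effectiveFIPSShare(arm64Prob, fipsProb float64) float64 {
	return (1 - arm64Prob) * fipsProb
}

func main() {
	const n = 100000
	counts := map[string]int{}
	for i := 0; i < n; i++ {
		counts[chooseArch(rand.Float64, 0.4, 0.2)]++
	}
	// (1 - 0.4) * 0.2 = 0.12
	fmt.Printf("expected FIPS share: %.2f\n", effectiveFIPSShare(0.4, 0.2))
	fmt.Printf("observed counts out of %d: %v\n", n, counts)
}
```

Since the draws lie in [0, 1), a probability of 0 never selects its configuration and a probability of 1 always does.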

Note, the values '0' and '1' are absolute. Setting both
to '0' is equivalent to the behavior before this PR.
Setting either to '1' ensures that _all_ clusters
are provisioned with either ARM64 or FIPS.
If a test is _not_ compatible with the chosen
configuration, its provisioning will fail. Thus, '1'
is typically used for manual (debug) runs.

This PR builds on [1], which enabled ARM64 provisioning
for AWS in roachprod. We add ARM64 provisioning for GCE,
i.e., T2A, as well as refactor 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the latter
isn't formally a CPU architecture; however, it simplifies
provisioning and binary staging.
We also modify roachprod.List to display CPU architecture,
other than AMD64, with the machine type; this should make it
easier to see which clusters are running ARM64 and FIPS
configurations, as we ramp up their testing.

The PR also adds validation to cockroach binaries and libs
to ensure we can execute tests under ARM64 and FIPS.
Furthermore, we add 'Enabled Assertions' header, generated
at build time, to the cockroach binary; the header is used
to validate whether or not the binary has runtime assertions
enabled.

Epic: none
Release note: None

Resolves: cockroachdb#94957
Informs: cockroachdb#94986

[1] cockroachdb#99224
[2] cockroachdb#103243
srosenberg added a commit to srosenberg/cockroach that referenced this pull request May 24, 2023
srosenberg added a commit to srosenberg/cockroach that referenced this pull request May 25, 2023
Previously, all roachtests used (cloud) machine types
with the AMD64 (cpu) architecture. Recently [1], new
CI infrastructure was added to run a clone of all the
nightly roachtests, configured with FIPS; i.e., same
AMD64 machine types, different AMI and crdb binary,
patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any
roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled
via the two CLI args: `metamorphic-arm64-probability`
and `metamorphic-fips-probability`. The former denotes
the probability (over the uniform distribution) of a new
cluster provisioned using ARM64 VMs. The latter denotes
the probability of a new AMD64 cluster provisioned
with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's
effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilities don't have to add up to 1.
E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64
clusters are chosen ~40% of the time, whereas of the
remaining ~60% AMD64 clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.

Note, the values '0' and '1' are absolute. Setting both
to '0' is equivalent to the behavior before this PR.
Setting either to '1' ensures that _all_ clusters
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which
case it takes precedence over metamorphic settings.

This PR builds on [1], which enabled ARM64 provisioning
for AWS in roachprod. We add ARM64 provisioning for GCE,
i.e., T2A, as well as refactor 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the latter
isn't formally a CPU architecture; however, it simplifies
provisioning and binary staging.
We also modify roachprod.List to display CPU architecture,
other than AMD64, with the machine type; this should make it
easier to see which clusters are running ARM64 and FIPS
configurations, as we ramp up their testing.

The PR also adds validation to cockroach binaries and libs
to ensure we can execute tests under ARM64 and FIPS.
Furthermore, we add 'Enabled Assertions' header, generated
at build time, to the cockroach binary; the header is used
to validate whether or not the binary has runtime assertions
enabled.

Epic: none
Release note: None

Resolves: cockroachdb#94957
Informs: cockroachdb#94986

[1] cockroachdb#99224
[2] cockroachdb#103243
srosenberg added a commit to srosenberg/cockroach that referenced this pull request May 30, 2023
Previously, all roachtests used (cloud) machine types
with the AMD64 (cpu) architecture. Recently [1], new
CI infrastructure was added to run a clone of all the
nightly roachtests, configured with FIPS; i.e., same
AMD64 machine types, different AMI and crdb binary,
patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any
roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled
via the two CLI args: `metamorphic-arm64-probability`
and `metamorphic-fips-probability`. The former denotes
the probability (over the uniform distribution) of a new
cluster provisioned using ARM64 VMs. The latter denotes
the probability of a new AMD64 cluster provisioned
with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's
effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilities don't have to add up to 1.
E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64
clusters are chosen ~40% of the time, whereas of the
remaining ~60% AMD64 clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.

Note, the values '0' and '1' are absolute. Setting both
to '0' is equivalent to the behavior before this PR.
Setting either to '1' ensures that _all_ clusters
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which
case it takes precedence over metamorphic settings.

This PR builds on [1], which enabled ARM64 provisioning
for AWS in roachprod. We add ARM64 provisioning for GCE,
i.e., T2A, as well as refactor 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the latter
isn't formally a CPU architecture; however, it simplifies
provisioning and binary staging.
We also modify roachprod.List to display CPU architecture,
other than AMD64, with the machine type; this should make it
easier to see which clusters are running ARM64 and FIPS
configurations, as we ramp up their testing.

The PR also adds validation to cockroach binaries and libs
to ensure we can execute tests under ARM64 and FIPS.
Furthermore, we add 'Enabled Assertions' header, generated
at build time, to the cockroach binary; the header is used
to validate whether or not the binary has runtime assertions
enabled.

Epic: none
Release note: None

Resolves: cockroachdb#94957
Resolves: cockroachdb#89268
Informs: cockroachdb#94986

[1] cockroachdb#99224
[2] cockroachdb#103243
srosenberg added a commit to srosenberg/cockroach that referenced this pull request May 31, 2023
srosenberg added a commit to srosenberg/cockroach that referenced this pull request May 31, 2023
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jun 1, 2023
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jun 1, 2023
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jun 1, 2023
Previously, all roachtests used (cloud) machine types
with the AMD64 (cpu) architecture. Recently [1], new
CI infrastructure was added to run a clone of all the
nightly roachtests, configured with FIPS; i.e., same
AMD64 machine types, different AMI and crdb binary,
patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any
roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled
via the two CLI args: `metamorphic-arm64-probability`
and `metamorphic-fips-probability`. The former denotes
the probability (over the uniform distribution) of a new
cluster provisioned using ARM64 VMs. The latter denotes
the probability of a new AMD64 cluster provisioned
with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's
effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilties don't have to add up to 1.
E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64
clusters are chosen ~40% of the time, whereas of the
remaining ~60% AMD clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.

Note, the values '0' and '1' are absolute. Setting both
to '0' is tantamount to the behavior before this PR.
Setting either to '1' enforces _all_ clusters
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which
case, it takes precedence over metamorphic settings.

This PR builds on [1], which enabled ARM64 provisioning
for AWS in roachprod. We add ARM64 provisioning for GCE,
i.e., T2A, as well as refactor the 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the last
isn't formally a CPU architecture; however, it simplifies
provisioning and binary staging.
We also modify roachprod.List to display CPU architecture,
other than AMD64, with the machine type; this should make it
easier to see which clusters are running ARM64 and FIPS
configurations, as we ramp up their testing.

The PR also adds validation to cockroach binaries and libs
to ensure we can execute tests under ARM64 and FIPS.
Furthermore, we add an 'Enabled Assertions' header, generated
at build time, to the cockroach binary; the header is used
to validate whether or not the binary has runtime assertions
enabled.

Epic: none
Release note: None

Resolves: cockroachdb#94957
Resolves: cockroachdb#89268
Informs: cockroachdb#94986

[1] cockroachdb#99224
[2] cockroachdb#103243
craig bot pushed a commit that referenced this pull request Jun 2, 2023
103710: roachtest: metamorphic ARM64 and FIPS clusters r=smg260,herkolategan a=srosenberg

Previously, all roachtests used (cloud) machine types with the AMD64 (cpu) architecture. Recently [1], new CI infrastructure was added to run a clone of all the nightly roachtests, configured with FIPS; i.e., same AMD64 machine types, different AMI and crdb binary, patched with FIPS-certified openssl native code.

As of this PR, we add the capability to execute any roachtest in a cluster, configured with either
ARM64, FIPS, or AMD64 (default). This is controlled via the two CLI args: `metamorphic-arm64-probability` and `metamorphic-fips-probability`. The former denotes the probability (over the uniform distribution) of a new cluster provisioned using ARM64 VMs. The latter denotes the probability of a new AMD64 cluster provisioned with the FIPS-compliant (kernel) configuration.
In case a test is compatible only with AMD64, it's effectively excluded from the set; i.e., both
probabilities apply to compatible tests only.

Note, the two probabilities don't have to add up to 1. E.g., `metamorphic-arm64-probability==0.4`,
`metamorphic-fips-probability==0.2` denotes that ARM64 clusters are chosen ~40% of the time, whereas of the remaining ~60% AMD64 clusters, FIPS is chosen ~20%
of the time; i.e., ~12% of all clusters will use FIPS.
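
As a worked check of the example above, write $p_{\text{arm64}}$ and $p_{\text{fips}}$ for the two flag values (symbols introduced here for illustration only) and assume the FIPS draw is made only when ARM64 was not chosen:

```latex
\begin{aligned}
P(\text{ARM64}) &= p_{\text{arm64}} = 0.4 \\
P(\text{FIPS})  &= (1 - p_{\text{arm64}})\, p_{\text{fips}} = 0.6 \times 0.2 = 0.12 \\
P(\text{AMD64}) &= (1 - p_{\text{arm64}})(1 - p_{\text{fips}}) = 0.6 \times 0.8 = 0.48
\end{aligned}
```

The three shares sum to 1, which is why the two flag values themselves need not.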

Note, the values '0' and '1' are absolute. Setting both
to '0' is tantamount to the behavior before this PR.
Setting either to '1' enforces that _all_ clusters
are provisioned with either ARM64 or FIPS.
A test can specify its required architecture, in which
case, it takes precedence over metamorphic settings.

This PR builds on [2], which enabled ARM64 provisioning for AWS in roachprod. We add ARM64 provisioning for GCE, i.e., T2A, as well as refactor the 'arch' argument to
denote one of: AMD64, ARM64, FIPS, where the last isn't formally a CPU architecture; however, it simplifies provisioning and binary staging.
We also modify roachprod.List to display CPU architecture, other than AMD64, with the machine type; this should make it easier to see which clusters are running ARM64 and FIPS configurations, as we ramp up their testing.

Epic: none
Release note: None

Resolves: #94957
Informs: #94986

[1] #99224
[2] #103243

Co-authored-by: Stan Rosenberg <[email protected]>
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jun 10, 2023
srosenberg added a commit to srosenberg/cockroach that referenced this pull request Jun 13, 2023