[wip] roachprod: don't use RAID0 by default #98782

Closed

Conversation

@tbg (Member) commented Mar 16, 2023

This is a WIP because the behavior when machine types with local SSDs are
used is unclear. For example, on AWS, roachtest prefers the c5d family,
which all come with local SSD storage. But looking into
`awsStartupScriptTemplate`, it seems unclear how to make sure that the
EBS disk(s) get mounted as /mnt/data1 (which is probably what the
default should be).

We could also entertain straight-up preventing combinations that would
lead to an inhomogeneous RAID0. I imagine we'd have to take a round of
failures to find all of the places in which it happens, but perhaps
a "snitch" can be inserted instead so that we can detect all such
callers and fix them up before arming the check.

By the way, EBS disks on AWS come with a default of 125 MB/s, which is
less than what this RAID0 delivers "most of the time" - so we can expect
some tests to behave differently after this change. I still believe this
is worth it - debugging is so much harder when you're on top of
storage that's hard to predict and doesn't resemble any production
deployment.


I wasted weeks of my life on this before, and it almost happened again!
When you run a roachtest that asks for an AWS cXd machine (i.e. compute
optimized with local NVMe disk) and you specify a VolumeSize, you also
get an EBS volume. Prior to this commit, the two would be RAID0'ed
together.

This isn't sane - the resulting gp3 EBS volume is very different from
the local NVMe volume in every way, and it led to hard-to-understand
write throughput behavior.
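As a back-of-envelope illustration of why the striped throughput is hard to reason about (the NVMe figure below is an assumed, illustrative number, not a measurement):

```shell
# Illustrative throughput model (assumed numbers): a RAID0 stripe writes
# to all members in lockstep, so sustained write throughput is roughly
# members * slowest_member, not the sum of the individual disks.
ebs_mbps=125    # gp3 default throughput (MB/s)
nvme_mbps=600   # assumed local NVMe throughput (MB/s)
members=2
slowest=$(( ebs_mbps < nvme_mbps ? ebs_mbps : nvme_mbps ))
raid0_mbps=$(( members * slowest ))
echo "approx RAID0 write: ${raid0_mbps} MB/s (vs ${nvme_mbps} MB/s on NVMe alone)"
```

Under this model the fast NVMe member is throttled to the EBS volume's pace, so the array can end up slower than the local disk alone.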

This commit defaults to *not* using RAID0.

Touches #98767.
Touches #98576.
Touches #97019.

Epic: none
Release note: None


@tbg (Member, Author) commented Mar 16, 2023

Another thing, and maybe easier: we can make sure we don't use the AWS `.*d` series if we're provisioning EBS from roachtest. I still think that would leave some direct users of roachprod with these random combos though, so a full cleanup would be preferable IMO.

@srosenberg (Member) commented Mar 20, 2023

> But looking into `awsStartupScriptTemplate`, it seems unclear how to make sure that the EBS disk(s) get mounted as /mnt/data1 (which is probably what the default should be).

Using nvme-cli [1], we can enumerate all local and remote NVMe drives in AWS. The same works in GCE. E.g., on m5d.4xlarge, we have two local drives and two remote ones. (Of the remote ones, the smaller is the boot disk.)

```
$ nvme list
Node             SN                   Model                                    Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     vol09376b98d3921323e Amazon Elastic Block Store               1          10.74  GB /  10.74  GB    512   B +  0 B   1.0
/dev/nvme1n1     vol0dc293fa2da0b84c1 Amazon Elastic Block Store               1         536.87  GB / 536.87  GB    512   B +  0 B   1.0
/dev/nvme2n1     AWS22C7269994BB102B9 Amazon EC2 NVMe Instance Storage         1         300.00  GB / 300.00  GB    512   B +  0 B   0
/dev/nvme3n1     AWS1313D76F7559718BD Amazon EC2 NVMe Instance Storage         1         300.00  GB / 300.00  GB    512   B +  0 B   0
```
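A minimal sketch of how a startup script could split that listing into local vs. EBS devices by the Model column (the sample output is embedded here so the sketch runs without nvme-cli; on a real VM you would pipe `nvme list` directly):

```shell
# Classify NVMe devices by Model: "Instance Storage" => local SSD,
# "Elastic Block Store" => EBS. Sample `nvme list` output embedded
# (columns abbreviated) purely for illustration.
sample='/dev/nvme0n1     vol09376b98d3921323e Amazon Elastic Block Store
/dev/nvme1n1     vol0dc293fa2da0b84c1 Amazon Elastic Block Store
/dev/nvme2n1     AWS22C7269994BB102B9 Amazon EC2 NVMe Instance Storage
/dev/nvme3n1     AWS1313D76F7559718BD Amazon EC2 NVMe Instance Storage'

# On a real VM: nvme list | awk '...'
local_disks=$(printf '%s\n' "$sample" | awk '/^\/dev\// && /Instance Storage/ {print $1}')
ebs_disks=$(printf '%s\n' "$sample" | awk '/^\/dev\// && /Elastic Block Store/ {print $1}')

echo local: $local_disks
echo ebs: $ebs_disks
```

This is the same classification nvme-cli's Model column gives you for free, without guessing from device names or sizes.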

> We could also entertain straight-up preventing combinations that would lead to an inhomogeneous RAID0. I imagine we'd have to take a round of failures to find all of the places in which it happens, but perhaps a "snitch" can be inserted instead so that we can detect all such callers and fix them up before arming the check.

I think it's reasonable to RAID0 only the local NVMes by default. In the above example, we would auto-RAID0 the two local NVMes and warn the user that the remaining remote disk is left unused [2], [3]. E.g.,

```
roachprod create -c aws -n 1 --aws-machine-type m5d.4xlarge --local-ssd=false stan-test
```

prints two warning messages,

```
using local NVMes with multiple EBS volumes without --aws-use-multiple-disks
using local NVMes without --aws-use-local-ssd
```

before the VM is created. (See validateMachineType [4].) The startup script also appends the warning messages to /etc/motd. Of course, we could make roachprod stricter by disallowing "dangling" (i.e., unmounted) storage volumes.

[1] https://github.com/linux-nvme/nvme-cli
[2] srosenberg@a7d3e0c#diff-a4da7651b7fb1760ae7195d196f1f4931896fd26590d0a617d9dfce2417ca35cR973
[3] srosenberg@a7d3e0c#diff-089575e95dbc289d33878eabfd55cb4bad8c49e2e44ac931fcedeecbb0264c12R120-R131
[4] srosenberg@a7d3e0c#diff-a4da7651b7fb1760ae7195d196f1f4931896fd26590d0a617d9dfce2417ca35cR920

@tbg (Member, Author) commented Mar 20, 2023

> I think it's reasonable to default to RAID0 only local NVMes. In the above example, we would auto-RAID0 the two local NVMes and warn the user that the remaining remote disk remains unused [2], [3].

roachtest for AWS currently always uses XXXd machines, so there's always a local NVMe drive, and we use it by default. However, some tests expect to use the networked disks, because those are the ones that actually have the size requested by the test. From that point of view, it would be better to default to the networked disk if one is available and ignore the NVMe disk, or to make sure roachprod doesn't let us create both (though that could paint us into a corner if some VMs only come with local disk).
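That preference order can be sketched as follows (assumed device names; not roachprod's actual mount logic):

```shell
# Prefer the networked (EBS) data volume when one was provisioned,
# otherwise fall back to the local NVMe; never RAID0 the two together.
ebs_data="/dev/nvme1n1"     # assumed; empty string if no EBS data volume exists
local_nvme="/dev/nvme2n1"   # assumed local NVMe device
if [ -n "$ebs_data" ]; then
  data_disk="$ebs_data"
else
  data_disk="$local_nvme"
fi
echo "mounting $data_disk at /mnt/data1"
```

With an explicit either/or choice like this, the test always gets the disk whose size it asked for, and the local disk is simply ignored rather than striped in.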

@tbg (Member, Author) commented Mar 21, 2023

Closing this PR since I'm not working on it; we're tracking this issue in #98783.

For the test I've been looking at, I'll go with a hack (#98767) until we have a real fix.

@tbg tbg closed this Mar 21, 2023
@tbg tbg deleted the roachprod-avoid-raid0 branch March 21, 2023 07:52
3 participants