kv: build a reproduction for a column backfill impacting foreground traffic #82700

Closed · lunevalex opened this issue Jun 10, 2022 · 2 comments

Labels: A-kv (Anything in KV that doesn't belong in a more specific category), C-investigation (Further steps needed to qualify. C-label will change.), sync-me-8, T-kv (KV Team)

lunevalex commented Jun 10, 2022

In a recent customer escalation we observed that a column backfill can have a significant latency impact on a cluster running close to its theoretical disk bandwidth limit. The backfill saturates the disk, latencies climb, admission control kicks in, and application traffic takes a performance hit.

We should build a roachtest or benchmark to reproduce this type of behavior, so we can use it as a platform to test and develop fixes. One such fix was proposed in #82556; there is also a hypothesis that #82440 will help.

Jira issue: CRDB-16600

Epic CRDB-14607

@lunevalex lunevalex added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv Anything in KV that doesn't belong in a more specific category. T-kv KV Team labels Jun 10, 2022
@lunevalex lunevalex added C-investigation Further steps needed to qualify. C-label will change. and removed C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) labels Jun 10, 2022
irfansharif (Contributor) commented:

Here's what I've used to repro the effects of a large column backfill saturating available disk bandwidth and severely degrading p99s. Set up TPC-C with 1000 warehouses on a 3-node CRDB cluster and start the foreground workload (note: the shell snippets below use fish-style syntax, i.e. set for variables and (...) for command substitution):

set CLUSTER irfansharif-backfill
roachprod create -n 4 --gce-machine-type=n2-standard-8 $CLUSTER -l 24h
roachprod sync # fight annoying resolution issues
roachprod put $CLUSTER:1-4 ./cockroach ./cockroach 
roachprod start $CLUSTER:1-3
roachprod run $CLUSTER:1 -- ./cockroach workload fixtures import tpcc --warehouses=1000 --checks=false # takes ~6m
roachprod run $CLUSTER:4 -- ./cockroach workload run tpcc --warehouses=1000 --max-rate=450 --tolerate-errors (roachprod pgurl $CLUSTER:1-3)
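
(Not part of the original steps, just a hedged aside: while the workload runs, it helps to keep the DB Console open to watch SQL p99 latency and disk write throughput. Assuming the default insecure roachprod setup, the following prints the console URL.)

# Print the DB Console URL for node 1; the SQL service latency (p99) and
# disk write bytes charts are the ones to watch during the experiment.
roachprod adminurl $CLUSTER:1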

Manually lower the available disk bandwidth on all the nodes:

$ lsblk
NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0     7:0    0  55.4M  1 loop /snap/core18/2066
loop1     7:1    0 230.1M  1 loop /snap/google-cloud-sdk/184
loop2     7:2    0  67.6M  1 loop /snap/lxd/20326
loop3     7:3    0  32.1M  1 loop /snap/snapd/12057
sda       8:0    0    10G  0 disk
├─sda1    8:1    0   9.9G  0 part /
├─sda14   8:14   0     4M  0 part
└─sda15   8:15   0   106M  0 part /boot/efi
nvme0n1 259:0    0   375G  0 disk /mnt/data1

nvme0n1 is the block device cockroach is using here; its MAJ:MIN number is 259:0. With the cockroach process started by roachprod, you can tell the kernel to restrict the blkio write bandwidth using the following (40MiB/s == 41943040 bytes/s, a value chosen to be roughly sufficient for the foreground traffic without severely affecting p99s):

roachprod run $CLUSTER:1-3 'sudo bash -c \'echo "259:0  41943040" > /sys/fs/cgroup/blkio/system.slice/cockroach.service/blkio.throttle.write_bps_device\''
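
(A small addition of mine, not in the original repro: you can read the throttle file back to confirm the limit took effect. This assumes the cgroup v1 blkio layout shown above.)

# Should print "259:0 41943040" on each node once the limit is in place.
roachprod run $CLUSTER:1-3 'cat /sys/fs/cgroup/blkio/system.slice/cockroach.service/blkio.throttle.write_bps_device'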

I ran the above at 1:25 (before that, the bandwidth limit was 50MiB/s, which is why the p99 was already pretty high, though steady -- installing any bandwidth limit should increase the p99s):

[screenshot: SQL p99 latency around 1:25, when the 40MiB/s write-bandwidth limit was applied]

The write bandwidth chart should flatline at the specified limit if you're pushing higher. In the screenshot below it flatlines at 60MiB/s during an experiment where that was the limit and a large backfill was started at 00:34:

[screenshot: disk write bandwidth flatlining at 60MiB/s after a large backfill starts at 00:34]
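
(Another hedged addition: the flatline can also be confirmed at the device level by sampling iostat on the throttled disk, assuming the sysstat package is installed on the VMs.)

# wMB/s for nvme0n1 should hover at, and not exceed, the configured limit.
roachprod run $CLUSTER:1 'iostat -xm nvme0n1 1 5'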


Bump the backfill batch size to something large so the backfill generates large requests:

SET CLUSTER SETTING bulkio.column_backfill.batch_size = 200000;
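
(For completeness, and not from the original comment: the SQL statements here can be run through the cockroach CLI from any node, assuming the default insecure roachprod setup.)

# Bump the column backfill batch size cluster-wide.
roachprod run $CLUSTER:1 -- ./cockroach sql --insecure -e "SET CLUSTER SETTING bulkio.column_backfill.batch_size = 200000;"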

Kick off a backfill on tpcc.order_line (just a large table):

SET use_declarative_schema_changer = off;
ALTER TABLE tpcc.order_line ADD COLUMN junk STRING AS (ol_o_id::string) STORED;
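
(My addition, not part of the original steps: the schema-change job's progress can be polled to see how far along the backfill is, again assuming the insecure roachprod setup.)

# fraction_completed climbs as the backfill writes out the new column.
roachprod run $CLUSTER:1 -- ./cockroach sql --insecure -e "SELECT job_id, status, fraction_completed FROM [SHOW JOBS] WHERE job_type = 'SCHEMA CHANGE' ORDER BY created DESC LIMIT 1;"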

We should observe severely degraded p99s going forward, since we're hitting disk saturation:

[screenshot: SQL p99 latency severely degraded once the backfill starts]
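
(Cleanup, my addition: once done, the throttle can be lifted by writing a zero limit back to the same cgroup file; for cgroup v1 blkio throttling a value of 0 should mean "no limit". Restarting the cockroach service should also clear it, since the cgroup is recreated.)

# Remove the write-bandwidth cap on all nodes.
roachprod run $CLUSTER:1-3 'sudo bash -c "echo 259:0 0 > /sys/fs/cgroup/blkio/system.slice/cockroach.service/blkio.throttle.write_bps_device"'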

irfansharif (Contributor) commented:

(In the future when we develop better machinery around admission control to model disk bandwidth as a bottleneck resource, and improve latency isolation in the presence of large backfills, perhaps some of the above could serve as material for a roachtest.)

@irfansharif irfansharif reopened this Jul 5, 2022
@irfansharif irfansharif changed the title kv: build a reproduction for a column backfill impacting foreground traffic kv: roachtest-ify column backfills impacting foreground traffic Jul 5, 2022
@irfansharif irfansharif removed their assignment Jul 5, 2022
@irfansharif irfansharif changed the title kv: roachtest-ify column backfills impacting foreground traffic kv: build a reproduction for a column backfill impacting foreground traffic Jul 5, 2022