bulk: pre-split more before ingesting unsorted data #74826
Labels
C-enhancement
T-disaster-recovery
It has been observed that during a backfill or import, the split-as-we-go behavior used by the sst batcher, which only explicitly splits and scatters after it has sent some amount of data all to one range, does not send any splits in the steady state when ingesting out-of-order data: because the data is out of order, once it is chunked by destination range, no single chunk is large enough to be considered to have filled that range and thus to trigger a split. The range may still fill on its own, as many processors each send it small chunks that add up, but no single processor knows when that happens, so we simply leave it to the range to split itself once it is full, as it would under any other heavy-write-rate, non-bulk ingestion workload.
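As a rough illustration, here is a minimal sketch of that check; `maybeSplitAfterFlush`, `rangeFillThreshold`, and `splitAndScatter` are hypothetical names used for exposition, not the actual sst batcher API.

```go
// Hypothetical sketch of the split-as-we-go check described above: after
// flushing a chunk to one destination range, split and scatter the rest of
// that range only if this chunk alone was large enough to have mostly filled
// it. With out-of-order input, each per-range chunk stays small, so this
// branch is never taken even as the range fills from many processors at once.
package ingest

import "context"

func maybeSplitAfterFlush(
	ctx context.Context,
	chunkBytes int64, // bytes this processor just sent to one destination range
	rangeFillThreshold int64, // e.g. some fraction of the max range size
	splitKey []byte, // key just past the flushed chunk; the remainder is still empty
	splitAndScatter func(ctx context.Context, key []byte) error,
) error {
	if chunkBytes < rangeFillThreshold {
		// Typical for unsorted ingestion: the chunk is small, so no split is
		// sent and the range is left to split itself once it fills up.
		return nil
	}
	return splitAndScatter(ctx, splitKey)
}
```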
However, we've seen this lead to hotspots during bulk ingestion. Until the range becomes full and decides to split, it can be a bottleneck as many processors send it data at once, and the existing KV-driven load-based splitting mechanisms are not tuned for ingestion workloads. When the range does eventually split and rebalance, that is a more expensive operation: we now need to read, move, and re-ingest half of a full range that we just spent so much work sending data to and ingesting, all while it is still receiving more ingestion load.
Ideally, we'd prefer to split the span before we start ingesting, both to better spread ingestion load over our available capacity and to avoid the shuffling of already-ingested data that we incur by waiting to split and scatter until a range fills. By scattering first, we ingest directly to the right place from the start. A challenge in doing this, however, is that in many cases we don't know what data we'll be ingesting until we start reading input and producing it: we could be working off of a key-ordered CSV, or we could have a uniformly random distribution. In some cases, such as an index backfill, we may be able to use external information like SQL statistics or a sampling scan of the table to derive a good partitioning, but in others, like IMPORTs or view materialization, we have no way to know what the data will look like until we start producing it.
We can still do better than not pre-splitting at all, though: if the first buffer's worth of data was read out of sorted order, we can treat it as a representative sample of the data still to come, generate split points from it, and split and scatter the target spans at those points before proceeding to flush. If the buffer turns out not to be a good sample after all, we are no worse off than before, when we did not pre-split at all. If the input data is sorted, this step is unnecessary: we will simply split as we go, scattering the empty remainder after each range we fill before moving on to fill it, as we already do today.
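A minimal sketch of what this sampling-based pre-split could look like is below; `preSplitFromSample`, `kvPair`, and `splitAndScatter` are hypothetical names (the latter standing in for issuing admin split and scatter requests for a key), not an actual API.

```go
// Sketch of deriving pre-split points from the first buffer of unsorted KVs:
// sort a copy of the buffered keys, treat them as a sample of the data still
// to come, and split/scatter at evenly spaced keys before the first flush.
package ingest

import (
	"bytes"
	"context"
	"sort"
)

// kvPair is a simplified key/value pair as buffered by the batcher.
type kvPair struct {
	Key, Value []byte
}

// preSplitFromSample picks numSplits evenly spaced keys from the sorted
// sample and splits and scatters at each of them. If the sample is
// representative, each resulting span should then receive a roughly equal
// share of the remaining ingestion.
func preSplitFromSample(
	ctx context.Context,
	buf []kvPair,
	numSplits int,
	splitAndScatter func(ctx context.Context, key []byte) error,
) error {
	if len(buf) == 0 || numSplits <= 0 {
		return nil
	}
	keys := make([][]byte, len(buf))
	for i := range buf {
		keys[i] = buf[i].Key
	}
	sort.Slice(keys, func(i, j int) bool { return bytes.Compare(keys[i], keys[j]) < 0 })

	var prev []byte
	for i := 1; i <= numSplits; i++ {
		key := keys[i*len(keys)/(numSplits+1)]
		if bytes.Equal(key, prev) {
			continue // skip duplicate split points from a small or skewed sample
		}
		prev = key
		if err := splitAndScatter(ctx, key); err != nil {
			return err
		}
	}
	return nil
}
```

In this sketch the sampling step would run once, just before the first flush of an out-of-order buffer, after which flushing would proceed as it does today.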
Epic: CRDB-2340