libroach: fix excessive compactions performed by DBCompactRange #26355
Conversation
This needs to be back-ported to 2.0.x.
Ouch, good find. A simple unit test for the "partitioner" logic definitely would've caught this. Can you be bothered to add it now?
Now that we have C++ unit tests this should be easier to do. Let me take a look.
Fix excessive compactions from `DBCompactRange` due to mishandling of the first and last ranges to compact. When a non-empty start or end key is specified, DBCompactRange was previously calling `rocksdb::DB::CompactRange` with a `null` start/end key resulting in compacting from the beginning (or to the end) of the entire key space. See cockroachdb#24029
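As background, `rocksdb::DB::CompactRange` treats a null `begin`/`end` pointer as unbounded on that side. The sketch below only illustrates the distinction the fix relies on (pass null only when the key is actually empty); the helper name `CompactKeyRange` is hypothetical and this is not the libroach implementation.

```cpp
// Minimal sketch, assuming the stock RocksDB C++ API; CompactKeyRange is a
// hypothetical helper, not the libroach function.
#include <string>

#include <rocksdb/db.h>
#include <rocksdb/options.h>
#include <rocksdb/slice.h>

rocksdb::Status CompactKeyRange(rocksdb::DB* db, const std::string& start_key,
                                const std::string& end_key) {
  // An empty key means "unbounded" on that side. A non-empty key must be
  // forwarded as a real Slice; passing nullptr instead makes RocksDB compact
  // from the very beginning (or to the very end) of the key space, which is
  // the excessive-compaction behavior described above.
  rocksdb::Slice start_slice(start_key);
  rocksdb::Slice end_slice(end_key);
  const rocksdb::Slice* begin = start_key.empty() ? nullptr : &start_slice;
  const rocksdb::Slice* end = end_key.empty() ? nullptr : &end_slice;
  rocksdb::CompactRangeOptions options;
  return db->CompactRange(options, begin, end);
}
```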
Force-pushed from e8f8cbe to e2f099f.
Refactored so that the core logic can be tested. PTAL.
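For illustration, a unit test of partitioner-style logic could look roughly like the sketch below. Both the `PartitionCompactionSpan` helper and the test are hypothetical stand-ins, not the actual libroach code; the point is only that a non-empty start or end key must survive partitioning instead of being widened to the whole key space.

```cpp
// Hypothetical, self-contained sketch of a "partitioner" unit test (gtest).
#include <string>
#include <vector>

#include <gtest/gtest.h>

struct KeyRange {
  std::string start;  // empty = unbounded at the front
  std::string end;    // empty = unbounded at the back
};

// Partition [start, end) at every boundary key that lies strictly inside it.
std::vector<KeyRange> PartitionCompactionSpan(
    const std::string& start, const std::string& end,
    const std::vector<std::string>& boundaries) {
  std::vector<KeyRange> ranges;
  std::string cur = start;
  for (const std::string& b : boundaries) {
    if (!start.empty() && b <= start) continue;
    if (!end.empty() && b >= end) break;
    ranges.push_back({cur, b});
    cur = b;
  }
  ranges.push_back({cur, end});
  return ranges;
}

TEST(PartitionCompactionSpan, PreservesNonEmptyEndpoints) {
  const std::vector<std::string> boundaries = {"b", "d", "f"};
  const auto ranges = PartitionCompactionSpan("c", "e", boundaries);
  ASSERT_EQ(2u, ranges.size());
  // The first range must begin at "c", not at the unbounded start of the
  // key space, and the last must end at "e".
  EXPECT_EQ("c", ranges.front().start);
  EXPECT_EQ("d", ranges.front().end);
  EXPECT_EQ("d", ranges.back().start);
  EXPECT_EQ("e", ranges.back().end);
}
```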
LGTM
bors r=bdarnell,tschottdorf
Build failed (retrying...)
Build failed (retrying...)
24589: distsqlrun: don't allocate between fused processors r=jordanlewis a=jordanlewis

Previously, `ProcOutputHelper.ProcessRow` (and, by extension, all `RowSource.Next` implementations) always allocated a fresh `EncDatumRow`. This was wasteful - not every processor needs to be able to hold a reference to the output of `RowSource.Next`.

Now, `ProcessRow` never allocates a fresh `EncDatumRow`, and the contract of `RowSource.Next` has been changed to say that it's not valid to hang on to a row returned by `Next` past the next call to `Next`. Processors that need to hold on to a row from their upstreams have been modified to make an explicit copy to achieve this safely.

Finally, a new `copyingRowReceiver` is introduced that makes a copy of every row that is `Push`'d to it. A `copyingRowReceiver` is inserted before every router, since routers all expect that their inputs will be immutable. This preserves the safety of sending outputs of `RowSource.Next`, which aren't safe to hold on to, to routers, which expect rows that *are* safe to hold on to.

Release note: None

Fixes #22462.
Fixes #24452.

26355: libroach: fix excessive compactions performed by DBCompactRange r=bdarnell,tschottdorf a=petermattis

Fix excessive compactions from `DBCompactRange` due to mishandling of the first and last ranges to compact. When a non-empty start or end key is specified, DBCompactRange was previously calling `rocksdb::DB::CompactRange` with a `null` start/end key, resulting in compacting from the beginning (or to the end) of the entire key space.

See #24029

Co-authored-by: Jordan Lewis <[email protected]>
Co-authored-by: Peter Mattis <[email protected]>
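The contract change described for `RowSource.Next` (a returned row is only valid until the following call) is a common reused-buffer pattern. Below is a minimal, self-contained C++ sketch of that pattern for illustration only; it is not CockroachDB code, and all names in it are hypothetical.

```cpp
// Sketch of a reused-buffer contract: Next() returns a pointer to an internal
// row that is overwritten by the following call, so a consumer that wants to
// keep a row past that point must copy it explicitly.
#include <iostream>
#include <string>
#include <vector>

class RowSource {
 public:
  // Returns nullptr when the source is exhausted. The returned row is only
  // valid until the next call to Next().
  const std::vector<std::string>* Next() {
    if (i_ >= 3) return nullptr;
    row_ = {"row", std::to_string(i_++)};  // overwrite the shared buffer
    return &row_;
  }

 private:
  std::vector<std::string> row_;
  int i_ = 0;
};

int main() {
  RowSource src;
  std::vector<std::vector<std::string>> retained;
  while (const auto* row = src.Next()) {
    // Copy before the next call to Next(); holding the pointer would alias
    // a buffer that is about to be overwritten.
    retained.push_back(*row);
  }
  std::cout << "retained " << retained.size() << " rows\n";
  return 0;
}
```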
Build succeeded
26403: release-2.0: libroach: fix excessive compactions performed by DBCompactRange r=bdarnell,benesch a=petermattis

Backport 1/1 commits from #26355.

/cc @cockroachdb/release

---

Fix excessive compactions from `DBCompactRange` due to mishandling of the first and last ranges to compact. When a non-empty start or end key is specified, DBCompactRange was previously calling `rocksdb::DB::CompactRange` with a `null` start/end key, resulting in compacting from the beginning (or to the end) of the entire key space.

See #24029

Co-authored-by: Peter Mattis <[email protected]>
Now that DBCompactRange no longer attempts to compact the entire database (cockroachdb#26355), serializing ClearRange requests is enough to prevent a large DROP TABLE from bricking a cluster. They're slow enough that the compaction queue can keep up and purge range deletion tombstones before enough pile up to wedge the cluster. TODO(benesch): verify whether we actually need to turn down the compaction queue threshold from 2m. This is a partial fix for cockroachdb#24029.
Now that DBCompactRange no longer attempts to compact the entire database (cockroachdb#26355), sending ClearRange requests in sequential batches of 50 is enough to prevent a large DROP TABLE from bricking a cluster. They're slow enough that the compaction queue can keep up and purge range deletion tombstones before enough pile up to wedge the cluster. This is a partial fix for cockroachdb#24029. Release note (bug fix): The performance impact of dropping a large table has been substantially reduced.
26615: release-2.0: sql,compactor: rate limit clear range requests r=bdarnell,petermattis a=benesch

Backports #26449. I'm running a clearrange roachtest with this diff applied tonight. If it passes we're good to go.

```diff
diff --git a/pkg/cmd/roachtest/clearrange.go b/pkg/cmd/roachtest/clearrange.go
index ea5bcdff8..2b244af6d 100644
--- a/pkg/cmd/roachtest/clearrange.go
+++ b/pkg/cmd/roachtest/clearrange.go
@@ -30,19 +30,9 @@ func registerClearRange(r *registry) {
   // thoroughly brick the cluster.
   Stable: false,
   Run: func(ctx context.Context, t *test, c *cluster) {
-   t.Status(`downloading store dumps`)
-   // Created via:
-   // roachtest --cockroach cockroach-v2.0.1 store-gen --stores=10 bank \
-   //   --payload-bytes=10240 --ranges=0 --rows=65104166
-   fixtureURL := `gs://cockroach-fixtures/workload/bank/version=1.0.0,payload-bytes=10240,ranges=0,rows=65104166,seed=1`
-   location := storeDirURL(fixtureURL, c.nodes, "2.0")
+   t.Status(`waiting for compactions to disappear`)
+   time.Sleep(90 * time.Minute)

-   // Download this store dump, which measures around 2TB (across all nodes).
-   if err := downloadStoreDumps(ctx, c, location, c.nodes); err != nil {
-     t.Fatal(err)
-   }
-
-   c.Put(ctx, cockroach, "./cockroach")
    c.Start(ctx)

    // Also restore a much smaller table. We'll use it to run queries against
@@ -81,7 +71,7 @@ func registerClearRange(r *registry) {
    // above didn't brick the cluster.
    //
    // Don't lower this number, or the test may pass erroneously.
-   const minutes = 60
+   const minutes = 120
    t.WorkerStatus("repeatedly running COUNT(*) on small table")
    for i := 0; i < minutes; i++ {
      after := time.After(time.Minute)
```

---

Now that DBCompactRange no longer attempts to compact the entire database (#26355), sending ClearRange requests in sequential batches of 50 is enough to prevent a large DROP TABLE from bricking a cluster. They're slow enough that the compaction queue can keep up and purge range deletion tombstones before enough pile up to wedge the cluster.

This is a partial fix for #24029.

Release note (bug fix): The performance impact of dropping a large table has been substantially reduced.

Co-authored-by: Nikhil Benesch <[email protected]>
26449: sql,compactor: rate limit clear range requests r=petermattis,bdarnell a=benesch

Now that DBCompactRange no longer attempts to compact the entire database (#26355), serializing ClearRange requests is enough to prevent a large DROP TABLE from bricking a cluster. They're slow enough that the compaction queue can keep up and purge range deletion tombstones before enough pile up to wedge the cluster.

This hopefully supersedes #26372. It is much more surgical in approach.

TODO(benesch): verify whether we actually need to turn down the compaction queue threshold from 2m.

This is a partial fix for #24029.

Here's a screenshot of this PR in action:

![screen shot 2018-06-06 at 12 38 17 am](https://user-images.githubusercontent.com/882976/41017238-125718f4-6922-11e8-99f6-dc870618f446.png)

Note that this screenshot was generated from a slightly older version of this patch. I'm rerunning tonight with exactly this patch to verify that I didn't break anything.

Co-authored-by: Nikhil Benesch <[email protected]>
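As a rough illustration of the batching idea in these PRs, the hypothetical sketch below processes requests in fixed-size sequential batches; it is not the CockroachDB compactor or SQL code, and all names are invented.

```cpp
// Hypothetical sketch of fixed-size sequential batching. Each batch finishes
// before the next starts, so downstream cleanup (e.g. compactions purging
// range deletion tombstones) can keep up between batches. batch_size is
// assumed to be > 0.
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

void ProcessInBatches(
    const std::vector<std::string>& requests, std::size_t batch_size,
    const std::function<void(const std::vector<std::string>&)>& send_batch) {
  for (std::size_t i = 0; i < requests.size(); i += batch_size) {
    const std::size_t end = std::min(requests.size(), i + batch_size);
    // Copy the slice [i, end) into its own batch and send it synchronously.
    std::vector<std::string> batch(requests.begin() + i, requests.begin() + end);
    send_batch(batch);
  }
}
```

With a batch size of 50, as in the description above, only one batch of requests is in flight at a time, which bounds how much cleanup work can pile up.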
Fix excessive compactions from `DBCompactRange` due to mishandling of the first and last ranges to compact. When a non-empty start or end key is specified, `DBCompactRange` was previously calling `rocksdb::DB::CompactRange` with a `null` start/end key, resulting in compacting from the beginning (or to the end) of the entire key space.

See #24029