
libroach: fix excessive compactions performed by DBCompactRange #26355

Merged

Conversation

@petermattis (Collaborator)

Fix excessive compactions from `DBCompactRange` due to mishandling of
the first and last ranges to compact. When a non-empty start or end key
is specified, `DBCompactRange` was previously calling
`rocksdb::DB::CompactRange` with a `null` start/end key, resulting in
compacting from the beginning (or to the end) of the entire key space.

See #24029
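
For illustration, a minimal C++ sketch of the corrected key handling (a hypothetical wrapper, not the actual libroach code): `rocksdb::DB::CompactRange` takes `rocksdb::Slice` pointers where `nullptr` means "unbounded" on that side, so a non-empty key must be passed as a pointer to a slice rather than left null.

```cpp
// Hedged sketch, not the actual libroach code: shows the intended
// mapping from (possibly empty) start/end keys to the pointer
// arguments of rocksdb::DB::CompactRange, where nullptr means
// "unbounded" on that side.
#include <string>

#include "rocksdb/db.h"
#include "rocksdb/options.h"
#include "rocksdb/slice.h"

rocksdb::Status CompactRangeSketch(rocksdb::DB* db, const std::string& start,
                                   const std::string& end) {
  rocksdb::Slice start_slice(start);
  rocksdb::Slice end_slice(end);
  // The bug: passing nullptr even when a key was specified, which
  // compacted from the beginning (or to the end) of the key space.
  // The fix: only pass nullptr when the key is actually empty.
  const rocksdb::Slice* begin = start.empty() ? nullptr : &start_slice;
  const rocksdb::Slice* finish = end.empty() ? nullptr : &end_slice;
  return db->CompactRange(rocksdb::CompactRangeOptions(), begin, finish);
}
```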

@petermattis requested a review from a team on June 4, 2018 13:57
@cockroach-teamcity (Member)

This change is Reviewable

@petermattis requested review from bdarnell and tbg on June 4, 2018 13:57
@petermattis (Collaborator, Author)

This needs to be back-ported to 2.0.x.

@tbg (Member)

tbg commented Jun 4, 2018

Ouch, good find. Definitely a simple unit test for the "partitioner" logic would've caught this. Can you be bothered to add it now?


Review status: all files reviewed at latest revision, all discussions resolved, some commit checks pending.


Comments from Reviewable

@petermattis (Collaborator, Author)

> Ouch, good find. Definitely a simple unit test for the "partitioner" logic would've caught this. Can you be bothered to add it now?

Now that we have C++ unit tests this should be easier to do. Let me take a look.


Review status: all files reviewed at latest revision, all discussions resolved, all commit checks successful.


Comments from Reviewable
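
For readers following along, a minimal gtest-style sketch of the kind of unit test being discussed. The `PartitionCompaction` helper and its signature are hypothetical stand-ins for the partitioner logic, which was refactored for testability in the next revision:

```cpp
#include <gtest/gtest.h>

#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the partitioner logic: split [start, end)
// at the given boundaries, keeping the caller-supplied keys on the
// first and last pieces instead of widening them to the key space.
std::vector<std::pair<std::string, std::string>> PartitionCompaction(
    const std::string& start, const std::string& end,
    const std::vector<std::string>& boundaries) {
  std::vector<std::pair<std::string, std::string>> parts;
  std::string prev = start;
  for (const auto& b : boundaries) {
    parts.emplace_back(prev, b);
    prev = b;
  }
  parts.emplace_back(prev, end);
  return parts;
}

TEST(CompactRange, KeepsExplicitBounds) {
  auto parts = PartitionCompaction("b", "y", {"m"});
  ASSERT_EQ(2u, parts.size());
  // The first piece must start at "b" and the last must end at "y";
  // the bug this PR fixes would have produced unbounded ends here.
  EXPECT_EQ("b", parts.front().first);
  EXPECT_EQ("y", parts.back().second);
}
```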

@bdarnell (Contributor)

bdarnell commented Jun 4, 2018

:lgtm:


Review status: all files reviewed at latest revision, all discussions resolved, all commit checks successful.


Comments from Reviewable

Fix excessive compactions from `DBCompactRange` due to mishandling of
the first and last ranges to compact. When a non-empty start or end key
is specified, `DBCompactRange` was previously calling
`rocksdb::DB::CompactRange` with a `null` start/end key, resulting in
compacting from the beginning (or to the end) of the entire key space.

See cockroachdb#24029
@petermattis force-pushed the pmattis/libroach-compact-range branch from e8f8cbe to e2f099f on June 4, 2018 17:54
@petermattis (Collaborator, Author)

Refactored so that the core logic can be tested. PTAL.


Review status: 0 of 3 files reviewed at latest revision, all discussions resolved, all commit checks successful.


Comments from Reviewable

@bdarnell (Contributor)

bdarnell commented Jun 4, 2018

LGTM


Review status: 0 of 3 files reviewed at latest revision, all discussions resolved, all commit checks successful.


Comments from Reviewable

@petermattis (Collaborator, Author)

bors r=bdarnell,tschottdorf

@craig (Contributor)

craig bot commented Jun 4, 2018

Build failed (retrying...)

@craig (Contributor)

craig bot commented Jun 4, 2018

Build failed (retrying...)

craig bot pushed a commit that referenced this pull request Jun 4, 2018
24589: distsqlrun: don't allocate between fused processors r=jordanlewis a=jordanlewis

distsqlrun: don't allocate between fused processors

Previously, `ProcOutputHelper.ProcessRow` (and, by extension, all
`RowSource.Next` implementations) always allocated a fresh
`EncDatumRow`. This was wasteful - not every processor needs to be able
to hold a reference to the output of `RowSource.Next`.

Now, `ProcessRow` never allocates a fresh `EncDatumRow`, and the
contract of `RowSource.Next` has been changed to say that it's not valid
to hang on to a row returned by `Next` past the next call to `Next`.
Processors that need to hold on to a row from their upstreams have been
modified to make an explicit copy to achieve this safely.

Finally, a new `copyingRowReceiver` is introduced that makes a copy of
every row that is `Push`'d to it. A `copyingRowReceiver` is inserted
before every router, since routers all expect that their inputs will be
immutable. This preserves the safety of sending outputs of
`RowSource.Next`, which aren't safe to hold on to, to routers, which
expect rows that *are* safe to hold on to.

Release note: None

Fixes #22462.
Fixes #24452.

26355: libroach: fix excessive compactions performed by DBCompactRange r=bdarnell,tschottdorf a=petermattis

Fix excessive compactions from `DBCompactRange` due to mishandling of
the first and last ranges to compact. When a non-empty start or end key
is specified, `DBCompactRange` was previously calling
`rocksdb::DB::CompactRange` with a `null` start/end key, resulting in
compacting from the beginning (or to the end) of the entire key space.

See #24029

Co-authored-by: Jordan Lewis <[email protected]>
Co-authored-by: Peter Mattis <[email protected]>
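
An aside on the `RowSource.Next` contract described in the distsqlrun change above: it is a classic buffer-reuse pattern. A hedged C++ analog (the actual change is in CockroachDB's Go code; all names here are illustrative only):

```cpp
#include <iostream>
#include <string>
#include <string_view>
#include <vector>

// Illustrative analog of the contract above: Next() returns a view
// into a buffer that is reused on every call, so the result is only
// valid until the next call. Consumers that retain a row must copy.
class RowSource {
 public:
  std::string_view Next() {
    buf_ = "row-" + std::to_string(n_++);  // reuse buf_, no fresh allocation
    return buf_;
  }

 private:
  std::string buf_;
  int n_ = 0;
};

int main() {
  RowSource src;
  std::vector<std::string> retained;
  for (int i = 0; i < 3; ++i) {
    // emplace_back makes an explicit copy, mirroring the processors
    // that were modified to copy rows they need to hold on to.
    retained.emplace_back(src.Next());
  }
  for (const auto& r : retained) std::cout << r << "\n";
}
```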
@craig (Contributor)

craig bot commented Jun 4, 2018

Build succeeded

@craig (bot) merged commit e2f099f into cockroachdb:master on Jun 4, 2018
@petermattis deleted the pmattis/libroach-compact-range branch on June 4, 2018 20:48
craig bot pushed a commit that referenced this pull request Jun 4, 2018
26403: release-2.0: libroach: fix excessive compactions performed by DBCompactRange r=bdarnell,benesch a=petermattis

Backport 1/1 commits from #26355.

/cc @cockroachdb/release

---

Fix excessive compactions from `DBCompactRange` due to mishandling of
the first and last ranges to compact. When a non-empty start or end key
is specified, `DBCompactRange` was previously calling
`rocksdb::DB::CompactRange` with a `null` start/end key, resulting in
compacting from the beginning (or to the end) of the entire key space.

See #24029


Co-authored-by: Peter Mattis <[email protected]>
benesch added a commit to benesch/cockroach that referenced this pull request Jun 6, 2018
Now that DBCompactRange no longer attempts to compact the entire
database (cockroachdb#26355), serializing ClearRange requests is enough to prevent
a large DROP TABLE from bricking a cluster. They're slow enough that the
compaction queue can keep up and purge range deletion tombstones before
enough pile up to wedge the cluster.

TODO(benesch): verify whether we actually need to turn down the compaction
queue threshold from 2m.

This is a partial fix for cockroachdb#24029.
benesch added a commit to benesch/cockroach that referenced this pull request Jun 6, 2018
Now that DBCompactRange no longer attempts to compact the entire
database (cockroachdb#26355), sending ClearRange requests in sequential batches of
50 is enough to prevent a large DROP TABLE from bricking a cluster.
They're slow enough that the compaction queue can keep up and purge
range deletion tombstones before enough pile up to wedge the cluster.

This is a partial fix for cockroachdb#24029.

Release note (bug fix): The performance impact of dropping a large table
has been substantially reduced.
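
The batching idea described in the commit message above is language-agnostic; for concreteness, a hedged C++ sketch (the real implementation is Go code in CockroachDB, and `ClearInBatches`/`clearRange` are hypothetical names):

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hedged sketch of the rate-limiting idea, not CockroachDB's code:
// issue ClearRange-style requests in small sequential batches so a
// background compaction queue can purge range deletion tombstones
// between batches instead of letting them pile up.
void ClearInBatches(const std::vector<std::string>& ranges,
                    const std::function<void(const std::string&)>& clearRange,
                    std::size_t batch_size = 50) {
  for (std::size_t i = 0; i < ranges.size(); i += batch_size) {
    const std::size_t end = std::min(i + batch_size, ranges.size());
    // Each batch runs to completion before the next starts, bounding
    // the number of outstanding tombstones at any one time.
    for (std::size_t j = i; j < end; ++j) {
      clearRange(ranges[j]);
    }
  }
}
```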
petermattis pushed a commit to petermattis/cockroach that referenced this pull request Jun 11, 2018
Now that DBCompactRange no longer attempts to compact the entire
database (cockroachdb#26355), sending ClearRange requests in sequential batches of
50 is enough to prevent a large DROP TABLE from bricking a cluster.
They're slow enough that the compaction queue can keep up and purge
range deletion tombstones before enough pile up to wedge the cluster.

This is a partial fix for cockroachdb#24029.

Release note (bug fix): The performance impact of dropping a large table
has been substantially reduced.
benesch added a commit to benesch/cockroach that referenced this pull request Jun 12, 2018
Now that DBCompactRange no longer attempts to compact the entire
database (cockroachdb#26355), sending ClearRange requests in sequential batches of
50 is enough to prevent a large DROP TABLE from bricking a cluster.
They're slow enough that the compaction queue can keep up and purge
range deletion tombstones before enough pile up to wedge the cluster.

This is a partial fix for cockroachdb#24029.

Release note (bug fix): The performance impact of dropping a large table
has been substantially reduced.
benesch added a commit to benesch/cockroach that referenced this pull request Jun 13, 2018
Now that DBCompactRange no longer attempts to compact the entire
database (cockroachdb#26355), sending ClearRange requests in sequential batches of
50 is enough to prevent a large DROP TABLE from bricking a cluster.
They're slow enough that the compaction queue can keep up and purge
range deletion tombstones before enough pile up to wedge the cluster.

This is a partial fix for cockroachdb#24029.

Release note (bug fix): The performance impact of dropping a large table
has been substantially reduced.
craig bot pushed a commit that referenced this pull request Jun 13, 2018
26615: release-2.0: sql,compactor: rate limit clear range requests r=bdarnell,petermattis a=benesch

Backports #26449.

I'm running a clearrange roachtest with this diff applied tonight. If it passes we're good to go.

```diff
diff --git a/pkg/cmd/roachtest/clearrange.go b/pkg/cmd/roachtest/clearrange.go
index ea5bcdff8..2b244af6d 100644
--- a/pkg/cmd/roachtest/clearrange.go
+++ b/pkg/cmd/roachtest/clearrange.go
@@ -30,19 +30,9 @@ func registerClearRange(r *registry) {
 		// thoroughly brick the cluster.
 		Stable: false,
 		Run: func(ctx context.Context, t *test, c *cluster) {
-			t.Status(`downloading store dumps`)
-			// Created via:
-			// roachtest --cockroach cockroach-v2.0.1 store-gen --stores=10 bank \
-			//           --payload-bytes=10240 --ranges=0 --rows=65104166
-			fixtureURL := `gs://cockroach-fixtures/workload/bank/version=1.0.0,payload-bytes=10240,ranges=0,rows=65104166,seed=1`
-			location := storeDirURL(fixtureURL, c.nodes, "2.0")
+			t.Status(`waiting for compactions to disappear`)
+			time.Sleep(90 * time.Minute)
 
-			// Download this store dump, which measures around 2TB (across all nodes).
-			if err := downloadStoreDumps(ctx, c, location, c.nodes); err != nil {
-				t.Fatal(err)
-			}
-
-			c.Put(ctx, cockroach, "./cockroach")
 			c.Start(ctx)
 
 			// Also restore a much smaller table. We'll use it to run queries against
@@ -81,7 +71,7 @@ func registerClearRange(r *registry) {
 				// above didn't brick the cluster.
 				//
 				// Don't lower this number, or the test may pass erroneously.
-				const minutes = 60
+				const minutes = 120
 				t.WorkerStatus("repeatedly running COUNT(*) on small table")
 				for i := 0; i < minutes; i++ {
 					after := time.After(time.Minute)

```

---

Now that DBCompactRange no longer attempts to compact the entire
database (#26355), sending ClearRange requests in sequential batches of
50 is enough to prevent a large DROP TABLE from bricking a cluster.
They're slow enough that the compaction queue can keep up and purge
range deletion tombstones before enough pile up to wedge the cluster.

This is a partial fix for #24029.

Release note (bug fix): The performance impact of dropping a large table
has been substantially reduced.

Co-authored-by: Nikhil Benesch <[email protected]>
craig bot pushed a commit that referenced this pull request Jun 13, 2018
26449: sql,compactor: rate limit clear range requests r=petermattis,bdarnell a=benesch

Now that DBCompactRange no longer attempts to compact the entire
database (#26355), serializing ClearRange requests is enough to prevent
a large DROP TABLE from bricking a cluster. They're slow enough that the
compaction queue can keep up and purge range deletion tombstones before
enough pile up to wedge the cluster.

This hopefully supersedes #26372. It is much more surgical in approach.

TODO(benesch): verify whether we actually need to turn down the compaction
queue threshold from 2m.

This is a partial fix for #24029.

Here's a screenshot of this PR in action:

<img width="986" alt="screen shot 2018-06-06 at 12 38 17 am" src="https://user-images.githubusercontent.com/882976/41017238-125718f4-6922-11e8-99f6-dc870618f446.png">

Note that this screenshot was generated from a slightly older version of this patch. I'm rerunning tonight with exactly this patch to verify that I didn't break anything.

Co-authored-by: Nikhil Benesch <[email protected]>