Simplify getCandidateCellsForSweeping() #2439

gbonik · 2017-10-05T18:51:58Z

Initially I hoped to take advantage of in-database filtering if we get
to implement the "transaction table sweeping" feature. However, seems
like that wasn't a great decision on my part - unclear when (and if) we
get that feature done, and also how much of improvement we would
actually get. The extra logic makes the code significantly more complex,
so I think we need to back off and give up the idea.

This doesn't invalidate the sweep rewrite project. The new impls still
bring significant performance improvements.

This change is

Initially I hoped to take advantage of in-database filtering if we get to implement the "transaction table sweeping" feature. However, seems like that wasn't a great decision on my part - unclear when (and if) we get that feature done, and also how much of improvement we would actually get. The extra logic makes the code significantly more complex, so I think we need to back off and give up the idea. This doesn't invalidate the sweep rewrite project. The new impls still bring significant performance improvements.

gsheasby

Looks fine to me (just a few nits) - @hsaraogi will also take a look today.

gsheasby · 2017-10-09T14:28:51Z

...lient/src/main/java/com/palantir/atlasdb/keyvalue/impl/GetCandidateCellsForSweepingShim.java

                        candidateBatch.add(ImmutableCandidateCellForSweeping.builder()
                                .cell(cell)
-                                .sortedTimestamps(candidate ? timestampArr : EMPTY_LONG_ARRAY)
+                                .sortedTimestamps(timestampArr)


I had to check if we should still be returning EMPTY_LONG_ARRAY if there is just one timestamp - I see that case excluded at SweepableCellFilter.getCellToSweep, so looks ok to me.

Even if there is one timestamp, you still might need to sweep it (if that timestamp is uncommitted)

gsheasby · 2017-10-09T14:37:48Z

...c/main/java/com/palantir/atlasdb/keyvalue/impl/AbstractGetCandidateCellsForSweepingTest.java

-        assertThat(getAllCandidates(conservativeRequest(PtBytes.EMPTY_BYTE_ARRAY, 40L, 30L))).isEmpty();
+    public void singleCellSpanningSeveralPages() {
+        new TestDataBuilder()
+                .put(10, 1, 1000)


nit: 1000L (and similar below)

gsheasby · 2017-10-09T14:38:11Z

...c/main/java/com/palantir/atlasdb/keyvalue/impl/AbstractGetCandidateCellsForSweepingTest.java

@@ -125,7 +120,7 @@ public void returnCellsInOrder() {
                .putEmpty(3, 2, 10L)
                .putEmpty(3, 3, 10L)
                .store();
-        assertThat(getAllCandidates(conservativeRequest(PtBytes.EMPTY_BYTE_ARRAY, 30L, 5L))
+        assertThat(getAllCandidates(conservativeRequest(PtBytes.EMPTY_BYTE_ARRAY, 30L, 100))


nit: 100L (repeated on line 138 below)

that argument is actually an int (the batch size)

hsaraogi · 2017-10-09T17:42:50Z

.../com/palantir/atlasdb/keyvalue/dbkvs/impl/postgres/PostgresGetCandidateCellsForSweeping.java

+                        + "    FROM ("
+                        + "      SELECT row_name, col_name, ts"
+                        + "      FROM " + prefixedTableName
+                        + "      WHERE ts < ? " + bounds.predicates + getIgnoredTimestampPredicate()


should thorough have the getIgnoredTimestampPredicate? This brings us to the question if shouldCheckIfLatestValueIsEmpty and timestampsToIgnore should be two separate values or grouped as SweepStrategyParams (maybe an enum with types CONSERVATIVE/ThOROUGH).

timestampsToIgnore will be -1 for Conservative and empty for THOROUGH and we can add binds in the conservative query as well..

We can discuss what the API should be (and in fact we could have discussed this a few months back when I introduced it), but in this PR I'm trying to keep changes minimal.

This is a question of the level of abstraction. In the current design, KVS doesn't really know anything about sweep strategies; we just tell it whether to ignore timestamps and/or load empty values. In this design, the two concepts are orthogonal, and that's why you see getIgnoredTimestampPredicate() in both "if" branches. We could change the design and make KVS aware of sweep strategies, but that's beyond the scope of this pull request.

I am trying to point out that shouldCheckIfLatestValueIsEmpty and timestampsToIgnore are not independent concepts but are correlated based on the sweep strategy, seems like a bad design to allow many more cases here when there are exactly two possibilities.

Open to the idea of doing this in a separate PR, but I think allowing exactly two sets of values will simplify the queries here.

In general, I think that the KVS should not know about sweep - its one job is to make the appropriate calls to the DB (via the clientPool, in the C* case) in a correct and efficient way. It shouldn't have to know why the user wants to get these timestamps - +1 to keeping as-is.

However, the boundaries are blurred somewhat as we're specifically satisfying a getCandidateCellsForSweeping request, implying that it's reasonable to know something about what being a CandidateCellForSweeping might entail.

But also, the branching here is no more than it was before this PR, and I don't think it's an egregious combinatorial explosion.

On balance, I'd like this to merge as is - any refactor to try and cut down on the branching should be in a separate PR, and should also be a low priority, given the limited potential for simplification and the urgency of getting this + Oracle sweep merged in to develop.

Regarding removing branching - I want to benchmark a different query later that would eliminate this branching (for thorough, check values for all rows rather than only load the latest values). But of course I need to get the Oracle part done first - we can get back to this later.

hsaraogi · 2017-10-09T17:47:16Z

...ntir/atlasdb/keyvalue/dbkvs/impl/postgres/DbKvsPostgresGetCandidateCellsForSweepingTest.java

-    }
-
-    @Test
-    public void returnFirstAndLastCellOfThePage() {


This test still seems valuable, why did we get rid of this?

This tested implementation-specific behavior that doesn't exist anymore

codecov-io · 2017-10-09T19:15:52Z

Codecov Report

Merging #2439 into feature/sweep-rewrite will decrease coverage by 0.03%.
The diff coverage is 97.29%.

@@                     Coverage Diff                     @@
##             feature/sweep-rewrite    #2439      +/-   ##
===========================================================
- Coverage                    60.36%   60.32%   -0.04%     
- Complexity                    4470     4690     +220     
===========================================================
  Files                          843      843              
  Lines                        39797    39782      -15     
  Branches                      4096     4083      -13     
===========================================================
- Hits                         24023    24000      -23     
- Misses                       14284    14290       +6     
- Partials                      1490     1492       +2

Impacted Files	Coverage Δ	Complexity Δ
...va/com/palantir/atlasdb/sweep/SweepTaskRunner.java	`92.13% <ø> (-0.09%)`	`19 <0> (ø)`
...om/palantir/atlasdb/keyvalue/dbkvs/impl/DbKvs.java	`84.45% <ø> (-0.03%)`	`137 <0> (ø)`
...eep/CassandraGetCandidateCellsForSweepingImpl.java	`0% <0%> (ø)`	`0 <0> (ø)`	⬇️
...impl/AbstractGetCandidateCellsForSweepingTest.java	`97.69% <100%> (-0.7%)`	`21 <4> (-2)`
...eyvalue/impl/GetCandidateCellsForSweepingShim.java	`87.14% <100%> (+1.61%)`	`14 <0> (-9)`	⬇️
...postgres/PostgresGetCandidateCellsForSweeping.java	`97.93% <100%> (+0.82%)`	`2 <1> (-2)`	⬇️
...java/com/palantir/lock/impl/LockClientIndices.java	`85% <0%> (-10%)`	`5% <0%> (ø)`
...ain/java/com/palantir/leader/LeadershipEvents.java	`83.33% <0%> (-8.34%)`	`0% <0%> (ø)`
...rc/main/java/com/palantir/lock/LockCollection.java	`66.66% <0%> (-4.77%)`	`11% <0%> (-1%)`
...yvalue/dbkvs/impl/ConnectionManagerAwareDbKvs.java	`88.88% <0%> (-3.71%)`	`5% <0%> (-1%)`
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3f9b464...77b8c16. Read the comment docs.

@output

* The big sweep rewrite, part 3: Postgres impl (#2021) * The big sweep rewrite, part 3: Postgres impl - Implement KeyValueService.getCandidateCellsForSweeping() for Postgres - Add AgnosticResultRow.getBoolean() and LightAgnosticResultRow.getArray() * Address CR comments Also change the stopping condition in computeNext() * Add a release note * Add the ORDER BY that I forgot * Rewrite a fragile test * Join pages based on the number of (cell, ts) pairs * Move release note to develop * The big sweep rewrite, part 4: Cassandra impl (#2231) * The big sweep rewrite, part 4: Cassandra impl Iterate over cells in a table without ever loading more than a given number at once regardless of how wide the rows are, but still perform efficiently on narrow tables. This should finally allow us to run sweep safely on Cassandra with decent performance. * Address clockfort's comments + checkstyle + add release notes * Address code review comments * Split up test cases in SingleRowColumnPagerTest * Tests for KeyRanges * Update locks * Return Optional instead of passing in @output list * Update dependency locks * Fix TestDataBuilder after merge * Make tests green - Fix a small bug in CassandraGetCandidateCellsForSweepingImpl - Remove an irrelevant test (should_not_oom_when_...) and make CassKVSeSweepTaskRunnerIntegrationTest non-parameterized. (getRangeOfTimestamps() is not used in the sweep code path anymore) - Relax assertions in testSweepBatchesUpToDeleteBatchSize * Bring back the old variable name to make checkstyle happy * Run ./gradlew generateLock saveLock * Reapply changes * Fix compile errors * Simplify getCandidateCellsForSweeping() (#2439) * Simplify getCandidateCellsForSweeping() Initially I hoped to take advantage of in-database filtering if we get to implement the "transaction table sweeping" feature. However, seems like that wasn't a great decision on my part - unclear when (and if) we get that feature done, and also how much of improvement we would actually get. The extra logic makes the code significantly more complex, so I think we need to back off and give up the idea. This doesn't invalidate the sweep rewrite project. The new impls still bring significant performance improvements. * The big sweep rewrite, part 5: Oracle impl - Extract the paging logic from the Postgres impl into a separate class named CandidatePagingState so that we could reuse it in the Oracle impl (and also test the logic with unit tests) - Add a new benchmark fullTableScanOneWideRowThorough that simulates sweep of a very wide row. This is important for Oracle - a naive implementation would perform extremely poorly in that scenario. * A bunch of refactors - Make a builder for FullQuery - Rename RangeBoundPredicates -> RangePredicateHelper and make it append to a FullQuery.Builder instead of returning a string + args. - Make DbKvsGetCandidateCellsForSweeping a concrete class with common logic. Now, database-specific logic is behind the CellTsPairLoader interface. - Add full test coverage for CandidatePagingState * Tautological tests for RangePredicateHelper * Import the "correct" assertThat * Static classes * Refactors * CandidatePagingState: extract state-updating methods Also, reorganised class * Expand the comment in OracleCellTsPageLoaderFactory Also filter out empty pages * Fix release notes * Fix release notes for real

gsheasby self-assigned this Oct 9, 2017

gsheasby requested review from gsheasby and hsaraogi October 9, 2017 14:14

gsheasby reviewed Oct 9, 2017

View reviewed changes

hsaraogi suggested changes Oct 9, 2017

View reviewed changes

Int -> long

77b8c16

gsheasby approved these changes Oct 10, 2017

View reviewed changes

gbonik merged commit 801904a into feature/sweep-rewrite Oct 10, 2017

felixdesouza deleted the feature/simplify-get-candidates branch July 2, 2019 22:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplify getCandidateCellsForSweeping() #2439

Simplify getCandidateCellsForSweeping() #2439

gbonik commented Oct 5, 2017 •

edited by jboreiko

Loading

gsheasby left a comment

gsheasby Oct 9, 2017

gbonik Oct 9, 2017

gsheasby Oct 9, 2017

gbonik Oct 9, 2017

gsheasby Oct 9, 2017

gbonik Oct 9, 2017

hsaraogi Oct 9, 2017 •

edited

Loading

gbonik Oct 9, 2017

hsaraogi Oct 9, 2017 •

edited

Loading

gsheasby Oct 10, 2017 •

edited

Loading

gbonik Oct 10, 2017

hsaraogi Oct 9, 2017

gbonik Oct 9, 2017

codecov-io commented Oct 9, 2017

Simplify getCandidateCellsForSweeping() #2439

Simplify getCandidateCellsForSweeping() #2439

Conversation

gbonik commented Oct 5, 2017 • edited by jboreiko Loading

gsheasby left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hsaraogi Oct 9, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

hsaraogi Oct 9, 2017 • edited Loading

Choose a reason for hiding this comment

gsheasby Oct 10, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov-io commented Oct 9, 2017

Codecov Report

gbonik commented Oct 5, 2017 •

edited by jboreiko

Loading

hsaraogi Oct 9, 2017 •

edited

Loading

hsaraogi Oct 9, 2017 •

edited

Loading

gsheasby Oct 10, 2017 •

edited

Loading