This repository has been archived by the owner on Nov 14, 2024. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 15
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
* The big sweep rewrite, part 3: Postgres impl (#2021) * The big sweep rewrite, part 3: Postgres impl - Implement KeyValueService.getCandidateCellsForSweeping() for Postgres - Add AgnosticResultRow.getBoolean() and LightAgnosticResultRow.getArray() * Address CR comments Also change the stopping condition in computeNext() * Add a release note * Add the ORDER BY that I forgot * Rewrite a fragile test * Join pages based on the number of (cell, ts) pairs * Move release note to develop * The big sweep rewrite, part 4: Cassandra impl (#2231) * The big sweep rewrite, part 4: Cassandra impl Iterate over cells in a table without ever loading more than a given number at once regardless of how wide the rows are, but still perform efficiently on narrow tables. This should finally allow us to run sweep safely on Cassandra with decent performance. * Address clockfort's comments + checkstyle + add release notes * Address code review comments * Split up test cases in SingleRowColumnPagerTest * Tests for KeyRanges * Update locks * Return Optional instead of passing in @output list * Update dependency locks * Fix TestDataBuilder after merge * Make tests green - Fix a small bug in CassandraGetCandidateCellsForSweepingImpl - Remove an irrelevant test (should_not_oom_when_...) and make CassKVSeSweepTaskRunnerIntegrationTest non-parameterized. (getRangeOfTimestamps() is not used in the sweep code path anymore) - Relax assertions in testSweepBatchesUpToDeleteBatchSize * Bring back the old variable name to make checkstyle happy * Run ./gradlew generateLock saveLock * Reapply changes * Fix compile errors * Simplify getCandidateCellsForSweeping() (#2439) * Simplify getCandidateCellsForSweeping() Initially I hoped to take advantage of in-database filtering if we get to implement the "transaction table sweeping" feature. However, seems like that wasn't a great decision on my part - unclear when (and if) we get that feature done, and also how much of improvement we would actually get. The extra logic makes the code significantly more complex, so I think we need to back off and give up the idea. This doesn't invalidate the sweep rewrite project. The new impls still bring significant performance improvements. * The big sweep rewrite, part 5: Oracle impl - Extract the paging logic from the Postgres impl into a separate class named CandidatePagingState so that we could reuse it in the Oracle impl (and also test the logic with unit tests) - Add a new benchmark fullTableScanOneWideRowThorough that simulates sweep of a very wide row. This is important for Oracle - a naive implementation would perform extremely poorly in that scenario. * A bunch of refactors - Make a builder for FullQuery - Rename RangeBoundPredicates -> RangePredicateHelper and make it append to a FullQuery.Builder instead of returning a string + args. - Make DbKvsGetCandidateCellsForSweeping a concrete class with common logic. Now, database-specific logic is behind the CellTsPairLoader interface. - Add full test coverage for CandidatePagingState * Tautological tests for RangePredicateHelper * Import the "correct" assertThat * Static classes * Refactors * CandidatePagingState: extract state-updating methods Also, reorganised class * Expand the comment in OracleCellTsPageLoaderFactory Also filter out empty pages * Fix release notes * Fix release notes for real
* Exclude PostgresDbPersistenceTimelockTestSuite from ete tests * Rename test as the regex is case sensitive [no release notes]
* Better logging for internal customer * Update docs * Copy over #2444
* change version number * Clarify 0.60.0 release * Delete stray line * Touch up release notes
* sweep full table * fix tests * logsafe * release notes * pr comments * docs * unnecessary method call * checkstyle * fix build * fix build * javadocs * Random logging bits
* DbKvs sweep refactors to address last PR comments As dxiao pointed out, CandidatePagingState does too many things. The solution is to make CellTsPairLoader return an iterator and move all paging logic inside that iterator, and remove CellTsPairLoaderFactory. Then we replace CandidatePagingState with CandidateGroupingIterator, which now doesn't have any paging logic and only takes care of grouping (cell, ts) pairs by cell. Also remove CandidatePageJoiningIterator because it's not used anymore. We could have done this earlier but I forgot. Plan for the future: - Since we gave up on the in-database filtering idea, we can replace KVS.getCandidateCellsForSweeping() with a simpler call like getCellTsPairs() which would do what CellTsPairLoader does now. - Implement the new call for the remaining KVS's. We need at least the InMemoryKVS. We should decide the destiny of JdbcKVS and CqlKVS. - Remove all remaining usages of getRangeOfTimestamps(). I think that's basically deleteRange() on Cassandra. - Remove KVS.getRangeOfTimestamps()! [no release notes] * Renames * Create OracleCellTsPageLoader.PageIterator.Token class * Create PostgresCellTsPageLaoder.PageIterator.Token * Fix Precondition having-next-ness is a state, not an argument * One Token To Rull Them All Well... two, actually. This one is a CellTsPairToken. We already had a Token class in DbKvs :-/ * Make CellTsPairToken final; replace modification with factory methods * Have computeNextStartPosition return a token * Remove the TODO I don't think we can do it nicely * empty commit * Nits * Move duplicated precondition to CellTsPairToken and add error message * Renames * Make CellTsPairToken immutable * Sort timestamps array, just in case
Increase Cell Ts pairs to 1024
This was referenced Aug 30, 2022
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Goals (and why):
Implementation Description (bullets):
Concerns (what feedback would you like?):
Where should we start reviewing?:
Priority (whenever / two weeks / yesterday):
This change is