sql: introduce MVCC-compliant index backfiller #73878
Conversation
@dt There are still some TODOs here, but I was wondering if you want to give this a review pass?
This change was pulled out of the index backfiller PR (cockroachdb#73878). It threads the new WriteAtRequestTimestamp option supported by AddSSTable into the bulk processor. In this PR nothing sets this to true, so it should be a no-op, but it means one less file to keep rebasing when people steal my integer. Release note: None
This PR wasn't very surprising, which is good. On this first pass, I mostly just had nits and a couple of questions. I'm inclined to say we should merge this with some urgency and then keep iterating.
// TODO(ssd): This could be its own boolean or we could store the ID
// of the index it is a temporary index for.
func (w index) IsTemporaryIndexForBackfill() bool {
	return w.desc.UseDeletePreservingEncoding
}
➕ it might be nice to be able to associate the two indexes.
// If we are adding an index, we add another mutation for the
// temporary index used by the index backfiller.
//
// The index backfiller code currently assumes that it can
// always find the temporary indexes in the Mutations array,
// in the same order as the adding indexes.
if idxMut, ok := m.Descriptor_.(*descpb.DescriptorMutation_Index); ok {
does this need version gating?
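The ordering assumption called out in the comment above — that temporary indexes appear in the Mutations array in the same relative order as the indexes they back — can be sketched as follows. The `mutation` struct and `pairTempIndexes` helper here are hypothetical simplifications for illustration, not CockroachDB's actual descriptor types:

```go
package main

import "fmt"

// mutation is a hypothetical, stripped-down stand-in for a descriptor
// mutation: just an index ID and whether it is a temporary index.
type mutation struct {
	indexID     int
	isTemporary bool
}

// pairTempIndexes matches each adding index with its temporary index by
// relative position: the i-th temporary index belongs to the i-th adding
// index. This is the ordering invariant the backfiller relies on.
func pairTempIndexes(muts []mutation) map[int]int {
	var adding, temps []int
	for _, m := range muts {
		if m.isTemporary {
			temps = append(temps, m.indexID)
		} else {
			adding = append(adding, m.indexID)
		}
	}
	pairs := make(map[int]int)
	for i, id := range adding {
		if i < len(temps) {
			pairs[id] = temps[i]
		}
	}
	return pairs
}

func main() {
	// Indexes 2 and 3 are being added; 4 and 5 are their temporary indexes.
	muts := []mutation{{2, false}, {3, false}, {4, true}, {5, true}}
	fmt.Println(pairTempIndexes(muts)) // map[2:4 3:5]
}
```

Because pairing is purely positional, any reordering of the Mutations array would silently mis-associate the indexes, which is part of why storing the backed index's ID on the temporary index (as the TODO suggests) would be more robust.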
	if m.Adding() {
		if m.Backfilling() {
			tbl.Mutations[m.MutationOrdinal()].State = descpb.DescriptorMutation_DELETE_ONLY
			runStatus = RunningStatusDeleteOnly
		} else if m.DeleteOnly() {
			tbl.Mutations[m.MutationOrdinal()].State = descpb.DescriptorMutation_MERGING
			runStatus = RunningStatusMerging
		}
	}
}
if runStatus == "" || tbl.Dropped() {
	return nil
}
this and the stepping twice all feel a little brittle. Is there a test which resumes between the two steps?
I've added a test that randomly pauses before calls to (*schemaChanger).txn,
then resumes the job and makes sure it succeeds. It's a little slow, so it only does one pause by default, but locally I've run it after every txn:
--- PASS: TestPauseBeforeRandomDescTxn (23.87s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_1 (1.77s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_2 (1.78s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_3 (1.74s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_4 (1.80s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_5 (1.91s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_6 (1.72s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_7 (1.77s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_8 (1.72s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_9 (1.67s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_10 (1.82s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_11 (1.87s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_12 (1.94s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_13 (1.74s)
I think we need to add more assertions that we don't have any unexpected cruft and it would also be nice to add some concurrent operations, but hopefully this gets at the basic problem you are pointing out here.
Sorry it took me so long to actually make my way through this. My comments are largely superficial and can be made later as cleanup. Let's get this merged so we can gain experience with it.
@@ -182,7 +182,12 @@ func createSchemaChangeJobsFromMutations(
	}
	spanList := make([]jobspb.ResumeSpanList, mutationCount)
	for i := range spanList {
		spanList[i] = jobspb.ResumeSpanList{ResumeSpans: []roachpb.Span{tableDesc.PrimaryIndexSpan(codec)}}
		mut := tableDesc.Mutations[idx+i]
nit: this might benefit from commentary explaining why it's doing what it's doing. Can be follow-up; just while I'm here, this is remarkably impactful and a bit subtle.
// For now just grab all of the destination KVs and merge the corresponding entries.
kvs, err := txn.Scan(ctx, key, endKey, chunkSize)
If I were being nit-picky, I'd encourage you to use the lower-level APIs. These high-level APIs are sort of garbage. I'd consider something like:
var ba roachpb.BatchRequest
ba.TargetBytes = 16 << 10
ba.MaxSpanRequestKeys = chunkSize
ba.Add(&roachpb.ScanRequest{
	RequestHeader: roachpb.RequestHeader{
		Key:    key,
		EndKey: endKey,
	},
	ScanFormat: roachpb.KEY_VALUES,
})
br, pErr := txn.Send(ctx, ba)
if pErr != nil {
	return pErr.GoError()
}
resp := br.Responses[0].GetScan()
for _, row := range resp.Rows {
	// Merge each captured entry into the destination index here.
}
Otherwise you're liable to pull a lot more data than you wanted.
Happy to update to this now. Was 16KiB just a best guess, or do you have some thinking there I can include in a comment?
Decided to push this to a follow-up PR. Will open it shortly after this merges.
This distributes and checkpoints the index merging process. The merging process checkpoint is per temporary index. Release note: None Co-authored-by: Steven Danna <[email protected]>
bors r=ajwerner
Build succeeded
Previously, the index backfilling process depended upon non-MVCC
compliant AddSSTable calls which potentially rewrote previously read
historical values.
To support an MVCC-compliant AddSSTable that writes at the current
timestamp, this change implements a new backfilling process described
in the following RFC:
https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20211004_incremental_index_backfiller.md
In summary, the new index backfilling process depends on backfilling
the new index when it is in a BACKFILLING state (added in #72281). In
this state it receives no writes or deletes. Writes that occur during
the backfilling process are captured by a "temporary index." This
temporary index uses the DeletePreservingEncoding to ensure it
captures deletes as well as writes.
After the bulk backfill using the MVCC-compliant AddSSTable, the
index is moved into a MERGING state
(added in #75663) in which it receives writes and deletes. Writes
previously captured by the temporary index are then transactionally
merged into the newly added index.
This feature is currently behind a new boolean cluster setting which
defaults to true. Schema changes that contain both old and new-style
backfills are rejected.
Reverting the default to false will require updating various tests
since many tests depend on the exact index IDs of newly added indexes.
Release note: None
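The index lifecycle described in the commit message (BACKFILLING, then DELETE_ONLY, then MERGING, then public) can be sketched as a simple state machine. The state names mirror the description above, but the types and transition function here are a hypothetical simplification for illustration, not CockroachDB's actual descpb mutation states:

```go
package main

import "fmt"

// indexState models the lifecycle of a new index under the MVCC-compliant
// backfiller. These constants are an illustrative simplification.
type indexState int

const (
	// Backfilling: the index receives no writes or deletes; the bulk
	// backfill fills it via MVCC-compliant AddSSTable, while a temporary
	// index captures concurrent writes and deletes.
	Backfilling indexState = iota
	// DeleteOnly: intermediate step before merging begins.
	DeleteOnly
	// Merging: the index receives writes and deletes, and entries captured
	// by the temporary index are transactionally merged in.
	Merging
	// Public: the backfill is complete and the index is queryable.
	Public
)

// next advances an index one step through the backfill lifecycle.
func next(s indexState) indexState {
	switch s {
	case Backfilling:
		return DeleteOnly
	case DeleteOnly:
		return Merging
	case Merging:
		return Public
	default:
		return s
	}
}

func main() {
	for s := Backfilling; ; s = next(s) {
		fmt.Println(s)
		if s == Public {
			break
		}
	}
}
```

The point of the extra states is that each transition only ever widens what the index can observe, so a paused or resumed schema-change job can safely re-enter the sequence at its checkpointed step.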