sql: introduce MVCC-compliant index backfiller #73878

Merged
2 commits merged on Feb 16, 2022


stevendanna
Collaborator

@stevendanna stevendanna commented Dec 15, 2021

Previously, the index backfilling process depended upon
non-MVCC-compliant AddSSTable calls, which potentially rewrote
previously read historical values.

To support an MVCC-compliant AddSSTable that writes at the current
timestamp, this change implements a new backfilling process described
in the following RFC:

https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20211004_incremental_index_backfiller.md

In summary, the new process backfills the new index while it is in a
BACKFILLING state (added in #72281). In this state the index receives
no writes or deletes. Writes that occur during the backfilling process
are captured by a "temporary index." This temporary index uses the
DeletePreservingEncoding to ensure it captures deletes as well as
writes.

After the bulk backfill using the MVCC-compliant AddSSTable, the
index is moved into a MERGING state (added in #75663) in which it
receives writes and deletes. Writes previously captured by the
temporary index are then transactionally merged into the newly added
index.
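
Editorial sketch of the capture-and-merge idea described above, not code from this PR. It is a toy in-memory model: the `index`, `tempIndex`, and `tempEntry` types and the `mergeTemp` helper are hypothetical names, and the real merge is transactional against KV rather than a map update. It only illustrates why the temporary index must preserve deletes: without the `Deleted` marker, key `c` would wrongly survive the merge.

```go
package main

import (
	"fmt"
	"sort"
)

// tempEntry is a hypothetical stand-in for a value stored in the temporary
// index. With a delete-preserving encoding, a delete is recorded as an
// entry with Deleted set instead of removing the key.
type tempEntry struct {
	Value   string
	Deleted bool
}

// index is a toy in-memory index keyed by encoded index key.
type index map[string]string

// tempIndex records every write and delete seen while the backfill runs.
type tempIndex map[string]tempEntry

func (t tempIndex) put(key, value string) { t[key] = tempEntry{Value: value} }
func (t tempIndex) del(key string)        { t[key] = tempEntry{Deleted: true} }

// mergeTemp applies the captured entries to dst: a preserved delete removes
// the key, anything else overwrites the backfilled value.
func mergeTemp(dst index, src tempIndex) {
	for key, e := range src {
		if e.Deleted {
			delete(dst, key)
			continue
		}
		dst[key] = e.Value
	}
}

// sortedKeys returns the keys of idx in sorted order for stable printing.
func sortedKeys(idx index) []string {
	keys := make([]string, 0, len(idx))
	for k := range idx {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Result of the bulk backfill, as read at the backfill timestamp.
	newIdx := index{"a": "1", "b": "2", "c": "3"}

	// Concurrent traffic captured by the temporary index.
	tmp := tempIndex{}
	tmp.put("b", "20") // update of an existing row
	tmp.del("c")       // delete of an existing row
	tmp.put("d", "4")  // insert of a new row

	mergeTemp(newIdx, tmp)
	for _, k := range sortedKeys(newIdx) {
		fmt.Printf("%s=%s\n", k, newIdx[k])
	}
}
```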

This feature is currently behind a new boolean cluster setting which
defaults to true. Schema changes that contain both old and new-style
backfills are rejected.

Reverting the default to false will require updating various tests
since many tests depend on the exact index IDs of newly added indexes.

Release note: None

@stevendanna stevendanna force-pushed the ssd/index-backfiller-rewrite branch from 036c4af to 882e35c Compare December 16, 2021 03:06
@stevendanna stevendanna force-pushed the ssd/index-backfiller-rewrite branch from 18c649c to 27ed8e7 Compare January 4, 2022 15:03
@stevendanna stevendanna force-pushed the ssd/index-backfiller-rewrite branch 8 times, most recently from 2ead532 to e0771d9 Compare January 24, 2022 11:52
@stevendanna stevendanna changed the title from "wip: index backfiller using temporary indexes" to "sql: introduce MVCC-compliant index backfiller" Jan 24, 2022
@stevendanna stevendanna force-pushed the ssd/index-backfiller-rewrite branch 5 times, most recently from aa2d061 to 679bafc Compare January 27, 2022 10:45
@stevendanna stevendanna marked this pull request as ready for review January 27, 2022 10:46
@stevendanna stevendanna requested a review from a team January 27, 2022 10:46
@stevendanna stevendanna requested review from a team as code owners January 27, 2022 10:46
@stevendanna stevendanna requested a review from a team January 27, 2022 10:46
@stevendanna stevendanna requested a review from a team as a code owner January 27, 2022 10:46
@stevendanna stevendanna requested review from msbutler and miretskiy and removed request for a team January 27, 2022 10:46
@stevendanna
Collaborator Author

@dt There are still some TODOs here, but I was wondering if you want to give this a review pass?

@stevendanna stevendanna force-pushed the ssd/index-backfiller-rewrite branch from 679bafc to bd9e5e2 Compare January 27, 2022 11:19
@stevendanna stevendanna force-pushed the ssd/index-backfiller-rewrite branch 2 times, most recently from 20322d8 to ad30009 Compare February 2, 2022 17:44
stevendanna added a commit to stevendanna/cockroach that referenced this pull request Feb 4, 2022
This change was pulled out of the index backfiller PR (cockroachdb#73878). It
threads the new WriteAtRequestTimestamp option supported by AddSSTable
into the bulk processor.

In this PR nothing sets this to true, so it should be a no-op, but it
means one less file to keep rebasing when people steal my integer.

Release note: None
@stevendanna stevendanna force-pushed the ssd/index-backfiller-rewrite branch from ad30009 to 81f6c2c Compare February 8, 2022 11:46
Contributor

@ajwerner ajwerner left a comment

This PR wasn't very surprising, which is good. On this first pass, I mostly just had nits and a couple of questions. I'm inclined to say we should merge this with some urgency and then keep iterating.

pkg/sql/backfill.go Outdated
// TODO(ssd): This could be its own boolean or we could store the ID
// of the index it is a temporary index for.
func (w index) IsTemporaryIndexForBackfill() bool {
	return w.desc.UseDeletePreservingEncoding
Contributor

➕ it might be nice to be able to associate the two indexes.

Comment on lines +2144 to +2145
// If we are adding an index, we add another mutation for the
// temporary index used by the index backfiller.
//
// The index backfiller code currently assumes that it can
// always find the temporary indexes in the Mutations array,
// in same order as the adding indexes.
if idxMut, ok := m.Descriptor_.(*descpb.DescriptorMutation_Index); ok {
Contributor

does this need version gating?
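
Editorial sketch of the ordering assumption stated in the code comment under review (temporary indexes appear in the Mutations array in the same order as the adding indexes). This is not the PR's code: the `mutation` type, its `Temporary` field, and `pairTemporaryIndexes` are hypothetical, and the `Temporary` flag merely stands in for the `UseDeletePreservingEncoding` check shown earlier. It shows how positional pairing works and why storing an explicit association would be less fragile.

```go
package main

import "fmt"

// mutation is a hypothetical, simplified index mutation.
type mutation struct {
	IndexID   int
	Temporary bool // stands in for UseDeletePreservingEncoding
}

// pairTemporaryIndexes matches the i-th adding index with the i-th temporary
// index, relying purely on their relative order in the mutation list.
func pairTemporaryIndexes(muts []mutation) (map[int]int, error) {
	var adding, temps []int
	for _, m := range muts {
		if m.Temporary {
			temps = append(temps, m.IndexID)
		} else {
			adding = append(adding, m.IndexID)
		}
	}
	if len(adding) != len(temps) {
		return nil, fmt.Errorf(
			"found %d adding indexes but %d temporary indexes",
			len(adding), len(temps))
	}
	pairs := make(map[int]int, len(adding))
	for i, id := range adding {
		pairs[id] = temps[i]
	}
	return pairs, nil
}

func main() {
	muts := []mutation{
		{IndexID: 2},
		{IndexID: 3, Temporary: true},
		{IndexID: 4},
		{IndexID: 5, Temporary: true},
	}
	pairs, err := pairTemporaryIndexes(muts)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("index 2 uses temporary index", pairs[2])
	fmt.Println("index 4 uses temporary index", pairs[4])
}
```

Because the pairing is positional, any reordering of the mutation list silently changes which temporary index belongs to which new index, which is the fragility the earlier review comment about associating the two indexes points at.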

pkg/sql/execinfrapb/processors_bulk_io.proto Outdated
pkg/sql/mvcc_backfiller.go Outdated
Comment on lines +1095 to +1193
		if m.Adding() {
			if m.Backfilling() {
				tbl.Mutations[m.MutationOrdinal()].State = descpb.DescriptorMutation_DELETE_ONLY
				runStatus = RunningStatusDeleteOnly
			} else if m.DeleteOnly() {
				tbl.Mutations[m.MutationOrdinal()].State = descpb.DescriptorMutation_MERGING
				runStatus = RunningStatusMerging
			}
		}
	}
	if runStatus == "" || tbl.Dropped() {
		return nil
	}
Contributor

this and the stepping twice all feels a little brittle. is there a test which resumes between the two steps?

Collaborator Author

I've added a test that randomly pauses before calls to (*schemaChanger).txn, then resumes the job and makes sure it succeeds. It's a little slow, so it only does one pause by default, but locally I've run it with a pause at every txn:

--- PASS: TestPauseBeforeRandomDescTxn (23.87s)
    --- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_1 (1.77s)
    --- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_2 (1.78s)
    --- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_3 (1.74s)
    --- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_4 (1.80s)
    --- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_5 (1.91s)
    --- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_6 (1.72s)
    --- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_7 (1.77s)
    --- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_8 (1.72s)
    --- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_9 (1.67s)
    --- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_10 (1.82s)
    --- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_11 (1.87s)
    --- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_12 (1.94s)
    --- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_13 (1.74s)

I think we need to add more assertions that we don't have any unexpected cruft and it would also be nice to add some concurrent operations, but hopefully this gets at the basic problem you are pointing out here.
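
Editorial sketch of the property this thread is about, under stated assumptions. It models only the two transitions visible in the reviewed snippet (BACKFILLING to DELETE_ONLY, then DELETE_ONLY to MERGING) and shows that pausing between the two steps and resuming reaches the same final state. The `state` type, `step`, and `runToMerging` are hypothetical names, not the schema changer's API, and this is not the `TestPauseBeforeRandomDescTxn` test itself.

```go
package main

import "fmt"

// state is a hypothetical, simplified mutation state.
type state int

const (
	backfilling state = iota
	deleteOnly
	merging
)

var stateNames = [...]string{"BACKFILLING", "DELETE_ONLY", "MERGING"}

func (s state) String() string { return stateNames[s] }

// step advances by exactly one transition, mirroring the if / else-if in the
// reviewed snippet. It reports false when there is nothing left to do.
func step(s state) (state, bool) {
	switch s {
	case backfilling:
		return deleteOnly, true
	case deleteOnly:
		return merging, true
	default:
		return s, false
	}
}

// runToMerging keeps stepping until no transition applies. pauseAfter
// simulates a job pause after that many steps; a negative value never pauses.
// It returns the state reached and the number of steps taken in this run.
func runToMerging(s state, pauseAfter int) (state, int) {
	steps := 0
	for steps != pauseAfter {
		next, ok := step(s)
		if !ok {
			break
		}
		s = next
		steps++
	}
	return s, steps
}

func main() {
	// Pause between the two steps, then resume from the persisted state.
	s, _ := runToMerging(backfilling, 1)
	fmt.Println("paused in:", s)
	s, _ = runToMerging(s, -1)
	fmt.Println("after resume:", s)
}
```

The design point is that each step derives its action solely from the persisted state, so a resume between steps repeats nothing and skips nothing.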

@stevendanna stevendanna force-pushed the ssd/index-backfiller-rewrite branch 6 times, most recently from 6e7ad8b to 36d4807 Compare February 10, 2022 10:13
@stevendanna stevendanna force-pushed the ssd/index-backfiller-rewrite branch from 143a356 to 3d2daaf Compare February 10, 2022 13:30
Contributor

@ajwerner ajwerner left a comment

Sorry it took me so long to actually make my way through this. My comments are largely superficial and can be addressed later as cleanup. Let's get this merged so we can gain experience with it.

@@ -182,7 +182,12 @@ func createSchemaChangeJobsFromMutations(
}
spanList := make([]jobspb.ResumeSpanList, mutationCount)
for i := range spanList {
spanList[i] = jobspb.ResumeSpanList{ResumeSpans: []roachpb.Span{tableDesc.PrimaryIndexSpan(codec)}}
mut := tableDesc.Mutations[idx+i]
Contributor

nit: this might benefit from commentary explaining why it's doing what it's doing. Can be follow-up, just while I'm here, this is remarkably impactful and a bit subtle.

pkg/sql/backfill/mvcc_index_merger.go Outdated
pkg/sql/backfill/mvcc_index_merger.go Outdated
Comment on lines +216 to +209
// For now just grab all of the destination KVs and merge the corresponding entries.
kvs, err := txn.Scan(ctx, key, endKey, chunkSize)
Contributor

If I were being nit-picky, I'd encourage you to use the lower-level APIs. These high-level APIs are sort of garbage. I'd consider something like:

var ba roachpb.BatchRequest
ba.TargetBytes = 16 << 10
ba.MaxSpanRequestKeys = chunkSize
ba.Add(&roachpb.ScanRequest{
	RequestHeader: roachpb.RequestHeader{
		Key:    key,
		EndKey: endKey,
	},
	ScanFormat: roachpb.KEY_VALUES,
})
br, pErr := txn.Send(ctx, ba)
if pErr != nil {
	return pErr.GoError()
}
resp := br.Responses[0].GetScan()
for _, row := range resp.Rows {

}

Otherwise you're liable to pull a lot more data than you wanted.

Collaborator Author

Happy to update to this now. Was 16KB just a best guess or do you have some thinking there I can include in a comment?

Collaborator Author

Decided to push this to a follow-up PR. Will open it shortly after this merges.
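
Editorial sketch of why the suggestion above sets both a key limit and a byte target. The expression `16 << 10` is 16384 bytes (16 KiB). The toy `scanPage` below is hypothetical and runs over an in-memory slice; it is not the roachpb API and does not claim to reproduce the exact semantics of `TargetBytes` or `MaxSpanRequestKeys`. It only shows that a key-count limit alone does not bound response size when values are large, whereas a byte target does.

```go
package main

import "fmt"

// kv is a toy key/value pair.
type kv struct{ Key, Value string }

// scanPage returns up to maxKeys entries starting at index start and stops
// early once the accumulated key+value size reaches targetBytes. It always
// returns at least one entry when any remain, so a scan keeps making
// progress. The second result is the index at which to resume.
func scanPage(kvs []kv, start, maxKeys, targetBytes int) ([]kv, int) {
	var page []kv
	size := 0
	i := start
	for i < len(kvs) && len(page) < maxKeys {
		page = append(page, kvs[i])
		size += len(kvs[i].Key) + len(kvs[i].Value)
		i++
		if size >= targetBytes {
			break
		}
	}
	return page, i
}

func main() {
	const targetBytes = 16 << 10 // 16384 bytes, as in the suggestion above
	const maxKeys = 1000         // assumed chunk size, for illustration only

	// 50 rows with 4 KiB values: a key limit of 1000 would return them all.
	big := make([]byte, 4<<10)
	var data []kv
	for i := 0; i < 50; i++ {
		data = append(data, kv{Key: fmt.Sprintf("k%02d", i), Value: string(big)})
	}

	pages := 0
	for next := 0; next < len(data); {
		var page []kv
		page, next = scanPage(data, next, maxKeys, targetBytes)
		pages++
		if pages == 1 {
			fmt.Println("rows in first page:", len(page))
		}
	}
	fmt.Println("pages needed:", pages)
}
```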

pkg/sql/distsql_physical_planner.go Outdated
pkg/sql/distsql_plan_backfill.go Outdated
Previously, the index backfilling process depended upon
non-MVCC-compliant AddSSTable calls, which potentially rewrote
previously read historical values.

To support an MVCC-compliant AddSSTable that writes at the _current_
timestamp, this change implements a new backfilling process described
in the following RFC:

https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20211004_incremental_index_backfiller.md

In summary, the new process backfills the new index while it is in a
BACKFILLING state (added in cockroachdb#72281). In this state the
index receives no writes or deletes. Writes that occur during the
backfilling process are captured by a "temporary index." This
temporary index uses the DeletePreservingEncoding to ensure it
captures deletes as well as writes.

After the bulk backfill using the MVCC-compliant AddSSTable, the
index is moved into a MERGING state (added in cockroachdb#75663) in
which it receives writes and deletes. Writes previously captured by
the temporary index are then transactionally merged into the newly
added index.

This feature is currently behind a new boolean cluster setting which
defaults to true. Schema changes that contain both old and new-style
backfills are rejected.

Reverting the default to false will require updating various tests
since many tests depend on the exact index IDs of newly added indexes.

Release note: None

Co-authored-by: Rui Hu <[email protected]>
@stevendanna stevendanna force-pushed the ssd/index-backfiller-rewrite branch from a5a7319 to 2e8dc3e Compare February 16, 2022 13:57
This distributes and checkpoints the index merging process. The
merging process checkpoint is per temporary index.

Release note: None

Co-authored-by: Steven Danna <[email protected]>
@stevendanna stevendanna force-pushed the ssd/index-backfiller-rewrite branch from 2e8dc3e to e47317f Compare February 16, 2022 14:30
@stevendanna
Collaborator Author

bors r=ajwerner

@craig
Contributor

craig bot commented Feb 16, 2022

Build succeeded:

4 participants