sql: introduce MVCC-compliant index backfiller #73878
Conversation
@dt There are still some TODOs here, but I was wondering if you want to give this a review pass?
This change was pulled out of the index backfiller PR (cockroachdb#73878). It threads the new WriteAtRequestTimestamp option supported by AddSSTable into the bulk processor. In this PR nothing sets this to true, so it should be a no-op, but it means one less file to keep rebasing when people steal my integer. Release note: None
This PR wasn't very surprising, which is good. On this first pass, I mostly just had nits and a couple of questions. I'm inclined to say we should merge this with some urgency and then keep iterating.
// TODO(ssd): This could be its own boolean or we could store the ID
// of the index it is a temporary index for.
func (w index) IsTemporaryIndexForBackfill() bool {
	return w.desc.UseDeletePreservingEncoding
}
➕ it might be nice to be able to associate the two indexes.
// If we are adding an index, we add another mutation for the
// temporary index used by the index backfiller.
//
// The index backfiller code currently assumes that it can
// always find the temporary indexes in the Mutations array,
// in the same order as the adding indexes.
if idxMut, ok := m.Descriptor_.(*descpb.DescriptorMutation_Index); ok {
does this need version gating?
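The ordering assumption called out in the comment above — that temporary indexes appear in the Mutations array in the same relative order as the indexes they back — can be sketched as follows. The `mutation` struct and `pairTempIndexes` helper here are hypothetical simplifications for illustration, not CockroachDB's actual descriptor types:

```go
package main

import "fmt"

// mutation is a hypothetical, stripped-down stand-in for a descriptor
// mutation: just an index ID and whether it is a temporary index.
type mutation struct {
	indexID     int
	isTemporary bool
}

// pairTempIndexes matches each adding index with its temporary index by
// relative position: the i-th temporary index belongs to the i-th adding
// index. This is the ordering invariant the backfiller relies on.
func pairTempIndexes(muts []mutation) map[int]int {
	var adding, temps []int
	for _, m := range muts {
		if m.isTemporary {
			temps = append(temps, m.indexID)
		} else {
			adding = append(adding, m.indexID)
		}
	}
	pairs := make(map[int]int)
	for i, id := range adding {
		if i < len(temps) {
			pairs[id] = temps[i]
		}
	}
	return pairs
}

func main() {
	// Indexes 2 and 3 are being added; 4 and 5 are their temporary indexes.
	muts := []mutation{{2, false}, {3, false}, {4, true}, {5, true}}
	fmt.Println(pairTempIndexes(muts)) // map[2:4 3:5]
}
```

Because pairing is purely positional, any reordering of the Mutations array would silently mis-associate the indexes, which is part of why storing the backed index's ID on the temporary index (as the TODO suggests) would be more robust.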
	if m.Adding() {
		if m.Backfilling() {
			tbl.Mutations[m.MutationOrdinal()].State = descpb.DescriptorMutation_DELETE_ONLY
			runStatus = RunningStatusDeleteOnly
		} else if m.DeleteOnly() {
			tbl.Mutations[m.MutationOrdinal()].State = descpb.DescriptorMutation_MERGING
			runStatus = RunningStatusMerging
		}
	}
}
if runStatus == "" || tbl.Dropped() {
	return nil
}
this and the stepping twice all feel a little brittle. Is there a test which resumes between the two steps?
I've added a test that randomly pauses before calls to (*schemaChanger).txn,
then resumes the job and makes sure it succeeds. It's a little slow, so it only does one pause by default, but locally I've run it after every txn:
--- PASS: TestPauseBeforeRandomDescTxn (23.87s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_1 (1.77s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_2 (1.78s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_3 (1.74s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_4 (1.80s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_5 (1.91s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_6 (1.72s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_7 (1.77s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_8 (1.72s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_9 (1.67s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_10 (1.82s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_11 (1.87s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_12 (1.94s)
--- PASS: TestPauseBeforeRandomDescTxn/create_index_pause_at_txn_13 (1.74s)
I think we need to add more assertions that we don't have any unexpected cruft and it would also be nice to add some concurrent operations, but hopefully this gets at the basic problem you are pointing out here.
Sorry it took me so long to actually make my way through this. My comments are largely superficial and can be made later as cleanup. Let's get this merged so we can gain experience with it.
@@ -182,7 +182,12 @@ func createSchemaChangeJobsFromMutations(
	}
	spanList := make([]jobspb.ResumeSpanList, mutationCount)
	for i := range spanList {
		spanList[i] = jobspb.ResumeSpanList{ResumeSpans: []roachpb.Span{tableDesc.PrimaryIndexSpan(codec)}}
		mut := tableDesc.Mutations[idx+i]
nit: this might benefit from commentary explaining why it's doing what it's doing. Can be follow-up; just while I'm here, this is remarkably impactful and a bit subtle.
// For now just grab all of the destination KVs and merge the corresponding entries.
kvs, err := txn.Scan(ctx, key, endKey, chunkSize)
If I were being nit-picky, I'd encourage you to use the lower-level APIs. These high-level APIs are sort of garbage. I'd consider something like:
var ba roachpb.BatchRequest
ba.TargetBytes = 16 << 10
ba.MaxSpanRequestKeys = chunkSize
ba.Add(&roachpb.ScanRequest{
	RequestHeader: roachpb.RequestHeader{
		Key:    key,
		EndKey: endKey,
	},
	ScanFormat: roachpb.KEY_VALUES,
})
br, pErr := txn.Send(ctx, ba)
if pErr != nil {
	return pErr.GoError()
}
resp := br.Responses[0].GetScan()
for _, row := range resp.Rows {
	// Merge each captured entry into the destination index here.
}
Otherwise you're liable to pull a lot more data than you wanted.
Happy to update to this now. Was 16KiB just a best guess, or do you have some thinking there I can include in a comment?
Decided to push this to a follow-up PR. Will open it shortly after this merges.
This distributes and checkpoints the index merging process. The merging process checkpoint is per temporary index. Release note: None Co-authored-by: Steven Danna <[email protected]>
bors r=ajwerner
Build succeeded
Previously, the index backfilling process depended upon non-MVCC
compliant AddSSTable calls which potentially rewrote previously read
historical values.
To support an MVCC-compliant AddSSTable that writes at the current
timestamp, this change implements a new backfilling process described
in the following RFC:
https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20211004_incremental_index_backfiller.md
In summary, the new index backfilling process depends on backfilling
the new index when it is in a BACKFILLING state (added in #72281). In
this state it receives no writes or deletes. Writes that occur during
the backfilling process are captured by a "temporary index." This
temporary index uses the DeletePreservingEncoding to ensure it
captures deletes as well as writes.
After the bulk backfill using the MVCC-compliant AddSSTable, the
index is moved into a MERGING state
(added in #75663) in which it receives writes and deletes. Writes
previously captured by the temporary index are then transactionally
merged into the newly added index.
This feature is currently behind a new boolean cluster setting which
defaults to true. Schema changes that contain both old and new-style
backfills are rejected.
Reverting the default to false will require updating various tests
since many tests depend on the exact index IDs of newly added indexes.
Release note: None
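The index lifecycle described in the commit message (BACKFILLING, then DELETE_ONLY, then MERGING, then public) can be sketched as a simple state machine. The state names mirror the description above, but the types and transition function here are a hypothetical simplification for illustration, not CockroachDB's actual descpb mutation states:

```go
package main

import "fmt"

// indexState models the lifecycle of a new index under the MVCC-compliant
// backfiller. These constants are an illustrative simplification.
type indexState int

const (
	// Backfilling: the index receives no writes or deletes; the bulk
	// backfill fills it via MVCC-compliant AddSSTable, while a temporary
	// index captures concurrent writes and deletes.
	Backfilling indexState = iota
	// DeleteOnly: intermediate step before merging begins.
	DeleteOnly
	// Merging: the index receives writes and deletes, and entries captured
	// by the temporary index are transactionally merged in.
	Merging
	// Public: the backfill is complete and the index is queryable.
	Public
)

// next advances an index one step through the backfill lifecycle.
func next(s indexState) indexState {
	switch s {
	case Backfilling:
		return DeleteOnly
	case DeleteOnly:
		return Merging
	case Merging:
		return Public
	default:
		return s
	}
}

func main() {
	for s := Backfilling; ; s = next(s) {
		fmt.Println(s)
		if s == Public {
			break
		}
	}
}
```

The point of the extra states is that each transition only ever widens what the index can observe, so a paused or resumed schema-change job can safely re-enter the sequence at its checkpointed step.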