Merge #45282
45282: import,backup,engine: re-work bulk "row" counting r=dt a=dt

Bulk operations use a low-level row counter (C++ in some cases) to
track what they process, such as how many rows are backed up, how many
bytes are imported, etc. While iterating individual KVs during an export
or import operation, a counter inspects each KV to maintain the count of
rows, index entries and total data size processed.

As mentioned above, this is done at a very low level for performance
reasons. For example, in BACKUP the entire construction of the resulting
data files is pushed all the way down to C++ so a single cgo call can
iterate directly from RocksDB into the result file. Thus this "row"
counter is a simple utility that does not use our higher-level SQL
decoding libraries, which rely on the actual table metadata. Instead it
used a hardcoded comparison of the extracted index ID to 1 to determine
whether a KV belonged to the primary key and thus represented a "row".

However, with the addition of ALTER PRIMARY KEY in 20.1, the assumption
that ID 1 is the PK no longer holds: the PK could have ID 7 in one table
and ID 4 in another.

To fix this the row counter now just counts how many entries it sees in
each index of each table, returning these to the caller and letting it
decide which, if any, it wants to call out as 'rows' based on its
knowledge of which indexes are primary vs secondary.
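
The approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this change: `EntryCounts`, `PackID`, `CountEntry`, and `RowsFor` are hypothetical names, and the map key simply follows the `(table ID << 32) | index ID` packing that appears in the row_counter.cc diff, assuming both IDs fit in 32 bits.

```cpp
#include <cstdint>
#include <map>

// Hypothetical stand-in for the per-index tallies the counter returns.
using EntryCounts = std::map<uint64_t, int64_t>;

// Packs a table ID and an index ID into one 64-bit map key: table ID in
// the high 32 bits, index ID in the low 32 bits (assumes both fit).
inline uint64_t PackID(uint64_t table_id, uint64_t index_id) {
  return (table_id << 32) | index_id;
}

// Counter side: record one entry for the given index, making no
// assumption about which index is the primary key.
inline void CountEntry(EntryCounts& counts, uint64_t table_id,
                       uint64_t index_id) {
  counts[PackID(table_id, index_id)]++;
}

// Caller side: the caller supplies each table's primary index ID (known
// from table metadata) and sums only those entries as "rows".
inline int64_t RowsFor(
    const EntryCounts& counts,
    const std::map<uint64_t, uint64_t>& primary_index_by_table) {
  int64_t rows = 0;
  for (const auto& kv : primary_index_by_table) {
    auto it = counts.find(PackID(kv.first, kv.second));
    if (it != counts.end()) rows += it->second;
  }
  return rows;
}
```

With this split, a table whose primary key has index ID 7 is counted correctly as long as the caller passes 7 for that table. A hardcoded comparison against ID 1 would attribute none of its entries to rows.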

Fixes #44998.

Release note: none.

Co-authored-by: David Taylor <[email protected]>
craig[bot] and dt committed Feb 27, 2020
2 parents 5389060 + cb20534 commit 3170612
Showing 19 changed files with 1,426 additions and 880 deletions.
4 changes: 2 additions & 2 deletions c-deps/libroach/db.cc
@@ -1089,7 +1089,7 @@ DBStatus DBExportToSst(DBKey start, DBKey end, bool export_all_revisions,
   DBIncrementalIterator iter(engine, iter_opts, start, end, write_intent);

   roachpb::BulkOpSummary bulkop_summary;
-  RowCounter row_counter;
+  RowCounter row_counter(&bulkop_summary);

   bool skip_current_key_versions = !export_all_revisions;
   DBIterState state;
@@ -1153,7 +1153,7 @@ DBStatus DBExportToSst(DBKey start, DBKey end, bool export_all_revisions,
       return status;
     }

-    if (!row_counter.Count((iter.key()), &bulkop_summary)) {
+    if (!row_counter.Count(iter.key())) {
       return ToDBString("Error in row counter");
     }
     const int64_t new_size = cur_size + decoded_key.size() + iter.value().size();
188 changes: 133 additions & 55 deletions c-deps/libroach/protos/roachpb/api.pb.cc

Large diffs are not rendered by default.

202 changes: 121 additions & 81 deletions c-deps/libroach/protos/roachpb/api.pb.h

Large diffs are not rendered by default.

14 changes: 10 additions & 4 deletions c-deps/libroach/row_counter.cc
@@ -66,7 +66,7 @@ void RowCounter::EnsureSafeSplitKey(rocksdb::Slice* key) {

 // Count examines each key passed to it and increments the running count when it
 // sees a key that belongs to a new row.
-bool RowCounter::Count(const rocksdb::Slice& key, cockroach::roachpb::BulkOpSummary* summary) {
+bool RowCounter::Count(const rocksdb::Slice& key) {
   // EnsureSafeSplitKey is usually used to avoid splitting a row across ranges,
   // by returning the row's key prefix.
   // We reuse it here to count "rows" by counting when it changes.
@@ -102,10 +102,16 @@ bool RowCounter::Count(const rocksdb::Slice& key, cockroach::roachpb::BulkOpSumm
   uint64_t index_id;
   if (!DecodeUvarint64(&decoded_key, &index_id)) {
     return false;
-  } else if (index_id == 1) {
-    summary->set_rows(summary->rows() + 1);
+  }
+
+  // This mirrors logic of the go function roachpb.BulkOpSummaryID.
+  uint64_t bulk_op_summary_id = (tbl << 32) | index_id;
+  (*summary->mutable_entry_counts())[bulk_op_summary_id]++;
+
+  if (index_id == 1) {
+    summary->set_deprecated_rows(summary->deprecated_rows() + 1);
   } else {
-    summary->set_index_entries(summary->index_entries() + 1);
+    summary->set_deprecated_index_entries(summary->deprecated_index_entries() + 1);
   }

   return true;
4 changes: 3 additions & 1 deletion c-deps/libroach/row_counter.h
@@ -24,11 +24,13 @@ const int MaxReservedDescID = 49;
 // via `Count`. Note: the `DataSize` field of the BulkOpSummary is *not*
 // populated by this and should be set separately.
 struct RowCounter {
-  bool Count(const rocksdb::Slice& key, cockroach::roachpb::BulkOpSummary* summary);
+  RowCounter(cockroach::roachpb::BulkOpSummary* summary) : summary(summary) {}
+  bool Count(const rocksdb::Slice& key);

  private:
   void EnsureSafeSplitKey(rocksdb::Slice* key);
   int GetRowPrefixLength(rocksdb::Slice* key);
+  cockroach::roachpb::BulkOpSummary* summary;
   std::string prev_key;
   rocksdb::Slice prev;
 };
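
The header change above binds the counter to its summary at construction, so call sites pass only the key. A minimal sketch of that pattern, using hypothetical stand-in types (`SketchSummary` and `SketchCounter` are not part of this change, and the empty-key check merely imitates a decode failure):

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for roachpb::BulkOpSummary; holds a single tally.
struct SketchSummary {
  int64_t keys_seen = 0;
};

// Hypothetical counter that, like the reworked RowCounter, is bound to
// its summary at construction so Count() needs only the key.
class SketchCounter {
 public:
  explicit SketchCounter(SketchSummary* summary) : summary_(summary) {}

  // Returns false on an empty key (imitating a decode failure);
  // otherwise bumps the tally in the bound summary.
  bool Count(const std::string& key) {
    if (key.empty()) return false;
    summary_->keys_seen++;
    return true;
  }

 private:
  SketchSummary* summary_;  // not owned; must outlive the counter
};
```

As in the db.cc hunk, the caller would construct the counter once with a pointer to the summary and then call `Count(key)` for each key it iterates over. The summary has to outlive the counter, since only a raw pointer is stored.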