backupccl: display up to 10 missing files in SHOW BACKUP .. with check_files #82274

Merged (1 commit) on Jun 7, 2022

@msbutler msbutler commented Jun 1, 2022

Previously, SHOW BACKUP WITH check_files displayed only the first missing SST.
This patch displays up to 10 missing SSTs. It also renames the misleading
approximateTablePhysicalSize to approximateSpanPhysicalSize.
Below I write out how physical table size is calculated:

  1. Each range we back up maps to one or more spans (currently in the
    backup_manifest.files object).

  2. One or more spans get written to an SST. No span will get written to
    multiple SSTs.

  3. When backup created these spans, it tried hard to split them at table
    boundaries so that each span holds only one table’s data. A last-minute
    table creation can make this nearly impossible because range splits are
    slow. A big table will have many spans.

  4. To compute the approximate logical size (called size_bytes in SHOW BACKUP)
    of each table, we sum the logical bytes over all its spans. We identify a
    table’s span by checking the table prefix of the first key in the span (see
    the getTableSizes method).

  5. To compute the physical size (file_bytes in SHOW BACKUP) of a span, first
    compute the logical size of each SST by summing the logical bytes over its
    spans (see the getLogicalSSTSize method). Then attribute a portion of the
    physical SST size (returned from cloud storage) to the span using the formula:
    physicalSpanSize = sstPhysicalSize * logicalSpanSize / logicalSSTSize
    (the approximateSpanPhysicalSize method implements this).

  6. To compute the physical size of a table, sum the physical sizes of the
    table’s spans.
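
The size calculation in steps 4 through 6 can be sketched roughly as follows. This is a minimal illustration, not the code in show.go: the span struct, its field names, and the helpers logicalSSTSizes, approxSpanPhysicalSize, and tableSizes are all hypothetical stand-ins, and the table ID is assumed to have already been decoded from the first key of each span.

```go
package main

import "fmt"

// span is a hypothetical stand-in for one file entry in the backup manifest.
type span struct {
	tableID      uint32 // table prefix of the span's first key (assumed pre-decoded)
	sstPath      string // the single SST this span was written to
	logicalBytes int64  // logical size of the span's data
}

// logicalSSTSizes sums the logical bytes of every span stored in each SST.
func logicalSSTSizes(spans []span) map[string]int64 {
	out := make(map[string]int64)
	for _, s := range spans {
		out[s.sstPath] += s.logicalBytes
	}
	return out
}

// approxSpanPhysicalSize attributes a share of the SST's physical size to a
// span, in proportion to the span's share of the SST's logical bytes.
func approxSpanPhysicalSize(sstPhysical, logicalSpan, logicalSST int64) int64 {
	if logicalSST == 0 {
		return 0 // avoid dividing by zero for an empty SST
	}
	return int64(float64(sstPhysical) * float64(logicalSpan) / float64(logicalSST))
}

// tableSizes returns per-table logical and approximate physical sizes.
// sstPhysical maps an SST path to its size as reported by storage.
func tableSizes(spans []span, sstPhysical map[string]int64) (logical, physical map[uint32]int64) {
	sstLogical := logicalSSTSizes(spans)
	logical = make(map[uint32]int64)
	physical = make(map[uint32]int64)
	for _, s := range spans {
		logical[s.tableID] += s.logicalBytes
		physical[s.tableID] += approxSpanPhysicalSize(
			sstPhysical[s.sstPath], s.logicalBytes, sstLogical[s.sstPath])
	}
	return logical, physical
}

func main() {
	spans := []span{
		{tableID: 1, sstPath: "a.sst", logicalBytes: 100},
		{tableID: 2, sstPath: "a.sst", logicalBytes: 200},
		{tableID: 1, sstPath: "b.sst", logicalBytes: 500},
	}
	logical, physical := tableSizes(spans, map[string]int64{"a.sst": 600, "b.sst": 50})
	fmt.Println("logical:", logical)
	fmt.Println("physical:", physical)
}
```

The sketch uses floating-point arithmetic for the ratio to sidestep integer overflow on large byte counts; the result is an approximation either way, since compression is unlikely to be uniform across the spans in an SST.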

This patch cleans up code from #80491

Release note (sql change): SHOW BACKUP WITH check_files will display up to 10
missing SSTs.
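
For illustration, the capped reporting behavior could look something like the sketch below. The names maxMissingFiles and missingFiles, and the exists callback, are hypothetical and only mirror the behavior described in the release note; they are not taken from the patch.

```go
package main

import "fmt"

// maxMissingFiles is an assumed cap that mirrors the release note.
const maxMissingFiles = 10

// missingFiles returns the paths for which exists reports false, stopping
// once maxMissingFiles have been collected. The exists callback is a
// hypothetical stand-in for a storage lookup.
func missingFiles(paths []string, exists func(string) bool) []string {
	var missing []string
	for _, p := range paths {
		if exists(p) {
			continue
		}
		missing = append(missing, p)
		if len(missing) == maxMissingFiles {
			break // stop checking once the cap is reached
		}
	}
	return missing
}

func main() {
	paths := []string{"1.sst", "2.sst", "3.sst"}
	// Pretend only 2.sst is present in storage.
	present := map[string]bool{"2.sst": true}
	missing := missingFiles(paths, func(p string) bool { return present[p] })
	fmt.Println("missing files:", missing)
}
```

Stopping at the cap keeps the error message short and avoids further storage lookups once enough missing files have been found.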

@msbutler msbutler requested a review from dt June 1, 2022 14:16
@msbutler msbutler self-assigned this Jun 1, 2022
@msbutler msbutler requested a review from a team June 1, 2022 14:16
@msbutler msbutler force-pushed the butler-show-check-return-files branch 3 times, most recently from 3438a3c to 5b5d819 Compare June 7, 2022 13:09
@msbutler msbutler force-pushed the butler-show-check-return-files branch from b17760d to 509e4a0 Compare June 7, 2022 18:36
msbutler commented Jun 7, 2022

bors r=dt

@craig craig bot merged commit 0c8f826 into cockroachdb:master Jun 7, 2022
craig bot commented Jun 7, 2022

Build succeeded:
