Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backupccl: do not backup or restore stat forecasts #94992

Merged
merged 2 commits into from
Jan 13, 2023

Conversation

stevendanna
Copy link
Collaborator

@stevendanna stevendanna commented Jan 10, 2023

Statistics forecasts are not persisted and are reconstructed on demand from the persisted statistics. As such, they don't need to be included in a backup or restored.

This has the additional effect of solving a bug in which we would fail to write statistics into our metadata SST if it included more than 1 forecast since the in-memory forecasts have a 0 statistic_id.

Fixes #86806

Epic: none

Release note: None

@stevendanna stevendanna requested review from a team as code owners January 10, 2023 11:57
@stevendanna stevendanna requested review from benbardin and rytaft and removed request for a team January 10, 2023 11:57
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@stevendanna stevendanna requested a review from michae2 January 10, 2023 11:57
@stevendanna stevendanna force-pushed the skip-forecasts-in-backups branch 2 times, most recently from 55d091a to f8fab63 Compare January 10, 2023 12:02
@stevendanna stevendanna removed request for a team and rytaft January 10, 2023 12:04
@stevendanna stevendanna force-pushed the skip-forecasts-in-backups branch from f8fab63 to 5d457d9 Compare January 10, 2023 12:07
Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing this! Hope this wasn't too much of an inconvenience!

Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @benbardin and @stevendanna)


pkg/ccl/backupccl/backup_job.go line 376 at r1 (raw file):

				// from the persisted stats and do not
				// need to be backed up.
				if stat.Name == jobspb.ForecastStatsName {

Additionally, in v23.1 we also want to skip merged partial statistics, which can be detected with stat.Name == jobspb.MergedStatsName.


pkg/ccl/backupccl/restore_job.go line 532 at r1 (raw file):

	tableHasStatsInBackup := make(map[descpb.ID]struct{})
	for _, stat := range tableStatistics {
		if stat.Name == jobspb.ForecastStatsName {

We should also skip merged partial statistics here (stat.Name == jobspb.MergedStatsName).


pkg/ccl/backupccl/testdata/backup-restore/metadata line 34 at r1 (raw file):


# The previous bug occurs when there are two or more forecasts since
# forecasts do not hava a statistics ID.

Note that whenever #86358 is fixed, forecasted stats will have a statistics ID. (And same for merged partial stats, eventually.)

@stevendanna stevendanna force-pushed the skip-forecasts-in-backups branch from 5d457d9 to 59e3627 Compare January 11, 2023 10:51
@stevendanna
Copy link
Collaborator Author

@michae2 Good call on the merged stats as those also pose a problem. Fixing that up actually exposed another potential problem. It seems like we aren't backing up partial statistics because they aren't returned from (*TableStatisticsCache).GetTableStats.

How would you feel about a new method on the stats cache that was something like GetTableStatsForBackup?

@stevendanna
Copy link
Collaborator Author

Restoring partial statistics might require us to complicate the code for restoring statistics a bit since partial statistics reference full statistics and those references would need to be fixed up on restore.

This seems like something we could do in a follow-up issue.

@stevendanna
Copy link
Collaborator Author

I opened #95056 to track the other issue.

@stevendanna stevendanna requested a review from michae2 January 11, 2023 14:29
@michae2
Copy link
Collaborator

michae2 commented Jan 11, 2023

@michae2 Good call on the merged stats as those also pose a problem. Fixing that up actually exposed another potential problem. It seems like we aren't backing up partial statistics because they aren't returned from (*TableStatisticsCache).GetTableStats.

How would you feel about a new method on the stats cache that was something like GetTableStatsForBackup?

I opened #95056 to track the other issue.

Ah, good catch. I think we will eventually want the optimizer to see partial stats via GetTableStats anyway, so let me quickly fix that.

Restoring partial statistics might require us to complicate the code for restoring statistics a bit since partial statistics reference full statistics and those references would need to be fixed up on restore.

This seems like something we could do in a follow-up issue.

Yes, this is a problem. Also a problem for ALTER TABLE ... INJECT STATISTICS. One idea is to change InsertNewStat to also insert the StatisticID. I can open an issue about this.

Copy link
Collaborator

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Thank you!

Reviewed 3 of 3 files at r2, 1 of 1 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @benbardin and @stevendanna)


pkg/ccl/backupccl/backup_job.go line 426 at r4 (raw file):

	// stats on demand and do not need to be backed up or
	// restored.
	if stat.Name == jobspb.ForecastStatsName || stat.Name == jobspb.MergedStatsName {

nit: This could be simplified to return stat.Name != jobspb.ForecastStatsName && stat.Name != jobspb.MergedStatsName.

@stevendanna stevendanna force-pushed the skip-forecasts-in-backups branch from 68b3f84 to 9b89a3b Compare January 12, 2023 09:08
Statistics forecasts are not persisted and are reconstructed on demand
from the persisted statistics. As such, they don't need to be included
in a backup or restored.

This has the additional effect of solving a bug in which we would fail
to write statistics into our metadata SST if it included more than 1
forecast since the in-memory forecasts have a 0 statistic_id.

Fixes cockroachdb#86806

Epic: none

Release note: None
Merged statistics are generated from partial statistics and the
related full statistic. Partial statistics are not currently included
in the backup, but will be soon, thus we are excluding merged
statistics from the backup.

Release note: None
@stevendanna stevendanna force-pushed the skip-forecasts-in-backups branch from 9b89a3b to c8060fd Compare January 12, 2023 14:26
@stevendanna
Copy link
Collaborator Author

bors r=michae2

@craig
Copy link
Contributor

craig bot commented Jan 13, 2023

Build succeeded:

@craig craig bot merged commit 582d845 into cockroachdb:master Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

backupccl: write table stats in order in backup metadata sst
3 participants