catalog: remove unneeded descriptor PostDeserializationChanges and clone descriptors lazily #105382

Closed

@rafiss (Collaborator) commented Jun 22, 2023

A profile showed that the most expensive part of reading a large number
of descriptors is the fact that we clone each one unconditionally.

See individual commits.


catalog: remove unneeded table descriptor PostDeserializationChanges

#81310 included an upgrade job that ran post-deserialization changes on
all descriptors as part of the v22.2 upgrade. That, combined with the
fact that we only need to support restoring descriptors from the
previous major version, means that we can remove all these fixes and
backport the removal to 23.1.


catprivilege: remove unneeded privilege fixing logic


tabledesc: avoid cloning proto in PostDeserialization if possible

A profile showed that the most expensive part of reading a large number
of descriptors is the fact that we clone each one unconditionally.
However, most of the time, no changes are needed, so there's no need to
clone it.
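
A minimal sketch of the clone-only-when-needed pattern described above. The `TableDescriptor` type, its `Clone` method, and the `maybeFillInVersion` fix are hypothetical stand-ins for illustration, not the actual tabledesc code:

```go
package main

import "fmt"

// TableDescriptor is a hypothetical stand-in for the descriptor protobuf.
type TableDescriptor struct {
	Name    string
	Version uint32
	Columns []string
}

// Clone returns a deep copy. In the real code this would be a protobuf
// clone, which is the expensive step the profile pointed at.
func (d *TableDescriptor) Clone() *TableDescriptor {
	c := *d
	c.Columns = append([]string(nil), d.Columns...)
	return &c
}

// maybeFillInVersion is a hypothetical fix that upgrades Version 0 to 1.
// It checks first and clones only if it actually has to write, so the
// common "nothing to fix" path returns the input pointer unchanged.
func maybeFillInVersion(d *TableDescriptor) (out *TableDescriptor, changed bool) {
	if d.Version != 0 {
		return d, false // no change needed, no clone
	}
	out = d.Clone()
	out.Version = 1
	return out, true
}

func main() {
	upToDate := &TableDescriptor{Name: "t", Version: 3, Columns: []string{"a"}}
	same, changed := maybeFillInVersion(upToDate)
	fmt.Println(same == upToDate, changed)

	old := &TableDescriptor{Name: "u", Columns: []string{"a"}}
	fixed, changed := maybeFillInVersion(old)
	fmt.Println(fixed == old, changed, old.Version, fixed.Version)
}
```

The caller can treat the returned pointer as the descriptor to use either way; the input is never mutated, so sharing it with other readers stays safe.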


tabledesc: remove unneeded changes from RunRestoreChanges

Since we only support restoring backups from the previous major
version, we can remove this code.


scpb: make MigrateDescriptorState clone descriptor lazily

When running RunRestoreChanges, it is expensive to clone all
the descriptors. This refactor clones a descriptor only if it
needs to be modified.
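
One way to get this behavior when several migration steps run back to back is a small wrapper that clones on the first write and reuses that copy afterwards. The sketch below assumes a hypothetical `Descriptor` type and made-up steps (`renameLegacy`, `dropEmpty`); it is not the actual scpb code:

```go
package main

import "fmt"

// Descriptor is a hypothetical stand-in for a descriptor protobuf.
type Descriptor struct {
	ID      int
	Targets []string
}

// Clone deep-copies the descriptor (the expensive step we want to avoid).
func (d *Descriptor) Clone() *Descriptor {
	c := *d
	c.Targets = append([]string(nil), d.Targets...)
	return &c
}

// lazyMutable hands out a writable copy on first request and clones at most once.
type lazyMutable struct {
	orig   *Descriptor
	copied *Descriptor
	clones int
}

// Read returns the latest view: the copy if one exists, else the original.
func (l *lazyMutable) Read() *Descriptor {
	if l.copied != nil {
		return l.copied
	}
	return l.orig
}

// Mutable clones on first use; later calls reuse the same copy.
func (l *lazyMutable) Mutable() *Descriptor {
	if l.copied == nil {
		l.copied = l.orig.Clone()
		l.clones++
	}
	return l.copied
}

// renameLegacy is a hypothetical step: rewrite "legacy" targets in place.
func renameLegacy(l *lazyMutable) {
	for i, t := range l.Read().Targets {
		if t == "legacy" {
			l.Mutable().Targets[i] = "current"
		}
	}
}

// dropEmpty is a hypothetical step: remove empty targets, if there are any.
func dropEmpty(l *lazyMutable) {
	src := l.Read().Targets
	kept := make([]string, 0, len(src))
	for _, t := range src {
		if t != "" {
			kept = append(kept, t)
		}
	}
	if len(kept) != len(src) {
		l.Mutable().Targets = kept
	}
}

// migrate runs every step and reports whether a copy was made.
func migrate(d *Descriptor) (out *Descriptor, changed bool, clones int) {
	l := &lazyMutable{orig: d}
	for _, step := range []func(*lazyMutable){renameLegacy, dropEmpty} {
		step(l)
	}
	return l.Read(), l.copied != nil, l.clones
}

func main() {
	clean := &Descriptor{ID: 1, Targets: []string{"current"}}
	out, changed, clones := migrate(clean)
	fmt.Println(out == clean, changed, clones)

	dirty := &Descriptor{ID: 2, Targets: []string{"legacy", "", "x"}}
	out, changed, clones = migrate(dirty)
	fmt.Println(out.Targets, changed, clones, dirty.Targets)
}
```

Even when both steps need to write, the descriptor is cloned once, and a descriptor that needs no migration is returned as-is.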


dbdesc: remove unneeded PostDeserializationChanges

Remove the changes that are no longer needed since all descriptors were
fixed in the v22.2 upgrade.

For the ones that remain, change them to lazily clone the protobuf as
needed, rather than doing so unconditionally. Cloning is expensive, and
shows up in profiles when reading large numbers of descriptors.


catalog: lazily clone descriptor when setting ModificationTime

Since cloning a descriptor is expensive, we should only clone it if
a modification needs to be made.

Previous commits already made the change for tables and databases, so
this one covers functions, types, and schemas.
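
As a rough illustration of applying the same rule across several descriptor kinds, the sketch below uses one helper behind a small interface. Everything here is an assumption made for the example: the `modTimed` interface, the `FunctionDesc` and `SchemaDesc` types, the use of a plain `int64` as the timestamp, and the rule that the time is filled in only when it is unset.

```go
package main

import "fmt"

// modTimed is a hypothetical interface; the real descriptor types differ.
type modTimed interface {
	ModificationTime() int64
	cloneWithModificationTime(ts int64) modTimed
}

// FunctionDesc is a hypothetical function descriptor.
type FunctionDesc struct {
	Name    string
	ModTime int64
	Params  []string
}

func (f *FunctionDesc) ModificationTime() int64 { return f.ModTime }

func (f *FunctionDesc) cloneWithModificationTime(ts int64) modTimed {
	c := *f
	c.Params = append([]string(nil), f.Params...)
	c.ModTime = ts
	return &c
}

// SchemaDesc is a hypothetical schema descriptor.
type SchemaDesc struct {
	Name    string
	ModTime int64
}

func (s *SchemaDesc) ModificationTime() int64 { return s.ModTime }

func (s *SchemaDesc) cloneWithModificationTime(ts int64) modTimed {
	c := *s
	c.ModTime = ts
	return &c
}

// maybeSetModificationTime returns d untouched if its time is already set
// (assumed here to mean non-zero) and clones only when it must fill it in.
func maybeSetModificationTime(d modTimed, readTS int64) (modTimed, bool) {
	if d.ModificationTime() != 0 {
		return d, false
	}
	return d.cloneWithModificationTime(readTS), true
}

func main() {
	fn := &FunctionDesc{Name: "f", ModTime: 42}
	out, changed := maybeSetModificationTime(fn, 100)
	fmt.Println(out == modTimed(fn), changed)

	sc := &SchemaDesc{Name: "public"}
	out, changed = maybeSetModificationTime(sc, 100)
	fmt.Println(out.ModificationTime(), changed, sc.ModTime)
}
```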


Epic: None
Release note: None

@rafiss added the backport-23.1.x label Jun 22, 2023
@rafiss rafiss force-pushed the lazy-clone-during-post-deserialization branch from 7a67d56 to 459d22e Compare June 23, 2023 17:16
@rafiss removed the backport-23.1.x label Jun 23, 2023
@rafiss rafiss force-pushed the lazy-clone-during-post-deserialization branch 2 times, most recently from 9ec9009 to 83a7853 Compare June 26, 2023 17:49
rafiss added 7 commits June 27, 2023 01:37
cockroachdb#81310 included an upgrade
job that ran post deserialization changes on all descriptors, as part of
the v22.2 upgrade. That, combined with the fact that we only need to
support restoring descriptors from one major version ago, means that we
can remove this privilege fixing logic and backport the removal to 23.1.

Release note: None
@rafiss rafiss force-pushed the lazy-clone-during-post-deserialization branch from 83a7853 to 5d2b59c Compare June 27, 2023 05:37
@rafiss (Collaborator, Author) commented Aug 3, 2024

closing in favor of #127028

@rafiss rafiss closed this Aug 3, 2024
@rafiss rafiss deleted the lazy-clone-during-post-deserialization branch August 7, 2024 15:53