sql/catalog/descs: use allDescriptors as a negative cache for by-ID lookups #61820

ajwerner · 2021-03-11T00:07:30Z

When doing introspection queries we tend to pull and cache the complete
set of descriptors. This powers subsequent calls to various virtual
tables. However, we also often need to grab descriptors by ID. This
happens in some of the virtual indexes and it especially happens in
the call to pg_table_is_visible. When that function was called on
all of the oids in pg_class it was forced to make a bunch of kv lookups;
namely one for each and every index and virtual table (depending on
the query plan). We can short-circuit all of that with a map of descriptors
we are already holding in memory.

Before this change, the django introspection query was doing O(indexes) lookups
and now it is doing 2.

Release note (performance improvement): Reduce the number of round-trips
required to call pg_table_is_visible in the context of pg_catalog
queries.

cockroach-teamcity · 2021-03-11T00:07:37Z

This change is

rafiss

Before this change, the django introspection query was doing O(indexes) lookups
and now it is doing 2.

hmm did something change? when i first added the django introspection benchmark, the number of lookups was constant in the number of tables

change looks good though! just had small comments and confusion on what made the number of lookups increase for pg_table_is_visible

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)

pkg/sql/catalog/descs/collection.go, line 1029 at r1 (raw file):

		}

		if tc.allDescriptors != nil {

should this be tc.allDescriptorsByID != nil?

pkg/sql/catalog/descs/collection.go, line 1030 at r1 (raw file):

		if tc.allDescriptors != nil {
			if _, exists := tc.allDescriptorsByID[id]; !exists {

since the descriptor value is never used, could this be a map[id]struct{} to save memory?

also, which makes me think, do we need memory accounting on these caches? (would be an issue to resolve in the future, not in this PR)

rafiss · 2021-03-11T07:04:56Z

oh i guess the roundtrips increased because of #61493 ? i am less confused now!

pkg/sql/catalog/descs/collection.go

…ookups When doing introspection queries we tend to pull and cache the complete set of descriptors. This powers subsequent calls to various virtual tables. However, we also often need to grab descriptors by ID. This happens in some of the virtual indexes and it especially happens in the call to `pg_table_is_visible`. When that function was called on all of the oids in pg_class it was forced to make a bunch of kv lookups; namely one for each and every index and virtual table (depending on the query plan). We can short-circuit all of that with a map of descriptors we are already holding in memory. Release note (performance improvement): Reduce the number of round-trips required to call `pg_table_is_visible` in the context of pg_catalog queries.

ajwerner

I crafted a little abstraction around all of this. This should be backportable. I also sprinkled some TODO that I very much am planning on doing in the 21.2 cycle.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)

ajwerner

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @rafiss)

pkg/sql/catalog/descs/collection.go, line 188 at r1 (raw file):

Previously, postamar (Marius Posta) wrote…

Gratuitous bikeshed: I believe @rafiss 's comments could be addressed by getting rid of allDescriptors altogether and using allDescriptorsByID instead. This means rewriting GetAllDescriptors so that it returns an ordered slice, which should be fine. If it isn't, replace allDescriptorsByID with a catalog.DescriptorIDSet perhaps? Duplicate sources-of-truth make me uncomfortable, I guess.

in due time I'll rework all of this but for now I'm trying to do something minimal for this release.

pkg/sql/catalog/descs/collection.go, line 1030 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

since the descriptor value is never used, could this be a map[id]struct{} to save memory?

also, which makes me think, do we need memory accounting on these caches? (would be an issue to resolve in the future, not in this PR)

Given that the descriptors and very big, I don't care that much about one byte vs one word vs two words. I changed it to an int which is a word. 7 bytes per descriptor when descriptors are hundreds doesn't feel very compelling to me right now.

rafiss

lgtm! is a backport to 21.1 needed too?

ajwerner · 2021-03-15T04:24:01Z

TFTR!

bors r+

craig · 2021-03-15T05:06:54Z

Build succeeded:

GitHub CI (Cockroach)

ajwerner requested review from rafiss and postamar March 11, 2021 00:07

rafiss requested changes Mar 11, 2021

View reviewed changes

postamar reviewed Mar 11, 2021

View reviewed changes

pkg/sql/catalog/descs/collection.go Outdated Show resolved Hide resolved

ajwerner force-pushed the ajwerner/negative-cache-by-id-with-get-all-descriptors branch from 74bb820 to 7574bfd Compare March 11, 2021 17:28

ajwerner commented Mar 11, 2021

View reviewed changes

postamar approved these changes Mar 11, 2021

View reviewed changes

rafiss approved these changes Mar 12, 2021

View reviewed changes

ajwerner added the backport-21.1.x label Mar 15, 2021

craig bot merged commit f3c8f6e into cockroachdb:master Mar 15, 2021

ajwerner mentioned this pull request Mar 16, 2021

bench/ddl_analysis: fix subtests #62038

Merged

rafiss added this to the 21.1 milestone Apr 22, 2021

ajwerner mentioned this pull request May 27, 2021

release-21.1: sql/catalog/descs: use allDescriptors as a negative cache for by-ID lookups #65807

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql/catalog/descs: use allDescriptors as a negative cache for by-ID lookups #61820

sql/catalog/descs: use allDescriptors as a negative cache for by-ID lookups #61820

ajwerner commented Mar 11, 2021

cockroach-teamcity commented Mar 11, 2021

rafiss left a comment

rafiss commented Mar 11, 2021

ajwerner left a comment

ajwerner left a comment

rafiss left a comment

ajwerner commented Mar 15, 2021

craig bot commented Mar 15, 2021

sql/catalog/descs: use allDescriptors as a negative cache for by-ID lookups #61820

sql/catalog/descs: use allDescriptors as a negative cache for by-ID lookups #61820

Conversation

ajwerner commented Mar 11, 2021

cockroach-teamcity commented Mar 11, 2021

rafiss left a comment

Choose a reason for hiding this comment

rafiss commented Mar 11, 2021

ajwerner left a comment

Choose a reason for hiding this comment

ajwerner left a comment

Choose a reason for hiding this comment

rafiss left a comment

Choose a reason for hiding this comment

ajwerner commented Mar 15, 2021

craig bot commented Mar 15, 2021