sql: replace internalLookupCtx with descs.Collection interfaces #64673

postamar · 2021-05-04T18:49:16Z

blathers-crl · 2021-05-04T18:49:18Z

Hi @postamar, please add a C-ategory label to your issue. Check out the label system docs.

While you're here, please consider adding an A- label to help keep our repository tidy.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.}

postamar · 2022-03-11T16:39:46Z

Relates to #64089 somewhat. The point is, there's no good reason for internalLookupCtx to exist anymore.

Previously, the schema changer workload could hit intermittent errors when descriptors were bound after being looked up using crdb_internal table. Unfortunately, the crdb_internal tables never lease out descriptors, so these schema could be pulled from under us. To address this, this patch will allow unknown schema errors for certain crdb_internal queries that are likely to observe this condition. Addressing issue cockroachdb#64673 would improve some of this behavior, and eliminate the need for the workaround here, since we would automatically get a retry error. Release note: None

82941: workload/schemachanger: allow unknown schema errors in certain contexts r=fqazi a=fqazi Previously, the schema changer workload could hit intermittent errors when descriptors were bound after being looked up using crdb_internal table. Unfortunately, the crdb_internal tables never lease out descriptors, so these schema could be pulled from under us. To address this, this patch will allow unknown schema errors for certain crdb_internal queries that are likely to observe this condition. Addressing issue #64673 would improve some of this behaviour, and eliminate the need for the workaround here, since we would automatically get a retry error. Release note: None Co-authored-by: Faizan Qazi <[email protected]>

ajwerner · 2022-06-28T00:54:21Z

We’ve made some progress on some of this. I’m going re-state some background and then get into some concrete steps. I'm certain there are important considerations I am forgetting, but this will have to serve as a start.

Background

The descs.Collection is, in general, the narrow-band for the catalog. The descs.Collection can be thought of as the session-scoped accessor to the catalog for “precise” lookups.

A precise lookup is a lookup which fully specifies either the fully qualified name of a descriptor, either with strings or with IDs or the ID of a descriptor. It is distinct from resolution which combines session properties and visibility rules due to privileges on top of precise lookup.

The Collection internally reads descriptors from a number of sources:

cockroach/pkg/sql/catalog/descs/collection.go

Lines 79 to 108 in 7f16831

    
           // virtualSchemas optionally holds the virtual schemas. 
        
           virtual virtualDescriptors 
        
           // A collection of descriptors valid for the timestamp. They are released once 
        
           // the transaction using them is complete. 
        
           leased leasedDescriptors 
        
           // Descriptors modified by the uncommitted transaction affiliated with this 
        
           // Collection. This allows a transaction to see its own modifications while 
        
           // bypassing the descriptor lease mechanism. The lease mechanism will have its 
        
           // own transaction to read the descriptor and will hang waiting for the 
        
           // uncommitted changes to the descriptor if this transaction is PRIORITY HIGH. 
        
           // These descriptors are local to this Collection and their state is thus not 
        
           // visible to other transactions. 
        
           uncommitted uncommittedDescriptors 
        
           // A collection of descriptors which were read from the store. 
        
           kv kvDescriptors 
        
           // syntheticDescriptors contains in-memory descriptors which override all 
        
           // other matching descriptors during immutable descriptor resolution (by name 
        
           // or by ID), but should not be written to disk. These support internal 
        
           // queries which need to use a special modified descriptor (e.g. validating 
        
           // non-public schema elements during a schema change). Attempting to resolve 
        
           // a mutable descriptor by name or ID when a matching synthetic descriptor 
        
           // exists is illegal. 
        
           synthetic syntheticDescriptors 
        
           // temporary contains logic to access temporary schema descriptors. 
        
           temporary temporaryDescriptors

Some of these sources are very poorly named. For this project there are two we want to focus on: kvDescriptors and uncommittedDescriptors.

`kvDescriptors`

In kvDescriptors we store descriptors which were read from the key-value store in large batches. These methods are predominantly used when listing all descriptors or all descriptors in some context.

cockroach/pkg/sql/catalog/descs/kv_descriptors.go

Lines 32 to 62 in 7f16831

    
           type kvDescriptors struct { 
        
           	codec keys.SQLCodec 
        
           	// systemNamespace is a cache of system table namespace entries. We assume 
        
           	// these are immutable for the life of the process. 
        
           	systemNamespace *systemDatabaseNamespaceCache 
        
           	// allDescriptors is a slice of all available descriptors. The descriptors 
        
           	// are cached to avoid repeated lookups by users like virtual tables. The 
        
           	// cache is purged whenever events would cause a scan of all descriptors to 
        
           	// return different values, such as when the txn timestamp changes or when 
        
           	// new descriptors are written in the txn. 
        
           	// 
        
           	// TODO(ajwerner): This cache may be problematic in clusters with very large 
        
           	// numbers of descriptors. 
        
           	allDescriptors allDescriptors 
        
           	// allDatabaseDescriptors is a slice of all available database descriptors. 
        
           	// These are purged at the same time as allDescriptors. 
        
           	allDatabaseDescriptors []catalog.DatabaseDescriptor 
        
           	// allSchemasForDatabase maps databaseID -> schemaID -> schemaName. 
        
           	// For each databaseID, all schemas visible under the database can be 
        
           	// observed. 
        
           	// These are purged at the same time as allDescriptors. 
        
           	allSchemasForDatabase map[descpb.ID]map[descpb.ID]string 
        
           	// memAcc is the actual account of an injected, upstream monitor 
        
           	// to track memory usage of kvDescriptors. 
        
           	memAcc mon.BoundAccount 
        
           }

One important call, for example, is (kd *kvDescriptors) getAllDescriptors. There we check to see if we've already read all of the descriptors. If we have, we return them, otherwise, we scan the entire descriptors table and populate.

You may or may not be surprised to know that this is what powers many pg_catalog/information_schema/crdb_internal virtual tables. However, before those things can utilize these descriptors, they build an auxiliary data structure: the *sql.internalLookupCtx.

`sql.InternalLookupCtx`

The sql.InternalLookupCtx is built from a set of descriptors and provides some methods use by the internal tables and elsewhere. It's defined here.

The primary place where this all comes together is: https://github.com/cockroachdb/cockroach/blob/bbbb658076b92f2b6ac97630e817bfac574a0606/pkg/sql/information_schema.go#L2365

`modifiedDescriptors`

The descs.Collection has another set of descriptors it reads from the KV-store: the modifiedDescriptors. This misnamed data structure holds descriptors which were read from the store via point lookups.

It can sometimes hold both mutable and immutable copies of descriptors. The collection offers a guarantee that when the same mutable descriptor is resolved and it is not written to the store in the meantime, then the same object in memory will be returned. The "immutable" copy which is returned at resolution time is the snapshot as of the last write.

Another note is that modifiedDescriptors come in different levels of validation. A major reason why a descriptor is retrieved is to validate the cross-references of some other descriptor which has been retrieved. We don't want to validate the entire connected component of descriptors whenever one is read, so we only validate the descriptors directly connected to the current descriptor. In order to make the number of lookups required to perform this graph validation inexpensive, we both batch lookups and retrieve descriptors and store them in memory for use if later they are needed explicitly. At that point, cross references will be validated, but the descriptor in question won't need to be read again.

Type hydration

Another topic related to this problem area is type hydration. When descriptors reference types, they need to have type metadata populated in those type fields. The type metadata is never serialized. We call the process of populating the type metadata in a descriptor type hydration.

One one hand, users of descriptors would very much like their descriptors to have their types hydrated. On another, the process of hydrating types in a descriptor is somewhat heavy in that it requires mutating the descriptor. For descriptors which may need to be hydrated with types of varying versions, the process of hydrating requires making copies.

In this way, the kvDescriptors are sort of nice: they get hydrated collectively when they are constructed. Users of that slice get pre-hydrated copies and we know they were their own copy because they were fetched from the key-value store.

The uncommittedDescriptors get hydrated when they get read via a more generic hydration path up the call stack. One oddity is that the table descriptors which are not written but were hydrated with the modified type do not get updated. In general this is fine because any change to the type descriptor must not have modified the table descriptor and so must not have affected it, but it is a little weird or perhaps unexpected.

`syntheticDescriptors`

This is another set of descriptors in the descs.Collection. These descriptors are in-memory only and are intended to override the value which has currently been written or exists in the key-value store.

These are generally used today for validating constrains or indexes which have not yet been published as public. There is hope that we can use these more to defer writes to the kv store and inject behavior into the local session which differs from the behavior implied by the written version.

Problems

A problem is that the kvDescriptors and the modifiedDescriptors are not unified. The kvDescriptors are unvalidated whereas the modifiedDescriptors are validate when their mutable or immutable entry are populated. An even bigger problem is that to make sure that we don't have an incoherent cache, we clear the kvDescriptors set every time any descriptor is modified. This means that in long schema migration transactions, we may re-read all of the descriptors many times.

A related problem is that due to the nature of descriptor leases, we may lease a different version of a descriptor than we subsequently read. Similarly, we may read a descriptor for kvDescriptors and then subsequently lease it at different version. Once we read a descriptor into modifiedDescriptors, we'll never again see a leased descriptor for that ID.

Another challenge is that not all internalLookupCtx objects are instantiated from a descs.Collection. In at least one case the internalLookupCtx is instantiated from a slice of descriptors from a backup manifest. For this, I think we'll need to inject these descriptors into a descs.Collection, perhaps using synthetic descriptors.

Goal

Extend and unify internal structure between the modifiedDescriptors and kvDescriptors so that the needs of the users of internalLookupCtx can be met and so that writes to descriptors does not need to invalidate the cache. This may require some bookkeeping on which descriptors have been fully read (which can serve as a negative cache for lookups, among other things).

ajwerner · 2022-06-28T14:00:06Z

One challenge may be the need to interleave some iteration between these different data structures. The existing usage of google/btree and its functional iteration style will make this tricky. We can replace the backing data structure if it's in the way.

postamar · 2022-06-28T16:52:28Z

Nice write-up, Andrew 👍

chengxiong-ruan · 2022-06-29T21:47:27Z

Wanted to clarify a bit on the modifiedDescriptor, it is from the the mutation visitor in declarative schema changer and populated when a descriptor is checked out for a schema change or as a reference dependency. Though it's true that things in it are from Collection and are validated.

Maybe by modifiedDescriptor we actually meanuncommittedDescriptors above.

chengxiong-ruan · 2022-06-29T22:14:37Z

Just to help myself and potentially @jasonmchan as well to understand better the problems to resolve here.
why we could re-read all of the descriptors many times in one schema change transaction, is there any schema change behavior require reading all descriptors? My understand is that GetAllDescriptor is only for listing information which requires all descriptors, like tables in crdb_internal.

chengxiong-ruan · 2022-07-06T18:19:40Z

Just to help myself and potentially @jasonmchan as well to understand better the problems to resolve here. why we could re-read all of the descriptors many times in one schema change transaction, is there any schema change behavior require reading all descriptors? My understand is that GetAllDescriptor is only for listing information which requires all descriptors, like tables in crdb_internal.

Chatted with @ajwerner that there is situation where user wants to list things from crdb_internal tables (which require reading all descriptors) after a schema change. For example you could add a column, and then want to list out table columns info from crdb_internal.table_columns table (it reads all descriptors, and all descriptor is used to constructed internalLookupCtx), and then you might repeat similar behaviors in the same transaction. Since we release the all descriptors data every time we write a schema change, every time we read from crdb_internal virtual tables all descriptors are reloaded again. One goal of this project is to make sure we do read all descriptors once instead of doing it many times.

83675: pgwire: allow Flush message during portal exeuction r=jordanlewis a=rafiss fixes #83613 Release note (bug fix): A Flush message sent during portal execution in the pgwire extended protocol no longer results in an error. 83800: sql/builtins: tenant-related builtins require admin role r=knz a=stevendanna The following builtins now require the admin role: - crdb_internal.create_tenant - crdb_internal.destroy_tenant - crdb_internal.gc_tenant - crdb_internal.update_tenant_resource_limits Release note (ops change): Tenant-related crdb_internal functions now require the admin role to use. 83838: workload/schemachanger: stabilize schema changer workload r=fqazi a=fqazi This pull request will do the following, to help stabilize the workload under CI: 1) Temporarily disable survive/primary region-related database alters to avoid (#83831) 2) Add retry logic for unknown schema errors which are due to (#64673) Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Steven Danna <[email protected]> Co-authored-by: Faizan Qazi <[email protected]>

Previously, the collection invalidated its view of descriptors read from KV after a descriptor was modified. This was necessary because the collection does not fully buffer writes to descriptors, so the descriptors stored in KV may change. This is problematic, because users of the `kvDescriptors` (virtual tables) must perform a round trip to refresh their descriptors cache after schema changes in the same transaction. To address this, this commit avoids invalidating the cache. Instead, when serving lookups that rely on the `kvDescriptors`, we will amend the cache with the contents of `uncommittedDescriptors` to ensure that a transaction sees its own modifications. Relates to cockroachdb#64673 Release note: None

Relates to cockroachdb#64673. Previously, the collection maintained two separate sources: `uncommittedDescriptors` and `kvDescriptors`. This was confusing because `uncommittedDescriptors` was intended to contain modified descriptors, but it also held descriptors cached from KV point lookups. This commit fixes this by introducing `storedDescriptors`, which is the combination of the above sources. This commit works towards a model where `storedDescriptors` is a mirror of the descriptors in KV. These descriptors were either cached after KV reads, or were modified by the associated transaction and should be written to KV upon commit. Now, all descriptors read from storage are stored in the same btree, eliminating possible duplication between `kvDescriptors` and `uncommittedDescriptors`. As a byproduct of this change, we are able to better leverage caching due to virtual table lookups within a transaction. First, we no longer need to invalidate batches of cached descriptors after schema changes. Second, point lookups after a range lookup will properly check the descriptors cached due to the range lookup. Release note: None

84442: descs: unify uncommitted and kv descriptors r=jasonmchan a=jasonmchan Relates to #64673. Previously, the collection maintained two separate sources: `uncommittedDescriptors` and `kvDescriptors`. This was confusing because `uncommittedDescriptors` was intended to contain modified descriptors, but it also held descriptors cached from KV point lookups. This commit fixes this by introducing `storedDescriptors`, which is the combination of the above sources. This commit works towards a model where `storedDescriptors` is a mirror of the descriptors in KV. These descriptors were either cached after KV reads, or were modified by the associated transaction and should be written to KV upon commit. Now, all descriptors read from storage are stored in the same btree, eliminating possible duplication between `kvDescriptors` and `uncommittedDescriptors`. As a byproduct of this change, we are able to better leverage caching due to virtual table lookups within a transaction. First, we no longer need to invalidate batches of cached descriptors after schema changes. Second, point lookups after a range lookup will properly check the descriptors cached due to the range lookup. New rttanalysis cases demonstrate round-trip reductions. Release note: None 84516: sql: disallow DESC option in the last column of inverted indexes r=mgartner a=mgartner The last column of inverted indexes cannot have the `DESC` direction. Prior to this commit it was allowed, but caused internal errors. To avoid propagating the notion that inverted indexes have a semantic direction, we stop printing the `ASC` option for inverted index columns in the output of `SHOW CREATE TABLE`. Fixes #84388 Release note (sql change): The last column of an INVERTED INDEX can no longer have the `DESC` option. If `DESC` was used in prior versions, it could cause internal errors. 84628: sql: remove the artifact of canceling the txn-scoped context r=yuzefovich a=yuzefovich This commit removes an old artifact of having a txn-scoped context cancellation that is performed when finishing the txn. As Andrei points out, this txn-scoped cancellation is likely a leftover from ancient times and is no longer needed. In particular, this also fixes the bug of using the span after it was finished (which would occur with high vmodule on `context.go` file). Fixes: #83739. Release note: None 84747: logictest: rename a config r=yuzefovich a=yuzefovich This commit renames `[email protected]` logic test config to `local-v1.1-at-v1.0-noupgrade` which aids with the refactoring of the logic test suite. Release note: None 84750: docs: fail to build `bnf` if not all files are declared in `OUTS` r=rail a=rickystewart Release note: None Co-authored-by: Jason Chan <[email protected]> Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Ricky Stewart <[email protected]>

postamar assigned ajwerner May 4, 2021

postamar added A-schema-changes A-schema-descriptors Relating to SQL table/db descriptor handling. and removed A-schema-changes labels May 4, 2021

jlinder added the T-sql-schema-deprecated Use T-sql-foundations instead label Jun 16, 2021

postamar unassigned ajwerner Mar 11, 2022

ajwerner mentioned this issue Mar 16, 2022

catkv: public and temp schemas should be included in nstree.Catalog returned by query #77160

Closed

ajwerner mentioned this issue Apr 9, 2022

sql: some schema changes use incorrect mutable descriptor #79704

Closed

fqazi mentioned this issue Jun 15, 2022

workload/schemachanger: allow unknown schema errors in certain contexts #82941

Merged

chengxiong-ruan assigned chengxiong-ruan and jasonmchan and unassigned chengxiong-ruan Jun 29, 2022

fqazi mentioned this issue Jul 5, 2022

workload/schemachanger: stabilize schema changer workload #83838

Merged

postamar mentioned this issue Jul 6, 2022

sql: internalLookupCtx should be constructed using a collection #83919

Closed

jasonmchan mentioned this issue Jul 11, 2022

descs: don't invalidate kvDescriptors cache #84187

Closed

jasonmchan mentioned this issue Jul 14, 2022

descs: unify uncommitted and kv descriptors #84442

Merged

exalate-issue-sync bot unassigned jasonmchan Oct 24, 2022

postamar closed this as completed Nov 10, 2022

exalate-issue-sync bot added T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) and removed T-sql-schema-deprecated Use T-sql-foundations instead labels May 10, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: replace internalLookupCtx with descs.Collection interfaces #64673

sql: replace internalLookupCtx with descs.Collection interfaces #64673

postamar commented May 4, 2021 •

edited by exalate-issue-sync bot

Loading

blathers-crl bot commented May 4, 2021

postamar commented Mar 11, 2022

ajwerner commented Jun 28, 2022

ajwerner commented Jun 28, 2022

postamar commented Jun 28, 2022

chengxiong-ruan commented Jun 29, 2022 •

edited

Loading

chengxiong-ruan commented Jun 29, 2022 •

edited

Loading

chengxiong-ruan commented Jul 6, 2022 •

edited

Loading

sql: replace internalLookupCtx with descs.Collection interfaces #64673

sql: replace internalLookupCtx with descs.Collection interfaces #64673

Comments

postamar commented May 4, 2021 • edited by exalate-issue-sync bot Loading

blathers-crl bot commented May 4, 2021

postamar commented Mar 11, 2022

ajwerner commented Jun 28, 2022

Background

kvDescriptors

sql.InternalLookupCtx

modifiedDescriptors

Type hydration

syntheticDescriptors

Problems

Goal

ajwerner commented Jun 28, 2022

postamar commented Jun 28, 2022

chengxiong-ruan commented Jun 29, 2022 • edited Loading

chengxiong-ruan commented Jun 29, 2022 • edited Loading

chengxiong-ruan commented Jul 6, 2022 • edited Loading

postamar commented May 4, 2021 •

edited by exalate-issue-sync bot

Loading

`kvDescriptors`

`sql.InternalLookupCtx`

`modifiedDescriptors`

`syntheticDescriptors`

chengxiong-ruan commented Jun 29, 2022 •

edited

Loading

chengxiong-ruan commented Jun 29, 2022 •

edited

Loading

chengxiong-ruan commented Jul 6, 2022 •

edited

Loading