sql: replace internalLookupCtx with descs.Collection interfaces #64673
Hi @postamar, please add a C-ategory label to your issue. Check out the label system docs. While you're here, please consider adding an A- label to help keep our repository tidy. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
Relates to #64089 somewhat. The point is, there's no good reason for |
Previously, the schema changer workload could hit intermittent errors when descriptors were bound after being looked up using crdb_internal tables. Unfortunately, the crdb_internal tables never lease out descriptors, so these schemas could be pulled out from under us. To address this, this patch will allow unknown schema errors for certain crdb_internal queries that are likely to observe this condition. Addressing issue cockroachdb#64673 would improve some of this behavior, and eliminate the need for the workaround here, since we would automatically get a retry error. Release note: None
82941: workload/schemachanger: allow unknown schema errors in certain contexts r=fqazi a=fqazi Previously, the schema changer workload could hit intermittent errors when descriptors were bound after being looked up using crdb_internal tables. Unfortunately, the crdb_internal tables never lease out descriptors, so these schemas could be pulled out from under us. To address this, this patch will allow unknown schema errors for certain crdb_internal queries that are likely to observe this condition. Addressing issue #64673 would improve some of this behaviour, and eliminate the need for the workaround here, since we would automatically get a retry error. Release note: None Co-authored-by: Faizan Qazi <[email protected]>
We’ve made some progress on some of this. I’m going to re-state some background and then get into some concrete steps. I'm certain there are important considerations I am forgetting, but this will have to serve as a start.
Background
The
A precise lookup is a lookup which fully specifies either the fully qualified name of a descriptor (with strings or with IDs) or the ID of a descriptor. It is distinct from resolution, which combines session properties and visibility rules due to privileges on top of precise lookup.
The
cockroach/pkg/sql/catalog/descs/collection.go Lines 79 to 108 in 7f16831
Some of these sources are very poorly named. For this project there are two we want to focus on:
type kvDescriptors struct {
	codec keys.SQLCodec
	// systemNamespace is a cache of system table namespace entries. We assume
	// these are immutable for the life of the process.
	systemNamespace *systemDatabaseNamespaceCache
	// allDescriptors is a slice of all available descriptors. The descriptors
	// are cached to avoid repeated lookups by users like virtual tables. The
	// cache is purged whenever events would cause a scan of all descriptors to
	// return different values, such as when the txn timestamp changes or when
	// new descriptors are written in the txn.
	//
	// TODO(ajwerner): This cache may be problematic in clusters with very large
	// numbers of descriptors.
	allDescriptors allDescriptors
	// allDatabaseDescriptors is a slice of all available database descriptors.
	// These are purged at the same time as allDescriptors.
	allDatabaseDescriptors []catalog.DatabaseDescriptor
	// allSchemasForDatabase maps databaseID -> schemaID -> schemaName.
	// For each databaseID, all schemas visible under the database can be
	// observed.
	// These are purged at the same time as allDescriptors.
	allSchemasForDatabase map[descpb.ID]map[descpb.ID]string
	// memAcc is the actual account of an injected, upstream monitor
	// to track memory usage of kvDescriptors.
	memAcc mon.BoundAccount
}
One important call, for example, is `(kd *kvDescriptors) getAllDescriptors`. There we check to see if we've already read all of the descriptors. If we have, we return them; otherwise, we scan the entire descriptors table and populate the cache.
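The check-then-scan behavior described here can be sketched as follows. This is a minimal illustration and not the CockroachDB code: `descriptor`, `allDescsCache`, `getAll`, `invalidate`, and the `scans` counter are hypothetical names, and the scan is a stub rather than a read of the descriptors table.

```go
package main

import "fmt"

// descriptor is a hypothetical stand-in for a catalog descriptor.
type descriptor struct {
	ID   int
	Name string
}

// allDescsCache memoizes one full scan of the descriptor table.
type allDescsCache struct {
	scan   func() []descriptor // hypothetical full-table scan
	descs  []descriptor
	loaded bool
	scans  int // number of full scans performed, for illustration only
}

// getAll returns the cached descriptors and scans only on a cache miss.
func (c *allDescsCache) getAll() []descriptor {
	if !c.loaded {
		c.descs = c.scan()
		c.loaded = true
		c.scans++
	}
	return c.descs
}

// invalidate drops the cached scan, as would happen after a descriptor
// write or a change of the transaction timestamp.
func (c *allDescsCache) invalidate() {
	c.descs, c.loaded = nil, false
}

func main() {
	c := &allDescsCache{scan: func() []descriptor {
		return []descriptor{{ID: 52, Name: "foo"}, {ID: 53, Name: "bar"}}
	}}
	c.getAll()
	c.getAll() // served from the cache
	c.invalidate()
	c.getAll() // forces a second full scan
	fmt.Println("full scans:", c.scans) // prints "full scans: 2"
}
```

Every call to `invalidate` costs one more full scan on the next read, which is the cost the rest of this write-up is concerned with.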
You may or may not be surprised to know that this is what powers many `pg_catalog`/`information_schema`/`crdb_internal` virtual tables. However, before those things can utilize these descriptors, they build an auxiliary data structure: the `*sql.internalLookupCtx`.
sql.InternalLookupCtx
The `sql.InternalLookupCtx` is built from a set of descriptors and provides some methods used by the internal tables and elsewhere. It's defined here.
The primary place where this all comes together is: https://github.com/cockroachdb/cockroach/blob/bbbb658076b92f2b6ac97630e817bfac574a0606/pkg/sql/information_schema.go#L2365
modifiedDescriptors
The `descs.Collection` has another set of descriptors it reads from the KV store: the `modifiedDescriptors`. This misnamed data structure holds descriptors which were read from the store via point lookups.
It can sometimes hold both mutable and immutable copies of descriptors. The collection offers a guarantee that when the same mutable descriptor is resolved more than once and it has not been written to the store in the meantime, the same in-memory object is returned. The "immutable" copy which is returned at resolution time is the snapshot as of the last write.
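A small sketch of that guarantee, under simplifying assumptions: `desc`, `entry`, `collection`, `getMutable`, `getImmutable`, and `write` are hypothetical names, and the sketch does not model what happens to pointer identity after a write.

```go
package main

import "fmt"

// desc is a hypothetical, simplified descriptor.
type desc struct {
	ID   int
	Name string
}

// entry pairs the mutable working copy with the snapshot as of the last write.
type entry struct {
	mutable   *desc
	immutable desc
}

// collection is a hypothetical stand-in for the per-transaction cache.
type collection struct {
	entries map[int]*entry
}

func newCollection(stored ...desc) *collection {
	c := &collection{entries: make(map[int]*entry)}
	for _, d := range stored {
		working := d // separate copy so the snapshot is never aliased
		c.entries[d.ID] = &entry{mutable: &working, immutable: d}
	}
	return c
}

// getMutable returns the same pointer on every call for a given ID.
func (c *collection) getMutable(id int) *desc {
	if e, ok := c.entries[id]; ok {
		return e.mutable
	}
	return nil
}

// getImmutable returns the snapshot taken at the last write.
func (c *collection) getImmutable(id int) (desc, bool) {
	e, ok := c.entries[id]
	if !ok {
		return desc{}, false
	}
	return e.immutable, true
}

// write records the current mutable state as the new snapshot.
func (c *collection) write(id int) {
	if e, ok := c.entries[id]; ok {
		e.immutable = *e.mutable
	}
}

func main() {
	c := newCollection(desc{ID: 52, Name: "foo"})
	m := c.getMutable(52)
	m.Name = "bar"
	before, _ := c.getImmutable(52)
	c.write(52)
	after, _ := c.getImmutable(52)
	fmt.Println(m == c.getMutable(52), before.Name, after.Name) // prints "true foo bar"
}
```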
Another note is that `modifiedDescriptors` come in different levels of validation. A major reason why a descriptor is retrieved is to validate the cross-references of some other descriptor which has been retrieved. We don't want to validate the entire connected component of descriptors whenever one is read, so we only validate the descriptors directly connected to the current descriptor. To keep this graph validation inexpensive, we batch lookups and keep the retrieved descriptors in memory in case they are later needed explicitly. At that point, their cross-references will be validated, but the descriptor in question won't need to be read again.
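The interplay of validation levels and batched lookups can be sketched like this. Everything here is illustrative: the two levels, the `store` type, and the `batches` counter are assumptions made for the example, and "validation" is reduced to checking that referenced descriptors exist.

```go
package main

import "fmt"

type level int

const (
	selfValidated  level = iota // read and checked in isolation
	crossValidated              // direct references checked as well
)

type desc struct {
	ID   int
	Refs []int // IDs of directly referenced descriptors
}

type cached struct {
	d   desc
	lvl level
}

type store struct {
	kv      map[int]desc // hypothetical backing store
	cache   map[int]*cached
	batches int // number of batched reads issued
}

// fetch reads any uncached IDs in a single batch.
func (s *store) fetch(ids []int) {
	var missing []int
	for _, id := range ids {
		if _, ok := s.cache[id]; !ok {
			missing = append(missing, id)
		}
	}
	if len(missing) == 0 {
		return
	}
	s.batches++
	for _, id := range missing {
		if d, ok := s.kv[id]; ok {
			s.cache[id] = &cached{d: d, lvl: selfValidated}
		}
	}
}

// get returns a descriptor whose direct references have been validated.
func (s *store) get(id int) (desc, error) {
	s.fetch([]int{id})
	c, ok := s.cache[id]
	if !ok {
		return desc{}, fmt.Errorf("descriptor %d not found", id)
	}
	if c.lvl < crossValidated {
		s.fetch(c.d.Refs) // neighbors are read once and kept for later use
		for _, ref := range c.d.Refs {
			if _, ok := s.cache[ref]; !ok {
				return desc{}, fmt.Errorf("descriptor %d references missing %d", id, ref)
			}
		}
		c.lvl = crossValidated
	}
	return c.d, nil
}

func main() {
	s := &store{
		kv:    map[int]desc{1: {ID: 1, Refs: []int{2, 3}}, 2: {ID: 2, Refs: []int{3}}, 3: {ID: 3}},
		cache: map[int]*cached{},
	}
	if _, err := s.get(1); err != nil { // two batches: the descriptor, then its neighbors
		panic(err)
	}
	if _, err := s.get(2); err != nil { // no new reads: 2 and 3 are already in memory
		panic(err)
	}
	fmt.Println("batches:", s.batches) // prints "batches: 2"
}
```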
Type hydration
Another topic related to this problem area is type hydration. When descriptors reference types, they need to have type metadata populated in those type fields. The type metadata is never serialized. We call the process of populating the type metadata in a descriptor type hydration.
On one hand, users of descriptors would very much like their descriptors to have their types hydrated. On the other, the process of hydrating types in a descriptor is somewhat heavy in that it requires mutating the descriptor. For descriptors which may need to be hydrated with types of varying versions, the process of hydrating requires making copies.
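To make the copy-then-mutate cost concrete, here is a minimal sketch of hydration. The `table`, `column`, and `typeMeta` types and the `hydrate` function are hypothetical simplifications; the real descriptors and type metadata are considerably richer.

```go
package main

import "fmt"

// typeMeta is in-memory type metadata that is never serialized.
type typeMeta struct {
	Name    string
	Version int
}

type column struct {
	Name   string
	TypeID int       // 0 means "no user-defined type" in this sketch
	Meta   *typeMeta // nil until hydrated
}

type table struct {
	Name string
	Cols []column
}

// hydrate returns a copy of t with type metadata filled in, leaving t untouched
// so that other users of t can hydrate it with a different type version.
func hydrate(t table, lookup func(id int) (typeMeta, bool)) (table, error) {
	out := table{Name: t.Name, Cols: make([]column, len(t.Cols))}
	copy(out.Cols, t.Cols)
	for i := range out.Cols {
		col := &out.Cols[i]
		if col.TypeID == 0 {
			continue
		}
		meta, ok := lookup(col.TypeID)
		if !ok {
			return table{}, fmt.Errorf("type %d not found", col.TypeID)
		}
		col.Meta = &meta
	}
	return out, nil
}

func main() {
	types := map[int]typeMeta{100: {Name: "status", Version: 2}}
	lookup := func(id int) (typeMeta, bool) {
		m, ok := types[id]
		return m, ok
	}
	shared := table{Name: "orders", Cols: []column{{Name: "id"}, {Name: "state", TypeID: 100}}}
	hydrated, err := hydrate(shared, lookup)
	if err != nil {
		panic(err)
	}
	fmt.Println(shared.Cols[1].Meta == nil, hydrated.Cols[1].Meta.Name) // prints "true status"
}
```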
In this way, the `kvDescriptors` are sort of nice: they get hydrated collectively when they are constructed. Users of that slice get pre-hydrated copies, and we know each one is its own copy because it was fetched from the key-value store.
The `uncommittedDescriptors` get hydrated when they get read via a more generic hydration path up the call stack. One oddity is that the table descriptors which are not written but were hydrated with the modified type do not get updated. In general this is fine because any change to the type descriptor must not have modified the table descriptor and so must not have affected it, but it is a little weird or perhaps unexpected.
syntheticDescriptors
This is another set of descriptors in the `descs.Collection`. These descriptors are in-memory only and are intended to override the value which has currently been written or exists in the key-value store.
These are generally used today for validating constraints or indexes which have not yet been published as public. There is hope that we can use these more to defer writes to the KV store and inject behavior into the local session which differs from the behavior implied by the written version.
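The override semantics amount to a layered lookup in which the synthetic layer is consulted first. The sketch below assumes two plain maps and a hypothetical `IndexPublic` flag; it is only meant to show the precedence, not the actual `descs.Collection` lookup path.

```go
package main

import "fmt"

// desc is a hypothetical, simplified table descriptor.
type desc struct {
	ID          int
	IndexPublic bool // whether a new index is visible to this session
}

type collection struct {
	synthetic map[int]desc // in-memory overrides, never written
	stored    map[int]desc // what has been written or read from KV
}

// get prefers a synthetic descriptor over the stored one.
func (c *collection) get(id int) (desc, bool) {
	if d, ok := c.synthetic[id]; ok {
		return d, true
	}
	d, ok := c.stored[id]
	return d, ok
}

func main() {
	c := &collection{
		synthetic: map[int]desc{},
		stored:    map[int]desc{52: {ID: 52, IndexPublic: false}},
	}
	// Pretend the index is public for this session only, e.g. to validate it.
	c.synthetic[52] = desc{ID: 52, IndexPublic: true}
	d, _ := c.get(52)
	fmt.Println(d.IndexPublic, c.stored[52].IndexPublic) // prints "true false"
}
```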
Problems
A problem is that the `kvDescriptors` and the `modifiedDescriptors` are not unified. The `kvDescriptors` are unvalidated whereas the `modifiedDescriptors` are validated when their `mutable` or `immutable` entries are populated. An even bigger problem is that to make sure that we don't have an incoherent cache, we clear the `kvDescriptors` set every time any descriptor is modified. This means that in long schema migration transactions, we may re-read all of the descriptors many times.
A related problem is that due to the nature of descriptor leases, we may lease a different version of a descriptor than we subsequently read. Similarly, we may read a descriptor into `kvDescriptors` and then subsequently lease it at a different version. Once we read a descriptor into `modifiedDescriptors`, we'll never again see a leased descriptor for that ID.
Another challenge is that not all `internalLookupCtx` objects are instantiated from a `descs.Collection`. In at least one case the `internalLookupCtx` is instantiated from a slice of descriptors from a backup manifest. For this, I think we'll need to inject these descriptors into a `descs.Collection`, perhaps using synthetic descriptors.
Goal
Extend and unify the internal structure between the `modifiedDescriptors` and `kvDescriptors` so that the needs of the users of `internalLookupCtx` can be met and so that writes to descriptors do not need to invalidate the cache. This may require some bookkeeping on which descriptors have been fully read (which can serve as a negative cache for lookups, among other things).
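One way to picture the "fully read" bookkeeping is a single cache with an `allRead` flag. This is a sketch under assumptions, not a proposal for the actual structure: `store`, `readAll`, `get`, `write`, and the `kvReads` counter are all hypothetical, and descriptors are reduced to names.

```go
package main

import "fmt"

type store struct {
	kv      map[int]string // hypothetical backing store: ID -> name
	cache   map[int]string // everything read or written in this txn
	allRead bool           // true once a full scan has populated cache
	kvReads int            // round trips to the backing store
}

// readAll scans the backing store at most once per transaction.
func (s *store) readAll() {
	if s.allRead {
		return
	}
	s.kvReads++
	for id, name := range s.kv {
		if _, ok := s.cache[id]; !ok { // keep the txn's own writes
			s.cache[id] = name
		}
	}
	s.allRead = true
}

// get serves point lookups from the cache when possible.
func (s *store) get(id int) (string, bool) {
	if name, ok := s.cache[id]; ok {
		return name, true
	}
	if s.allRead {
		return "", false // negative cache: known not to exist
	}
	s.kvReads++
	name, ok := s.kv[id]
	if ok {
		s.cache[id] = name
	}
	return name, ok
}

// write buffers a modification without invalidating anything.
func (s *store) write(id int, name string) {
	s.cache[id] = name
}

func main() {
	s := &store{kv: map[int]string{52: "foo"}, cache: map[int]string{}}
	s.readAll()
	_, found := s.get(99) // answered from the negative cache
	s.write(99, "new")
	s.readAll() // no rescan after the write
	name, _ := s.get(99)
	fmt.Println(found, name, s.kvReads) // prints "false new 1"
}
```

The point of the flag is that a write no longer forces a rescan: the cache stays authoritative for the transaction, and absent IDs are answered without a round trip.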
One challenge may be the need to interleave some iteration between these different data structures. The existing usage of |
Nice write-up, Andrew 👍 |
Wanted to clarify a bit on the Maybe by |
Just to help myself and potentially @jasonmchan as well to understand better the problems to resolve here. |
Chatted with @ajwerner that there is situation where user wants to list things from |
83675: pgwire: allow Flush message during portal exeuction r=jordanlewis a=rafiss fixes #83613 Release note (bug fix): A Flush message sent during portal execution in the pgwire extended protocol no longer results in an error. 83800: sql/builtins: tenant-related builtins require admin role r=knz a=stevendanna The following builtins now require the admin role: - crdb_internal.create_tenant - crdb_internal.destroy_tenant - crdb_internal.gc_tenant - crdb_internal.update_tenant_resource_limits Release note (ops change): Tenant-related crdb_internal functions now require the admin role to use. 83838: workload/schemachanger: stabilize schema changer workload r=fqazi a=fqazi This pull request will do the following, to help stabilize the workload under CI: 1) Temporarily disable survive/primary region-related database alters to avoid (#83831) 2) Add retry logic for unknown schema errors which are due to (#64673) Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Steven Danna <[email protected]> Co-authored-by: Faizan Qazi <[email protected]>
Previously, the collection invalidated its view of descriptors read from KV after a descriptor was modified. This was necessary because the collection does not fully buffer writes to descriptors, so the descriptors stored in KV may change. This is problematic, because users of the `kvDescriptors` (virtual tables) must perform a round trip to refresh their descriptors cache after schema changes in the same transaction. To address this, this commit avoids invalidating the cache. Instead, when serving lookups that rely on the `kvDescriptors`, we will amend the cache with the contents of `uncommittedDescriptors` to ensure that a transaction sees its own modifications. Relates to cockroachdb#64673 Release note: None
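The "amend instead of invalidate" idea from this commit message can be sketched as an overlay at read time. The function name `mergedView` and the simplified `desc` type are assumptions for illustration; dropped descriptors and name-based lookups are not modeled.

```go
package main

import (
	"fmt"
	"sort"
)

// desc is a hypothetical, simplified descriptor.
type desc struct {
	ID      int
	Name    string
	Version int
}

// mergedView overlays the transaction's uncommitted descriptors on top of
// the cached KV scan so that the transaction sees its own modifications.
func mergedView(cached []desc, uncommitted map[int]desc) []desc {
	byID := make(map[int]desc, len(cached)+len(uncommitted))
	for _, d := range cached {
		byID[d.ID] = d
	}
	for id, d := range uncommitted {
		byID[id] = d // uncommitted state wins over the cached copy
	}
	out := make([]desc, 0, len(byID))
	for _, d := range byID {
		out = append(out, d)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ID < out[j].ID })
	return out
}

func main() {
	cached := []desc{{ID: 52, Name: "foo", Version: 1}, {ID: 53, Name: "bar", Version: 1}}
	uncommitted := map[int]desc{
		53: {ID: 53, Name: "bar_renamed", Version: 2}, // modified in this txn
		54: {ID: 54, Name: "baz", Version: 1},         // created in this txn
	}
	for _, d := range mergedView(cached, uncommitted) {
		fmt.Println(d.ID, d.Name, d.Version)
	}
}
```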
Relates to cockroachdb#64673. Previously, the collection maintained two separate sources: `uncommittedDescriptors` and `kvDescriptors`. This was confusing because `uncommittedDescriptors` was intended to contain modified descriptors, but it also held descriptors cached from KV point lookups. This commit fixes this by introducing `storedDescriptors`, which is the combination of the above sources. This commit works towards a model where `storedDescriptors` is a mirror of the descriptors in KV. These descriptors were either cached after KV reads, or were modified by the associated transaction and should be written to KV upon commit. Now, all descriptors read from storage are stored in the same btree, eliminating possible duplication between `kvDescriptors` and `uncommittedDescriptors`. As a byproduct of this change, we are able to better leverage caching due to virtual table lookups within a transaction. First, we no longer need to invalidate batches of cached descriptors after schema changes. Second, point lookups after a range lookup will properly check the descriptors cached due to the range lookup. Release note: None
84442: descs: unify uncommitted and kv descriptors r=jasonmchan a=jasonmchan Relates to #64673. Previously, the collection maintained two separate sources: `uncommittedDescriptors` and `kvDescriptors`. This was confusing because `uncommittedDescriptors` was intended to contain modified descriptors, but it also held descriptors cached from KV point lookups. This commit fixes this by introducing `storedDescriptors`, which is the combination of the above sources. This commit works towards a model where `storedDescriptors` is a mirror of the descriptors in KV. These descriptors were either cached after KV reads, or were modified by the associated transaction and should be written to KV upon commit. Now, all descriptors read from storage are stored in the same btree, eliminating possible duplication between `kvDescriptors` and `uncommittedDescriptors`. As a byproduct of this change, we are able to better leverage caching due to virtual table lookups within a transaction. First, we no longer need to invalidate batches of cached descriptors after schema changes. Second, point lookups after a range lookup will properly check the descriptors cached due to the range lookup. New rttanalysis cases demonstrate round-trip reductions. Release note: None 84516: sql: disallow DESC option in the last column of inverted indexes r=mgartner a=mgartner The last column of inverted indexes cannot have the `DESC` direction. Prior to this commit it was allowed, but caused internal errors. To avoid propagating the notion that inverted indexes have a semantic direction, we stop printing the `ASC` option for inverted index columns in the output of `SHOW CREATE TABLE`. Fixes #84388 Release note (sql change): The last column of an INVERTED INDEX can no longer have the `DESC` option. If `DESC` was used in prior versions, it could cause internal errors. 
84628: sql: remove the artifact of canceling the txn-scoped context r=yuzefovich a=yuzefovich This commit removes an old artifact of having a txn-scoped context cancellation that is performed when finishing the txn. As Andrei points out, this txn-scoped cancellation is likely a leftover from ancient times and is no longer needed. In particular, this also fixes the bug of using the span after it was finished (which would occur with high vmodule on `context.go` file). Fixes: #83739. Release note: None 84747: logictest: rename a config r=yuzefovich a=yuzefovich This commit renames `[email protected]` logic test config to `local-v1.1-at-v1.0-noupgrade` which aids with the refactoring of the logic test suite. Release note: None 84750: docs: fail to build `bnf` if not all files are declared in `OUTS` r=rail a=rickystewart Release note: None Co-authored-by: Jason Chan <[email protected]> Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Ricky Stewart <[email protected]>
Jira issue: CRDB-7192