sql: rationalize the handling of comments #95431

Closed
wants to merge 1 commit into from
2 changes: 2 additions & 0 deletions pkg/sql/catalog/catalogkeys/keys.go
@@ -45,6 +45,8 @@ type CommentType int
//go:generate stringer --type CommentType

// Note: please add the new comment types to AllCommentTypes as well.
// Note: do not change the numeric values of this enum -- they correspond
// to stored values in system.comments.
const (
// DatabaseCommentType comment on a database.
DatabaseCommentType CommentType = 0
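The note about not changing numeric values can be illustrated with a small sketch. It assumes a hypothetical `commentType` that mirrors `CommentType`; only the database value (0) appears in the diff above, and the other names and values are invented for illustration. The idea is to write the values explicitly and pin them, so that a reordering cannot silently renumber what is already stored.

```go
package main

import "fmt"

// commentType is a hypothetical stand-in for catalogkeys.CommentType.
type commentType int

// Values are written out explicitly rather than derived with iota, so that
// inserting or reordering a constant cannot renumber persisted values.
// Only databaseComment = 0 comes from the diff; the rest is invented.
const (
	databaseComment commentType = 0
	exampleCommentA commentType = 1
	exampleCommentB commentType = 2
)

// pinnedValues records the integer assumed to be stored for each constant.
var pinnedValues = map[commentType]int{
	databaseComment: 0,
	exampleCommentA: 1,
	exampleCommentB: 2,
}

// checkStable returns an error if a constant no longer matches its pin.
func checkStable() error {
	for ct, want := range pinnedValues {
		if int(ct) != want {
			return fmt.Errorf("comment type %d: pinned value is %d", ct, want)
		}
	}
	return nil
}

func main() {
	if err := checkStable(); err != nil {
		panic(err)
	}
	fmt.Println("comment type values match their pinned values")
}
```

A check of this kind would typically live in a unit test next to the enum.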
36 changes: 34 additions & 2 deletions pkg/sql/catalog/descs/collection.go
@@ -737,6 +737,11 @@ func (tc *Collection) GetAllDatabases(ctx context.Context, txn *kv.Txn) (nstree.
if err != nil {
return nstree.Catalog{}, err
}

// FIXME: here probably I want to call LoadComments() with just
// DatabaseCommentType. But I only want to do so when the caller is
// interested in comments. Not all are.
Contributor

Assuming that it's fine to replace LoadComments with GetByIDs, and assuming that we would in fact rather not scan the comments table so much (do we really care? those scans are going to be empty most of the time, aren't they?), what we could do is make the GetByIDs implementation smarter based on the expectedType argument: if we're expecting a database descriptor then there's no point in scanning for non-database comments, and so forth.

Contributor

We probably need to do something similar for zone configs as well.
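
The suggestion to narrow the scan by expected descriptor type could look roughly like the following sketch. All names here (`descKind`, `commentKind`, `commentKindsFor`) are hypothetical, and the list of comment kinds is an assumption for illustration; the source only confirms that database, schema and table comment types exist.

```go
package main

import "fmt"

// descKind and commentKind are hypothetical stand-ins for
// catalog.DescriptorType and catalogkeys.CommentType.
type descKind string
type commentKind string

const (
	anyDesc      descKind = "any"
	databaseDesc descKind = "database"
	schemaDesc   descKind = "schema"
	tableDesc    descKind = "table"
)

// allCommentKinds is an assumed list; the real set is AllCommentTypes.
var allCommentKinds = []commentKind{"database", "schema", "table", "column", "index", "constraint"}

// commentKindsFor returns the comment kinds worth scanning for when a
// descriptor of the expected kind is requested. Unknown or unspecified
// kinds fall back to scanning everything.
func commentKindsFor(expected descKind) []commentKind {
	switch expected {
	case databaseDesc:
		return []commentKind{"database"}
	case schemaDesc:
		return []commentKind{"schema"}
	case tableDesc:
		return []commentKind{"table", "column", "index", "constraint"}
	default:
		return allCommentKinds
	}
}

func main() {
	fmt.Println(commentKindsFor(databaseDesc))
	fmt.Println(commentKindsFor(anyDesc))
}
```

The same lookup shape would presumably apply to zone configs, as the follow-up comment notes.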


ret, err := tc.aggregateAllLayers(ctx, txn, stored)
if err != nil {
return nstree.Catalog{}, err
@@ -843,8 +848,14 @@ func (tc *Collection) GetAllInDatabase(
if err != nil {
return nstree.Catalog{}, err
}

// Also ensure the db desc itself is included, which ensures we can
// fetch its comment if any below.
// FIXME: this does not seem to work. Why?
ret.UpsertDescriptor(db)
Contributor

The call to FilterByIDs in the final return removes the database descriptor from the result set. If you want to include the database descriptor in the GetAllInDatabase result set, and thereby change its contract, then you should also include its zone config, etc., which IIRC will require an extra point lookup. For consistency's sake, you'd then have to do the same in GetAllObjectsInSchema. I'd considered it when I wrote these methods and eventually decided against it for some reason, but I have no fundamental objection to including the root of the hierarchy.
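
The effect described in this comment can be reproduced with a toy model. `toyCatalog` and `filterByIDs` below are hypothetical simplifications of `nstree.Catalog` and `FilterByIDs`, and the sketch assumes that only child descriptors are collected into the ID set.

```go
package main

import "fmt"

// toyCatalog is a hypothetical, much-simplified stand-in for
// nstree.Catalog: it maps descriptor IDs to names.
type toyCatalog map[int]string

// filterByIDs returns a copy holding only the entries whose ID is listed.
func (c toyCatalog) filterByIDs(ids []int) toyCatalog {
	out := toyCatalog{}
	for _, id := range ids {
		if name, ok := c[id]; ok {
			out[id] = name
		}
	}
	return out
}

func main() {
	const dbID = 100
	cat := toyCatalog{101: "public", 102: "users"}
	cat[dbID] = "defaultdb" // analogous to ret.UpsertDescriptor(db)

	// Only children were collected into the ID set, so the final filter
	// drops the database entry again.
	childrenOnly := cat.filterByIDs([]int{101, 102})
	_, kept := childrenOnly[dbID]
	fmt.Println("database kept with children-only IDs:", kept)

	// Adding the root ID to the set keeps it in the result.
	withRoot := cat.filterByIDs([]int{dbID, 101, 102})
	_, kept = withRoot[dbID]
	fmt.Println("database kept with root ID included:", kept)
}
```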


var inDatabaseIDs catalog.DescriptorIDSet
- _ = ret.ForEachDescriptor(func(desc catalog.Descriptor) error {
+ if err := ret.ForEachDescriptor(func(desc catalog.Descriptor) error {
if desc.DescriptorType() == catalog.Schema {
if dbID := desc.GetParentID(); dbID != descpb.InvalidID && dbID != db.GetID() {
return nil
@@ -855,11 +866,32 @@ func (tc *Collection) GetAllInDatabase(
}
}
inDatabaseIDs.Add(desc.GetID())

// Also include all the comments for this object.
// FIXME: This does not seem to be the right place to call this --
// it should load comments into `stored` _before_ the call
// to aggregateLayers(), so that shadowed comments don't get overwritten.
Contributor

And zone configs. I think your instinct is correct but I'd go even further: aggregateAllLayers needs to do for comments and zone configs what it already does for descriptors and namespace records. It already reads the comments and the zone configs, in fact: it calls getDescriptorsByID, which in turn calls GetByIDs for descriptors which weren't found in the uncommitted layer or the other layers. The results of GetByIDs are cached, so there'd be no harm in calling it another time towards the end of aggregateAllLayers once the descIDs set is built, and using the results to populate ret correctly.
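
The shadowing concern raised in the FIXME and in this comment can be sketched as a layered lookup. The `layer` type and `aggregate` function are hypothetical and much simpler than `aggregateAllLayers`; the sketch only shows why stored comments must be consulted after, not instead of, the newer layers.

```go
package main

import "fmt"

// layer is a hypothetical model of one catalog layer: comments it knows
// about, plus IDs whose comment it has deleted.
type layer struct {
	comments map[int]string
	deleted  map[int]bool
}

// aggregate resolves comments for ids by consulting layers from newest
// (uncommitted) to oldest (stored). The first layer that knows about an
// ID wins, so older values never overwrite newer ones.
func aggregate(ids []int, layers ...layer) map[int]string {
	out := map[int]string{}
	for _, id := range ids {
		for _, l := range layers {
			if l.deleted[id] {
				break // a newer layer deleted the comment; stop looking
			}
			if c, ok := l.comments[id]; ok {
				out[id] = c
				break
			}
		}
	}
	return out
}

func main() {
	uncommitted := layer{
		comments: map[int]string{1: "new"},
		deleted:  map[int]bool{2: true},
	}
	stored := layer{comments: map[int]string{1: "old", 2: "gone", 3: "kept"}}
	fmt.Println(aggregate([]int{1, 2, 3}, uncommitted, stored))
}
```

Adding the stored comments to the result after aggregation, as the diff does with `ret.AddAll(comments)`, corresponds to reversing the layer order in this model.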

comments, err := tc.cr.LoadComments(ctx, txn, catalogkeys.AllCommentTypes, desc.GetID())
if err != nil {
return err
}
ret.AddAll(comments)
return nil
- })
+ }); err != nil {
return nstree.Catalog{}, err
}

return ret.FilterByIDs(inDatabaseIDs.Ordered()), nil
}

// LoadComments returns a catalog containing the comments of the given
// types associated with the given descriptor ID.
func (tc *Collection) LoadComments(
ctx context.Context, txn *kv.Txn, commentTypes []catalogkeys.CommentType, descID descpb.ID,
) (nstree.Catalog, error) {
return tc.cr.LoadComments(ctx, txn, commentTypes, descID)
}

// GetAllTablesInDatabase is like GetAllInDatabase but filtered to tables.
// Includes virtual objects. Does not include dropped objects.
func (tc *Collection) GetAllTablesInDatabase(
21 changes: 21 additions & 0 deletions pkg/sql/catalog/internal/catkv/catalog_reader.go
@@ -81,6 +81,10 @@ type CatalogReader interface {
ctx context.Context, txn *kv.Txn, db catalog.DatabaseDescriptor, sc catalog.SchemaDescriptor,
) (nstree.Catalog, error)

// LoadComments returns a catalog with at least the comments of the
// given types for the given object. It may include more comments.
LoadComments(ctx context.Context, txn *kv.Txn, commentTypes []catalogkeys.CommentType, objID descpb.ID) (nstree.Catalog, error)

// GetByIDs reads the system.descriptor, system.comments and system.zone
// entries for the desired IDs, but looks in the system database cache
// first if there is one.
@@ -159,6 +163,23 @@ func (cr catalogReader) ScanAll(ctx context.Context, txn *kv.Txn) (nstree.Catalo
return mc.Catalog, nil
}

// LoadComments is part of the CatalogReader interface.
func (cr catalogReader) LoadComments(
ctx context.Context, txn *kv.Txn, commentTypes []catalogkeys.CommentType, objID descpb.ID,
) (nstree.Catalog, error) {
var mc nstree.MutableCatalog
cq := catalogQuery{codec: cr.codec}
err := cq.query(ctx, txn, &mc, func(codec keys.SQLCodec, b *kv.Batch) {
for _, ct := range commentTypes {
scan(ctx, b, catalogkeys.MakeObjectCommentsMetadataPrefix(codec, ct, objID))
}
})
if err != nil {
return nstree.Catalog{}, err
}
return mc.Catalog, nil
}
Contributor

Serious question: what's wrong with using GetByIDs instead? AFAIK it shouldn't matter much that we're also reading system.descriptor, we're almost certainly going to be doing that anyway at some point in the transaction. I'm keen to keep this CatalogReader interface tight.


func (cr catalogReader) scanNamespace(
ctx context.Context, txn *kv.Txn, prefix roachpb.Key,
) (nstree.Catalog, error) {
31 changes: 31 additions & 0 deletions pkg/sql/catalog/internal/catkv/catalog_reader_cached.go
@@ -17,6 +17,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/keys"
"github.com/cockroachdb/cockroach/pkg/kv"
"github.com/cockroachdb/cockroach/pkg/sql/catalog"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/catalogkeys"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/nstree"
"github.com/cockroachdb/cockroach/pkg/util/iterutil"
@@ -61,6 +62,7 @@ type byIDStateValue struct {
hasScanNamespaceForDatabaseEntries bool
hasScanNamespaceForDatabaseSchemas bool
hasGetDescriptorEntries bool
hasLoadComments bool
}

type byNameStateValue struct {
@@ -177,6 +179,7 @@ func (c *cachedCatalogReader) ScanAll(ctx context.Context, txn *kv.Txn) (nstree.
s.hasScanNamespaceForDatabaseEntries = true
s.hasScanNamespaceForDatabaseSchemas = true
s.hasGetDescriptorEntries = true
s.hasLoadComments = true
c.byIDState[id] = s
}
for ni, s := range c.byNameState {
@@ -288,6 +291,34 @@ func (c *cachedCatalogReader) ScanNamespaceForSchemaObjects(
return read, nil
}

// LoadComments is part of the CatalogReader interface.
func (c *cachedCatalogReader) LoadComments(
ctx context.Context, txn *kv.Txn, commentTypes []catalogkeys.CommentType, objID descpb.ID,
) (nstree.Catalog, error) {
if !c.byIDState[objID].hasLoadComments {
// Cache miss: need to retrieve the comments from underneath.
// In this case we retrieve all comment types, not just the one requested.
read, err := c.cr.LoadComments(ctx, txn, catalogkeys.AllCommentTypes, objID)
if err != nil {
return nstree.Catalog{}, err
}
if err := c.ensure(ctx, read); err != nil {
return nstree.Catalog{}, err
}
s := c.byIDState[objID]
s.hasLoadComments = true
c.byIDState[objID] = s
}

var mc nstree.MutableCatalog
c.cache.ForEachCommentOnDescriptor(objID, func(key catalogkeys.CommentKey, cmt string) error {
mc.UpsertComment(key, cmt)
return nil
})

return mc.Catalog, nil
}
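
The caching pattern used by the cached `LoadComments` above (load every comment type once on a miss, record a per-ID flag, then answer from the cache) can be sketched in isolation. `backend`, `cachedReader` and their methods are hypothetical simplifications, not the actual `catkv` types.

```go
package main

import "fmt"

// backend is a hypothetical stand-in for the uncached reader; it counts
// how many times it is asked to load comments.
type backend struct{ calls int }

// loadAll pretends to scan every comment type for objID.
func (b *backend) loadAll(objID int) map[string]string {
	b.calls++
	return map[string]string{"table": fmt.Sprintf("comment on object %d", objID)}
}

// cachedReader remembers, per object ID, whether comments were loaded.
type cachedReader struct {
	b      *backend
	loaded map[int]bool              // plays the role of hasLoadComments
	cache  map[int]map[string]string // objID -> comment kind -> text
}

func newCachedReader(b *backend) *cachedReader {
	return &cachedReader{
		b:      b,
		loaded: map[int]bool{},
		cache:  map[int]map[string]string{},
	}
}

// loadComments loads all kinds once on a miss, then answers from the cache.
// It returns a copy so callers cannot mutate the cached state.
func (c *cachedReader) loadComments(objID int) map[string]string {
	if !c.loaded[objID] {
		c.cache[objID] = c.b.loadAll(objID)
		c.loaded[objID] = true
	}
	out := make(map[string]string, len(c.cache[objID]))
	for kind, text := range c.cache[objID] {
		out[kind] = text
	}
	return out
}

func main() {
	b := &backend{}
	r := newCachedReader(b)
	r.loadComments(7)
	r.loadComments(7)
	fmt.Println("backend calls:", b.calls)
}
```

Loading every kind on the first miss trades a slightly larger first read for never having to track which kinds were already fetched for a given ID.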

// GetByIDs is part of the CatalogReader interface.
func (c *cachedCatalogReader) GetByIDs(
ctx context.Context,
1 change: 1 addition & 0 deletions pkg/sql/comprules/BUILD.bazel
@@ -13,6 +13,7 @@ go_library(
"//pkg/sql/compengine",
"//pkg/sql/lexbase",
"//pkg/sql/scanner",
"//pkg/sql/sem/catconstants",
],
)

50 changes: 27 additions & 23 deletions pkg/sql/comprules/rules.go
@@ -18,6 +18,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/sql/compengine"
"github.com/cockroachdb/cockroach/pkg/sql/lexbase"
"github.com/cockroachdb/cockroach/pkg/sql/scanner"
"github.com/cockroachdb/cockroach/pkg/sql/sem/catconstants"
)

// GetCompMethods exposes the completion heuristics defined in this
@@ -257,19 +258,20 @@ func completeObjectInCurrentDatabase(

c.Trace("completing for %q (%d,%d), schema: %s", prefix, start, end, schema)
const queryT = `
- WITH n AS (SELECT oid FROM pg_catalog.pg_namespace WHERE nspname %s),
- t AS (SELECT oid, relname FROM pg_catalog.pg_class WHERE reltype != 0 AND relnamespace IN (TABLE n))
- SELECT relname AS completion,
- 'relation' AS category,
- substr(COALESCE(cc.comment, ''), e'[^\n]{0,80}') as description,
- $2:::INT AS start,
- $3:::INT AS end
- FROM t
- LEFT OUTER JOIN "".crdb_internal.kv_catalog_comments cc
- ON t.oid = cc.object_id AND cc.type = 'TableCommentType'
- WHERE left(relname, length($1:::STRING)) = $1::STRING
+ SELECT c.relname AS completion,
+ 'relation' AS category,
+ substr(COALESCE(d.description, ''), e'[^\n]{0,80}') as description,
+ $2:::INT AS start,
+ $3:::INT AS end
+ FROM pg_catalog.pg_class c
+ JOIN pg_catalog.pg_namespace n
+ ON c.relnamespace = n.oid AND n.nspname %s
+ LEFT OUTER JOIN crdb_internal.kv_catalog_comments d
+ ON c.oid = d.objoid AND d.classoid = %d
+ WHERE c.reltype != 0
+ AND left(relname, length($1:::STRING)) = $1::STRING
`
- query := fmt.Sprintf(queryT, schema)
+ query := fmt.Sprintf(queryT, schema, catconstants.PgCatalogClassTableID)
iter, err := c.Query(ctx, query, prefix, start, end)
return iter, err
}
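
The query construction above mixes two mechanisms: `fmt.Sprintf` verbs for parts fixed by the program (the schema predicate and the class OID), and `$n` placeholders for user input. A minimal sketch of that split follows; the table name `rels`, the constant `hypotheticalClassTableID` and its value are made up and stand in for the real query and for `catconstants.PgCatalogClassTableID`.

```go
package main

import "fmt"

// hypotheticalClassTableID stands in for catconstants.PgCatalogClassTableID;
// the value is made up for this sketch.
const hypotheticalClassTableID = 12345

// queryT has two fmt verbs (%s, %d) that are filled by the program, and a
// $1 placeholder that is left untouched for the SQL layer to bind later.
const queryT = `SELECT relname FROM rels WHERE nspname %s AND classoid = %d AND left(relname, length($1)) = $1`

// buildQuery fills both verbs of the template. Every verb needs a matching
// argument; when the template gains a verb, the call must gain an argument.
func buildQuery(schemaPredicate string) string {
	return fmt.Sprintf(queryT, schemaPredicate, hypotheticalClassTableID)
}

func main() {
	fmt.Println(buildQuery("= 'public'"))
}
```

Keeping the user-supplied prefix in `$1` rather than in a `%s` verb means it is never spliced into the SQL text.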
@@ -298,17 +300,18 @@ func completeSchemaInCurrentDatabase(
}

c.Trace("completing for %q (%d,%d)", prefix, start, end)
- const query = `
- SELECT nspname AS completion,
- 'schema' AS category,
- substr(COALESCE(cc.comment, ''), e'[^\n]{0,80}') as description,
- $2:::INT AS start,
- $3:::INT AS end
- FROM pg_catalog.pg_namespace t
- LEFT OUTER JOIN "".crdb_internal.kv_catalog_comments cc
- ON t.oid = cc.object_id AND cc.type = 'SchemaCommentType'
+ const queryT = `
+ SELECT n.nspname AS completion,
+ 'schema' AS category,
+ substr(COALESCE(d.description, ''), e'[^\n]{0,80}') as description,
+ $2:::INT AS start,
+ $3:::INT AS end
+ FROM pg_catalog.pg_namespace n
+ LEFT OUTER JOIN crdb_internal.kv_catalog_comments d
+ ON n.oid = d.objoid AND d.classoid = %d
WHERE left(nspname, length($1:::STRING)) = $1::STRING
`
+ query := fmt.Sprintf(queryT, catconstants.PgCatalogNamespaceTableID)
iter, err := c.Query(ctx, query, prefix, start, end)
return iter, err
}
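
The `description` column in these queries is built with `substr(COALESCE(..., ''), e'[^\n]{0,80}')`. Assuming `substr` with a regular-expression argument returns the first match, this yields the first line of the comment, capped at 80 characters, and an empty string when there is no comment. The same pattern can be tried in Go; `describe` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// firstLine matches at most 80 non-newline characters, the same pattern
// used in the completion queries.
var firstLine = regexp.MustCompile(`[^\n]{0,80}`)

// describe returns the first line of comment, truncated to 80 characters.
// An empty comment, or one that starts with a newline, yields "".
func describe(comment string) string {
	return firstLine.FindString(comment)
}

func main() {
	fmt.Printf("%q\n", describe("first line\nsecond line"))
	fmt.Printf("%q\n", describe(""))
}
```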
@@ -418,7 +421,7 @@ func completeObjectInOtherDatabase(
}

c.Trace("completing for %q (%d,%d), schema: %q, db: %q", prefix, start, end, schema, dbname)
- const query = `
+ const queryT = `
WITH t AS (
SELECT name, table_id
FROM "".crdb_internal.tables
@@ -433,8 +436,9 @@ SELECT name AS completion,
$3:::INT AS end
FROM t
LEFT OUTER JOIN "".crdb_internal.kv_catalog_comments cc
- ON t.table_id = cc.object_id AND cc.type = 'TableCommentType'
+ ON t.table_id = cc.objoid AND cc.classoid = %d
`
+ query := fmt.Sprintf(queryT, catconstants.PgCatalogClassTableID)
iter, err := c.Query(ctx, query, prefix, start, end, dbname, schema)
return iter, err
}