60775: sql: descriptor validation overhaul r=postamar a=postamar

    sql: descriptor validation overhaul
    
    Previously, descriptor validation suffered from a few shortcomings:
    1. The methods to call were inconsistent across descriptor types, and
       the error messages they returned were inconsistent as well.
    2. Those methods made ad-hoc use of catalog.DescGetter to walk through
       the descriptor reference graph, and could sometimes walk surprisingly
       far. This was particularly true for type descriptors, which hold
       references to other types.
    3. To complicate things further, DescGetter implementations which read
       descriptors from storage perform validation themselves as well.
       Although it is perfectly sensible to systematically validate
       descriptors when read, there is a circularity problem at hand which
       would benefit from being solved without melting the maintainer's
       brain in the process.
    4. The validation methods returned an error type, making it awkward to
       report multiple errors. Tools like doctor would be more useful if
       they could report more than just the first error encountered.

    We recently introduced a change that added descriptor validation on
    write. This involves validating all uncommitted descriptors in bulk,
    which in turn favours a bulk approach to reading all of their
    referenced descriptors from storage.
    
    With this in mind, this commit adds a GetReferencedDescIDs method to
    catalog.Descriptor which returns the IDs of all descriptors referenced
    by the receiver. By calling this method on all descriptors we intend to
    validate and by reading all referenced descriptors in one kv.Batch and
    stuffing them into a catalog.MapDescGetter, we can now validate using this
    in-memory DescGetter instead.
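
    As a rough illustration of that flow (simplified stand-in types only;
    the real interfaces in pkg/sql/catalog have different signatures and
    many more methods), the ID collection and the in-memory lookup can be
    pictured like this:

```go
// Sketch only: simplified stand-ins for catalog.Descriptor and
// catalog.MapDescGetter, not the actual CockroachDB definitions.
package catalogsketch

type ID int64

type Descriptor interface {
	GetID() ID
	// GetReferencedDescIDs mirrors the method added by this commit: it lists
	// every descriptor the receiver references (parent database, schema,
	// referenced types, foreign-key targets, and so on).
	GetReferencedDescIDs() []ID
}

// MapDescGetter is the in-memory lookup built from a single batched read.
type MapDescGetter map[ID]Descriptor

// collectReferencedIDs gathers every ID to fetch in one batched read up
// front, instead of walking the reference graph against storage during
// validation.
func collectReferencedIDs(descs []Descriptor) map[ID]struct{} {
	ids := make(map[ID]struct{})
	for _, d := range descs {
		for _, id := range d.GetReferencedDescIDs() {
			ids[id] = struct{}{}
		}
	}
	return ids
}
```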
    
    Turning validation into this multi-step process means it can no longer
    be invoked as a method call on the target descriptor. This commit moves
    the entry point to descriptor validation to the catalog.Validate
    function. The descriptor objects still define the validation checks
    themselves, but these are effectively made inaccessible from
    outside the catalog package. The rationale behind this is to enforce
    order in the sequence of checks performed and to enforce some uniformity
    in the formatting of the error messages. The checks are grouped into
    three tiers: internal consistency checks, cross-reference consistency
    checks, and transactional consistency checks.
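
    Continuing the sketch above (the check functions here are hypothetical
    names standing in for the unexported per-descriptor checks), the tiered
    ordering and the error collection look roughly like this:

```go
// Sketch only: a hypothetical tiered validation driver in the spirit of
// catalog.Validate. checkSelf, checkCrossReferences and checkTxnConsistency
// stand in for the unexported checks defined by the descriptor types.
func validateAll(descs []Descriptor, dg MapDescGetter) []error {
	var errs []error
	for _, d := range descs {
		// Tier 1: internal consistency, needs no other descriptors.
		if err := checkSelf(d); err != nil {
			errs = append(errs, err)
			continue // later tiers assume a self-consistent descriptor
		}
		// Tier 2: cross-reference consistency against the in-memory getter.
		if err := checkCrossReferences(d, dg); err != nil {
			errs = append(errs, err)
			continue
		}
		// Tier 3: transactional consistency (e.g. namespace entries).
		if err := checkTxnConsistency(d); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

// Placeholder check implementations; the real checks live inside the
// catalog and *desc packages and are not exported.
func checkSelf(d Descriptor) error                              { return nil }
func checkCrossReferences(d Descriptor, dg MapDescGetter) error { return nil }
func checkTxnConsistency(d Descriptor) error                    { return nil }
```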
    
    All this also reduces the number of descriptor validation call sites
    and increases the scope of the validations. This has
    uncovered a few bugs related to schemas and temporary tables. This
    effort has also helped identify validation checks which have since been
    made redundant, either by other existing rules or by the new
    validation-on-write behaviour. Finally, this has allowed us to strip
    doctor of its own duplicated (and dodgy) validation logic and simply
    report all validation errors instead.
    
    Release justification: bug fixes and low-risk updates to new
    functionality (descriptor validation on write).
    
    Release note (cli change): The doctor tool can now report multiple
    descriptor validation failures per descriptor.

----

*Follow-up work*

- `WriteDescriptors` in `pkg/ccl/backupccl/restore_job.go`: should it validate schemas and types also? It seems like it should. When I add those, tests fail. I didn't look into it further. Probably out of scope for this change.
- ~Add linter check to enforce the use of `catalog.Validate` and `catalog.ValidateSelf`? It seems all too easy to call the `ValidateSelf` method on a `catalog.Descriptor` without being aware that it returns an `[]error` instead of an `error`.~ (edit: I since changed the way validation errors were collected, making a linter check unnecessary)
- Remove privileges surgery in `ValidateSelf` methods in favour of `NewFilledInImmutable`-like constructors. I don't like that validation functionality changes the underlying objects. (edit: this is actually really hard, and dangerous! giving up on this for now)
- ~I feel like my validation error wrapping is an improvement to the incoherent messaging that prevailed before, but I'm not sure what's best to put in the prefix message. I settled on `errors.Wrapf(err, "%s %q", desc.TypeName(), desc.GetName())` but it could be anything else really. Do we prefer reporting names or IDs? I also don't know whether error message changes are considered to be contract-breaking, worthy of release note mentions, interfering with stability period, etc.~ (edit: done like Andrew suggested below)

*Review guide*

The meat and potatoes of the change is in `pkg/sql/catalog/*.go`; it's best to start reviewing from there, then move on to the `pkg/sql/catalog/*desc` packages. Notice that I moved the validation logic in `tabledesc` out of `structured.go` into a new file, `validate.go`. The diff in `structured.go` was going to be messy anyway despite there not being too many significant underlying changes: it's just calling `Report` with the errors, moving stuff around, removing redundancies, breaking down massive functions into smaller chunks, etc. The code still basically does the same thing. I moved test cases around as well.
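
To give a feel for the `Report` pattern mentioned above, here is a minimal sketch; the accumulator type and the toy check are illustrative only, not the actual types in `pkg/sql/catalog`:

```go
package validatesketch

import "fmt"

// errorAccumulator sketches Report-style error collection: checks report
// every problem they find instead of returning on the first one, which is
// what lets doctor print more than one failure per descriptor.
type errorAccumulator struct {
	errs []error
}

// Report records a non-nil error and lets the caller keep going.
func (ea *errorAccumulator) Report(err error) {
	if err != nil {
		ea.errs = append(ea.errs, err)
	}
}

// checkColumnNames is a toy check that keeps validating after a failure.
func checkColumnNames(cols []string, ea *errorAccumulator) {
	seen := make(map[string]bool)
	for _, c := range cols {
		if c == "" {
			ea.Report(fmt.Errorf("column with empty name"))
		}
		if seen[c] {
			ea.Report(fmt.Errorf("duplicate column name %q", c))
		}
		seen[c] = true
	}
}
```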

Co-authored-by: Marius Posta <[email protected]>
craig[bot] and Marius Posta committed Mar 1, 2021
2 parents 91a58f4 + 4d9034c commit 8a31338
Showing 74 changed files with 4,373 additions and 3,632 deletions.
3 changes: 0 additions & 3 deletions pkg/base/test_server_args.go
@@ -128,9 +128,6 @@ type TestServerArgs struct {

// IF set, the demo login endpoint will be enabled.
EnableDemoLoginEndpoint bool

// If set, testing specific descriptor validation will be disabled. even if the server
DisableTestingDescriptorValidation bool
}

// TestClusterArgs contains the parameters one can set when creating a test
6 changes: 3 additions & 3 deletions pkg/ccl/backupccl/backupbase/targets.go
@@ -190,9 +190,9 @@ func NewDescriptorResolver(descs []catalog.Descriptor) (*DescriptorResolver, err
if !ok {
return errors.Errorf("schema %d not found for desc %d", scID, desc.GetID())
}
scDesc, ok := scDescI.(catalog.SchemaDescriptor)
if !ok {
return errors.Errorf("descriptor %d is not a schema", scDescI.GetID())
scDesc, err := catalog.AsSchemaDescriptor(scDescI)
if err != nil {
return err
}
scName = scDesc.GetName()
}
19 changes: 6 additions & 13 deletions pkg/ccl/backupccl/restore_job.go
@@ -433,23 +433,16 @@ func WriteDescriptors(
}
return err
}
// TODO(ajwerner): Utilize validation inside of the descs.Collection
// rather than reaching into the store.
dg := catalogkv.NewOneLevelUncachedDescGetter(txn, codec)

bdg := catalogkv.NewOneLevelUncachedDescGetter(txn, codec)
descs := make([]catalog.Descriptor, 0, len(databases)+len(tables))
for _, table := range tables {
if err := table.Validate(ctx, dg); err != nil {
return errors.Wrapf(err,
"validate table %d", errors.Safe(table.GetID()))
}
descs = append(descs, table)
}

for _, db := range databases {
if err := db.Validate(ctx, dg); err != nil {
return errors.Wrapf(err,
"validate database %d", errors.Safe(db.GetID()))
}
descs = append(descs, db)
}
return nil
return catalog.ValidateSelfAndCrossReferences(ctx, bdg, descs...)
}()
return errors.Wrapf(err, "restoring table desc and namespace entries")
}
2 changes: 1 addition & 1 deletion pkg/ccl/changefeedccl/avro_test.go
@@ -59,7 +59,7 @@ func parseTableDesc(createTableStmt string) (catalog.TableDescriptor, error) {
if err != nil {
return nil, err
}
return mutDesc, mutDesc.ValidateSelf(ctx)
return mutDesc, catalog.ValidateSelf(mutDesc)
}

func parseValues(tableDesc catalog.TableDescriptor, values string) ([]rowenc.EncDatumRow, error) {
4 changes: 0 additions & 4 deletions pkg/ccl/importccl/import_stmt.go
@@ -1280,10 +1280,6 @@ func writeNonDropDatabaseChange(

queuedJob := []jobspb.JobID{job.ID()}
b := txn.NewBatch()
dg := catalogkv.NewOneLevelUncachedDescGetter(txn, p.ExecCfg().Codec)
if err := desc.Validate(ctx, dg); err != nil {
return nil, err
}
err = descsCol.WriteDescToBatch(
ctx,
p.ExtendedEvalContext().Tracing.KVTracingEnabled(),
2 changes: 1 addition & 1 deletion pkg/ccl/logictestccl/testdata/logic_test/multi_region
@@ -911,7 +911,7 @@ SHOW ENUMS FROM drop_region_db
schema name values owner
public crdb_internal_region {ap-southeast-2,ca-central-1} root

statement error pq: region "us-east-1" has not been added to database "drop_region_db"
statement error pq: relation "t" \([0-9]+\): invalid locality config: region "us-east-1" has not been added to database "drop_region_db"
CREATE TABLE drop_region_db.public.t(a int) LOCALITY REGIONAL BY TABLE IN "us-east-1"

statement ok
1 change: 1 addition & 0 deletions pkg/ccl/partitionccl/BUILD.bazel
@@ -52,6 +52,7 @@ go_test(
"//pkg/server",
"//pkg/settings/cluster",
"//pkg/sql",
"//pkg/sql/catalog",
"//pkg/sql/catalog/catalogkv",
"//pkg/sql/catalog/descpb",
"//pkg/sql/catalog/tabledesc",
3 changes: 2 additions & 1 deletion pkg/ccl/partitionccl/partition_test.go
@@ -29,6 +29,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/server"
"github.com/cockroachdb/cockroach/pkg/settings/cluster"
"github.com/cockroachdb/cockroach/pkg/sql"
"github.com/cockroachdb/cockroach/pkg/sql/catalog"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/catalogkv"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/tabledesc"
"github.com/cockroachdb/cockroach/pkg/sql/parser"
@@ -139,7 +140,7 @@ func (pt *partitioningTest) parse() error {
return err
}
pt.parsed.tableDesc = mutDesc
if err := pt.parsed.tableDesc.ValidateSelf(ctx); err != nil {
if err := catalog.ValidateSelf(pt.parsed.tableDesc); err != nil {
return err
}
}
2 changes: 1 addition & 1 deletion pkg/cli/testdata/doctor/testcluster
@@ -2,6 +2,6 @@ doctor cluster
----
debug doctor cluster
Examining 35 descriptors and 36 namespace entries...
Table 53: ParentID 50, ParentSchemaID 29, Name 'foo': not being dropped but no namespace entry found
ParentID 50, ParentSchemaID 29: relation "foo" (53): not being dropped but no namespace entry found
Examining 1 running jobs...
ERROR: validation failed
12 changes: 6 additions & 6 deletions pkg/cli/testdata/doctor/testzipdir
@@ -2,12 +2,12 @@ doctor zipdir
----
debug doctor zipdir testdata/doctor/debugzip
Examining 38 descriptors and 43 namespace entries...
Table 53: ParentID 52, ParentSchemaID 29, Name 'users': desc 53: parentID 52 does not exist
Table 54: ParentID 52, ParentSchemaID 29, Name 'vehicles': desc 54: parentID 52 does not exist
Table 55: ParentID 52, ParentSchemaID 29, Name 'rides': desc 55: parentID 52 does not exist
Table 56: ParentID 52, ParentSchemaID 29, Name 'vehicle_location_histories': desc 56: parentID 52 does not exist
Table 57: ParentID 52, ParentSchemaID 29, Name 'promo_codes': desc 57: parentID 52 does not exist
Table 58: ParentID 52, ParentSchemaID 29, Name 'user_promo_codes': desc 58: parentID 52 does not exist
ParentID 52, ParentSchemaID 29: relation "users" (53): referenced database ID 52: descriptor not found
ParentID 52, ParentSchemaID 29: relation "vehicles" (54): referenced database ID 52: descriptor not found
ParentID 52, ParentSchemaID 29: relation "rides" (55): referenced database ID 52: descriptor not found
ParentID 52, ParentSchemaID 29: relation "vehicle_location_histories" (56): referenced database ID 52: descriptor not found
ParentID 52, ParentSchemaID 29: relation "promo_codes" (57): referenced database ID 52: descriptor not found
ParentID 52, ParentSchemaID 29: relation "user_promo_codes" (58): referenced database ID 52: descriptor not found
Descriptor 52: has namespace row(s) [{ParentID:0 ParentSchemaID:0 Name:movr}] but no descriptor
Examining 1 running jobs...
job 587337426984566785: schema change GC refers to missing table descriptor(s) [59]
3 changes: 0 additions & 3 deletions pkg/server/testserver.go
@@ -274,9 +274,6 @@ func makeTestConfigFromParams(params base.TestServerArgs) Config {
if params.Knobs.SQLExecutor == nil {
cfg.TestingKnobs.SQLExecutor = &sql.ExecutorTestingKnobs{}
}
if !params.DisableTestingDescriptorValidation {
cfg.TestingKnobs.SQLExecutor.(*sql.ExecutorTestingKnobs).TestingDescriptorValidation = true
}

// For test servers, leave interleaved tables enabled by default. We'll remove
// this when we remove interleaved tables altogether.
2 changes: 1 addition & 1 deletion pkg/sql/alter_column_type_test.go
@@ -404,7 +404,7 @@ ALTER TABLE t.test ALTER COLUMN x TYPE STRING;

<-childJobStartNotification

expected := "pq: unimplemented: cannot perform a schema change operation while an ALTER COLUMN TYPE schema change is in progress"
expected := `pq: relation "test" \(53\): unimplemented: cannot perform a schema change operation while an ALTER COLUMN TYPE schema change is in progress`
sqlDB.ExpectErr(t, expected, `
ALTER TABLE t.test ADD COLUMN y INT;
`)
10 changes: 1 addition & 9 deletions pkg/sql/alter_database.go
@@ -17,7 +17,6 @@ import (
"github.com/cockroachdb/cockroach/pkg/keys"
"github.com/cockroachdb/cockroach/pkg/security"
"github.com/cockroachdb/cockroach/pkg/sql/catalog"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/catalogkv"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/dbdesc"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/typedesc"
@@ -219,8 +218,7 @@ func (n *alterDatabaseAddRegionNode) startExec(params runParams) error {

// Validate the type descriptor after the changes. We have to do this explicitly here, because
// we're using an internal call to addEnumValue above which doesn't perform validation.
dg := catalogkv.NewOneLevelUncachedDescGetter(params.p.txn, params.ExecCfg().Codec)
if err := typeDesc.Validate(params.ctx, dg); err != nil {
if err := validateDescriptor(params.ctx, params.p, typeDesc); err != nil {
return err
}

@@ -522,12 +520,6 @@ func (n *alterDatabasePrimaryRegionNode) switchPrimaryRegion(params runParams) e
return err
}

// Validate the type descriptor after the changes.
dg := catalogkv.NewOneLevelUncachedDescGetter(params.p.txn, params.ExecCfg().Codec)
if err := typeDesc.Validate(params.ctx, dg); err != nil {
return err
}

// Update the database's zone configuration.
if err := ApplyZoneConfigFromDatabaseRegionConfig(
params.ctx,
22 changes: 9 additions & 13 deletions pkg/sql/alter_table.go
@@ -21,7 +21,6 @@ import (
"github.com/cockroachdb/cockroach/pkg/security"
"github.com/cockroachdb/cockroach/pkg/server/telemetry"
"github.com/cockroachdb/cockroach/pkg/sql/catalog"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/catalogkv"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/resolver"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/schemaexpr"
@@ -281,7 +280,7 @@ func (n *alterTableNode) startExec(params runParams) error {
case *tree.CheckConstraintTableDef:
var err error
params.p.runWithOptions(resolveFlags{contextDatabaseID: n.tableDesc.ParentID}, func() {
info, infoErr := n.tableDesc.GetConstraintInfo(params.ctx, nil)
info, infoErr := n.tableDesc.GetConstraintInfo()
if infoErr != nil {
err = infoErr
return
@@ -663,14 +662,13 @@ func (n *alterTableNode) startExec(params runParams) error {
return pgerror.Newf(pgcode.ObjectNotInPrerequisiteState,
"column %q in the middle of being added, try again later", t.Column)
}
if err := n.tableDesc.Validate(
params.ctx, catalogkv.NewOneLevelUncachedDescGetter(params.p.Txn(), params.ExecCfg().Codec),
); err != nil {

if err := validateDescriptor(params.ctx, params.p, n.tableDesc); err != nil {
return err
}

case *tree.AlterTableDropConstraint:
info, err := n.tableDesc.GetConstraintInfo(params.ctx, nil)
info, err := n.tableDesc.GetConstraintInfo()
if err != nil {
return err
}
@@ -692,14 +690,12 @@ func (n *alterTableNode) startExec(params runParams) error {
return err
}
descriptorChanged = true
if err := n.tableDesc.Validate(
params.ctx, catalogkv.NewOneLevelUncachedDescGetter(params.p.Txn(), params.ExecCfg().Codec),
); err != nil {
if err := validateDescriptor(params.ctx, params.p, n.tableDesc); err != nil {
return err
}

case *tree.AlterTableValidateConstraint:
info, err := n.tableDesc.GetConstraintInfo(params.ctx, nil)
info, err := n.tableDesc.GetConstraintInfo()
if err != nil {
return err
}
@@ -889,7 +885,7 @@ func (n *alterTableNode) startExec(params runParams) error {
descriptorChanged = descriptorChanged || descChanged

case *tree.AlterTableRenameConstraint:
info, err := n.tableDesc.GetConstraintInfo(params.ctx, nil)
info, err := n.tableDesc.GetConstraintInfo()
if err != nil {
return err
}
@@ -1106,7 +1102,7 @@ func applyColumnMutation(
}
}

info, err := tableDesc.GetConstraintInfo(params.ctx, nil)
info, err := tableDesc.GetConstraintInfo()
if err != nil {
return err
}
@@ -1141,7 +1137,7 @@ func applyColumnMutation(
"constraint in the middle of being dropped")
}
}
info, err := tableDesc.GetConstraintInfo(params.ctx, nil)
info, err := tableDesc.GetConstraintInfo()
if err != nil {
return err
}
23 changes: 6 additions & 17 deletions pkg/sql/alter_table_locality.go
@@ -15,7 +15,6 @@ import (
"fmt"

"github.com/cockroachdb/cockroach/pkg/kv"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/catalogkv"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/colinfo"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/dbdesc"
"github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb"
@@ -135,7 +134,7 @@ func (n *alterTableSetLocalityNode) alterTableLocalityGlobalToRegionalByTable(

// Finalize the alter by writing a new table descriptor and updating the zone
// configuration.
if err := n.validateAndWriteNewTableLocalityAndZoneConfig(
if err := n.writeNewTableLocalityAndZoneConfig(
params,
n.dbDesc,
); err != nil {
@@ -167,7 +166,7 @@ func (n *alterTableSetLocalityNode) alterTableLocalityRegionalByTableToGlobal(

// Finalize the alter by writing a new table descriptor and updating the zone
// configuration.
if err := n.validateAndWriteNewTableLocalityAndZoneConfig(
if err := n.writeNewTableLocalityAndZoneConfig(
params,
n.dbDesc,
); err != nil {
@@ -208,7 +207,7 @@ func (n *alterTableSetLocalityNode) alterTableLocalityRegionalByTableToRegionalB
}

// Finalize the alter by writing a new table descriptor and updating the zone configuration.
if err := n.validateAndWriteNewTableLocalityAndZoneConfig(
if err := n.writeNewTableLocalityAndZoneConfig(
params,
n.dbDesc,
); err != nil {
@@ -539,21 +538,11 @@ func (n *alterTableSetLocalityNode) startExec(params runParams) error {
})
}

// validateAndWriteNewTableLocalityAndZoneConfig validates the newly updated
// LocalityConfig in a table descriptor, writes that table descriptor, and
// writes a new zone configuration for the given table.
func (n *alterTableSetLocalityNode) validateAndWriteNewTableLocalityAndZoneConfig(
// writeNewTableLocalityAndZoneConfig writes the table descriptor with the newly
// updated LocalityConfig and writes a new zone configuration for the table.
func (n *alterTableSetLocalityNode) writeNewTableLocalityAndZoneConfig(
params runParams, dbDesc *dbdesc.Immutable,
) error {
// Validate the new locality before updating the table descriptor.
dg := catalogkv.NewOneLevelUncachedDescGetter(params.p.txn, params.EvalContext().Codec)
if err := n.tableDesc.ValidateTableLocalityConfig(
params.ctx,
dg,
); err != nil {
return err
}

// Write out the table descriptor update.
if err := params.p.writeSchemaChange(
params.ctx,
6 changes: 0 additions & 6 deletions pkg/sql/alter_type.go
@@ -126,12 +126,6 @@ func (n *alterTypeNode) startExec(params runParams) error {
return err
}

// Validate the type descriptor after the changes.
dg := catalogkv.NewOneLevelUncachedDescGetter(params.p.txn, params.ExecCfg().Codec)
if err := n.desc.Validate(params.ctx, dg); err != nil {
return err
}

if !eventLogDone {
// Write a log event.
if err := params.p.logEvent(params.ctx,
1 change: 1 addition & 0 deletions pkg/sql/catalog/BUILD.bazel
@@ -11,6 +11,7 @@ go_library(
"errors.go",
"table_col_map.go",
"table_col_set.go",
"validate.go",
],
importpath = "github.com/cockroachdb/cockroach/pkg/sql/catalog",
visibility = ["//visibility:public"],