Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: descriptor validation overhaul #60775

Merged
merged 2 commits into from
Mar 1, 2021

Conversation

postamar
Copy link
Contributor

@postamar postamar commented Feb 19, 2021

sql: descriptor validation overhaul

Previously, descriptor validation suffered from a few shortcomings:
1. The methods to call were inconsistent across descriptor types, and
   the error messages they return are inconsistent as well.
2. Those methods made ad-hoc use of catalog.DescGetter to walk through
   the descriptor reference graph, and could sometimes walk surprisingly
   far. This is particularly true for type descriptors, which hold
   references to other types.
3. To complicate things further, DescGetter implementations which read
   descriptors from storage themselves perform validation as well.
   Although it is perfectly sensible to systematically validate
   descriptors when read, there is a circularity problem at hand which
   would benefit from being solved without melting the maintainer's
   brain in the process.
4. The validation methods return an error type, making it awkward to
   return multiple errors. Tools like doctor would be made more useful
   if they could report more than only the first encountered error.

Recently we introduced a change that added descriptor validation on
write. This change involves validating all uncommitted descriptors in
bulk and this in turn favours a bulk approach to reading all their
referenced descriptors from storage.

With this in mind, this commit adds a GetReferencedDescIDs method to
catalog.Descriptor which returns the IDs of all descriptors referenced
by the receiver. By calling this method on all descriptors we intend to
validate and by reading all referenced descriptors in one kv.Batch and
stuffing them in a catalog.MapDescGetter, we can now validate using this
in-memory DescGetter instead.

Turning validation into this multi-step process means it can no longer
be invoked as a method call on the target descriptor. This commit moves
the entry point to descriptor validation to the catalog.Validate
function. The descriptor objects themselves still define the validation
checks themselves but these are effectively made inaccessible from
outside the catalog package. The rationale behind this is to enforce
order in the sequence of checks performed and to enforce some uniformity
in the formatting of the error messages. The checks are grouped into
three tiers: internal consistency checks, cross-reference consistency
checks, and transactional consistency checks.

All this also helps reduce the number of call sites of descriptor
validations as well as increase the scope of the validations. This has
uncovered a few bugs related to schemas and temporary tables. This
effort has also helped identify validation checks which have since been
made redundant, either by other existing rules or by the new
validation-on-write behaviour. Finally, this has allowed us to strip
doctor of its own duplicated (and dodgy) validation logic and simply
report all validation errors instead.

Release justification: bug fixes and low-risk updates to new
functionality (descriptor validation on write).

Release note (cli change): The doctor tool can now report multiple
descriptor validation failures per descriptor.

Follow-up work

  • WriteDescriptors in pkg/ccl/backupccl/restore_job.go: should it validate schemas and types also? It seems like it should. When I add those, tests fail. I didn't look into it further. Probably out of scope for this change.
  • Add linter check to enforce the use of catalog.Validate and catalog.ValidateSelf? It seems all too easy to call the ValidateSelf method on a catalog.Descriptor without being aware that it returns an []error instead of an error. (edit: I since changed the way validation errors were collected, making a linter check unnecessary)
  • Remove privileges surgery in ValidateSelf methods in favour of NewFilledInImmutable-like constructors. I don't like that validation functionality changes the underlying objects. (edit: this is actually really hard, and dangerous! giving up on this for now)
  • I feel like my validation error wrapping is an improvement to the incoherent messaging that prevailed before but I'm not sure what's best to put in the prefix message. I settled on errors.Wrapf(err, "%s %q", desc.TypeName(), desc.GetName()) but it could be anything else really. Do we prefer reporting names or IDs? I also don't know whether error message changes are considered to be contract-breaking, worthy of release note mentions, interfering with stability period, etc.(edit: done like Andrew suggested below)

Review guide

The meat and potatoes of the change is in pkg/sql/catalog/*.go, best start reviewing from there, moving on to the pkg/sql/catalog/*desc packages. Notice that I moved the validation logic in tabledesc out of structured.go into a new file, validate.go. The diff in structured.go was going to be messy anyways despite there not being too many significant underlying changes: it's just calling Report with the errors, moving stuff around, removing redundancies, breaking down massive functions into smaller chunks, etc. The code still basically does the same thing. I moved tests cases around also.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@postamar postamar changed the title [WIP] Add desc validation on write followup [WIP] sql: add desc validation on write followup Feb 19, 2021
@postamar postamar force-pushed the add-desc-validation-followup branch 3 times, most recently from 48e5b47 to 8066079 Compare February 22, 2021 20:13
@postamar postamar changed the title [WIP] sql: add desc validation on write followup sql: descriptor validation overhaul Feb 22, 2021
@postamar postamar force-pushed the add-desc-validation-followup branch 2 times, most recently from 1647661 to 10e83e0 Compare February 22, 2021 20:46
@ajwerner
Copy link
Contributor

Add linter check to enforce the use of catalog.Validate and catalog.ValidateSelf? It seems all too easy to call the ValidateSelf method on a catalog.Descriptor without being aware that it returns an []error instead of an error.

This is interesting. I wonder if we should have it return a single error but have the implementation join the errors using errors.CombineErrors

Remove privileges surgery in ValidateSelf methods in favour of NewFilledInImmutable-like constructors. I don't like that validation functionality changes the underlying objects.

👍

I also don't know whether error message changes are considered to be contract-breaking, worthy of release note mentions, interfering with stability period, etc.

No, our contracts, when they exist, are pgcode oriented.

I feel like my validation error wrapping is an improvement to the incoherent messaging that prevailed before but I'm not sure what's best to put in the prefix message. I settled on errors.Wrapf(err, "%s %q", desc.TypeName(), desc.GetName()) but it could be anything else really. Do we prefer reporting names or IDs? I also don't know whether error message changes are considered to be contract-breaking, worthy of release note mentions, interfering with stability period, etc.

Ideally we'd have fully qualified names, but, of course, at this layer we can't. I think i'd prefer both a name and ID. How about errors.Wrapf(err, "%s %q (%d)", desc.TypeName(), desc.GetName(), desc.GetID())

@postamar postamar force-pushed the add-desc-validation-followup branch 3 times, most recently from 2a0c424 to f7d7844 Compare February 23, 2021 23:04
@postamar postamar marked this pull request as ready for review February 24, 2021 00:23
@postamar postamar requested a review from a team February 24, 2021 00:23
@postamar postamar requested a review from a team as a code owner February 24, 2021 00:23
@postamar postamar requested review from miretskiy and a team and removed request for a team February 24, 2021 00:23
@postamar
Copy link
Contributor Author

This is ready for another look.

@@ -128,9 +128,6 @@ type TestServerArgs struct {

// IF set, the demo login endpoint will be enabled.
EnableDemoLoginEndpoint bool

// If set, testing specific descriptor validation will be disabled. even if the server
DisableTestingDescriptorValidation bool
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I'm effectively reverting dd90439, that framework is not actually used in any way.

@@ -222,8 +221,7 @@ func (n *alterDatabaseAddRegionNode) startExec(params runParams) error {

// Validate the type descriptor after the changes. We have to do this explicitly here, because
// we're using an internal call to addEnumValue above which doesn't perform validation.
dg := catalogkv.NewOneLevelUncachedDescGetter(params.p.txn, params.ExecCfg().Codec)
if err := typeDesc.Validate(params.ctx, dg); err != nil {
if err := validateDescriptor(params.ctx, params.p, typeDesc); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if err := typeDesc.Validate(params.ctx, dg); err != nil {
return err
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I believe this was made redundant by validation-on-write.

@@ -281,7 +280,7 @@ func (n *alterTableNode) startExec(params runParams) error {
case *tree.CheckConstraintTableDef:
var err error
params.p.runWithOptions(resolveFlags{contextDatabaseID: n.tableDesc.ParentID}, func() {
info, infoErr := n.tableDesc.GetConstraintInfo(params.ctx, nil)
info, infoErr := n.tableDesc.GetConstraintInfo()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I changed the signatures of GetConstraintInfo and associated methods to cut down on passing nil DescGetters. https://github.com/cockroachdb/cockroach/pull/60775/files#diff-e971105bc05467ef391df9792202d3a8b712d269b5f193a0d94b700414ef281eL180

@@ -135,7 +134,7 @@ func (n *alterTableSetLocalityNode) alterTableLocalityGlobalToRegionalByTable(

// Finalize the alter by writing a new table descriptor and updating the zone
// configuration.
if err := n.validateAndWriteNewTableLocalityAndZoneConfig(
if err := n.writeNewTableLocalityAndZoneConfig(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: validation made implicit by validate-on-write.

if err := n.desc.Validate(params.ctx, dg); err != nil {
return err
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: validation made implicit by validate-on-write.

if *ptr == nil {
*ptr = &PrivilegeDescriptor{}
}
p := *ptr
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this was done mainly to accomodate tests. I think it's also nicer though.

@@ -223,7 +224,7 @@ func (m *randomStreamClient) getDescriptorAndNamespaceKVForTableID(
IngestionDatabaseID,
tableID,
fmt.Sprintf(RandomStreamSchemaPlaceholder, tableName),
&descpb.PrivilegeDescriptor{},
descpb.NewDefaultPrivilegeDescriptor(security.RootUserName()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I'll revert this, I committed this by mistake.


// Validate is like ValidateSelf but with additional cross-reference checks.
Validate(ctx context.Context, descGetter DescGetter) error
// ValidateSelfAndCrossReferences performs cross-reference checks.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I'll fix this typo.

log.VEventf(ctx, 2, "planner getting immutable descriptor for id %d", id)
return tc.getDescriptorByID(ctx, txn, id, flags, false /* mutable */)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this is used by the repair functions.

@@ -75,16 +71,18 @@ var ErrIndexGCMutationsList = errors.New("index in GC mutations list")
// given TableDescriptor with the cluster version being the zero table. This
// is for a table that is created in the transaction.
func NewCreatedMutable(tbl descpb.TableDescriptor) *Mutable {
return &Mutable{wrapper: wrapper{TableDescriptor: tbl}}
m, _ := NewFilledInExistingMutable(context.TODO(), nil /* DescGetter */, false /* skipFKsWithMissingTable */, &tbl)
return &Mutable{wrapper: m.wrapper}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this is mainly to make sure that the privileges are set from the get-go.

@@ -1659,26 +1280,19 @@ func FormatTableLocalityConfig(c *descpb.TableDescriptor_LocalityConfig, f *tree
return nil
}

// ValidateTableLocalityConfig validates whether the descriptor's locality
// validateTableLocalityConfig validates whether the descriptor's locality
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I forgot to move this to validate.go, will do.

// Validate performs ValidateSelf and then validates that
// each reference to another table is resolvable and that the necessary back
// references exist.
func (desc *wrapper) Validate(ctx context.Context, descGetter catalog.DescGetter) error {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I moved this and the other validation functions to a new source file validate.go. The diffs were pretty gnarly anyway and as it turns out this also helps slim down structured.go which is far too big a source file.

@@ -137,1459 +133,6 @@ func TestAllocateIDs(t *testing.T) {
t.Fatalf("expected %s, but found %s", a, b)
}
}
func TestValidateDatabaseDesc(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: moved this to dbdesc.

@@ -0,0 +1,1207 @@
// Copyright 2021 The Cockroach Authors.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: here lives the table validation logic now. Unlike for type descriptors the table validation logic itself didn't need a lot of changes so I sincerely don't think this should impede review.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really nice. I like the error reporter idea; it's much more flexible.

pkg/sql/catalog/dbdesc/database_desc.go Show resolved Hide resolved
pkg/sql/catalog/dbdesc/database_desc.go Outdated Show resolved Hide resolved
pkg/sql/catalog/dbdesc/database_test.go Show resolved Hide resolved
pkg/sql/catalog/desc_getter.go Outdated Show resolved Hide resolved
descpb.MaybeFixPrivileges(typ.GetID(), &typ.Privileges)
}
}
res, err := protoutil.Marshal(desc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm cool with this.

}
backRefMutables[id] = backRefMutable
}
if hasTempBackref {
n.persistence = tree.PersistenceTemporary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the impact of this being missing?

// The prefix has the same format as the validation error wrapper.
msgPrefix := fmt.Sprintf("%s %q (%d): ", desc.TypeName(), desc.GetName(), desc.GetID())
if msg[:len(msgPrefix)] == msgPrefix {
msgPrefix = ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack -- this code makes me somewhat sad. At some point hopefully it gets reworked with templates

@postamar postamar force-pushed the add-desc-validation-followup branch from d9d9a59 to d443b82 Compare February 24, 2021 17:34
@postamar
Copy link
Contributor Author

Thanks a lot for the speedy review @ajwerner. I believe I've addressed your concerns in the latest revision I pushed. I also added some unit tests in dbdesc and schemadesc which were missing.

Random thought: test coverage is always going to be spotty unless we enforce it somehow. Ideally I'd like the test suite to fail should there exist a specific validation error which doesn't have a corresponding unit test. Perhaps we could use the type system to this effect:

  • force Report to only accept a specific error subtype for validations,
  • which each need to be registered,
  • write a test which matches the contents of the registry with the unit test cases.
  • those test cases contain an expected error field, right now it's a string but it could be of a type that matches the registry entry somehow.

This validation error subtyping might have other interesting benefits like making the error message formatting more consistent, allowing for errors.Is assertions, etc.

@postamar postamar force-pushed the add-desc-validation-followup branch from d443b82 to 2eec21b Compare February 24, 2021 20:12
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This validation error subtyping might have other interesting benefits like making the error message formatting more consistent, allowing for errors.Is assertions, etc.

This is a fun idea. I think the way I'd do this is to have a hook in a testing knob to wrap the error reporter and then accumulate all of the errors generated through the test and then at the end of the test cross check that against a list of errors we expect to exist. We could do this leveraging types, or not. We could enforce this all further by saying that all of the error types need to be registered in some form or another at init time.

Didn't finish but here's a quick comment

Reviewed 26 of 69 files at r1, 1 of 5 files at r2, 14 of 23 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, and @postamar)


pkg/sql/scrub.go, line 433 at r3 (raw file):

		table, ok := desc.(catalog.TableDescriptor)
		if !ok {
			return nil, catalog.ErrDescriptorNotFound

any interest in telling us which descriptor wasn't found with an errors.Wrapf?

Scrub is sort of deprecated so not sure it matters too much.


pkg/sql/catalog/validate.go, line 139 at r3 (raw file):

Quoted 4 lines of code…
type validationErrorAccumulatorImpl struct {
	wrapPrefix string
	errors     []error
}

minor nit, make a separate type for the ValidationErrors implementation.

type validationErrors struct {
    errors []error
}

func (*validationErrors) sealed() {}

type validationErrorAccumulator struct {
    validationErrors
    wrapPrefix string
}

Also, I feel like the pattern where an unexported struct which implements an exported interface of the same name is pretty clear that it's the impl.

@postamar postamar force-pushed the add-desc-validation-followup branch from 2eec21b to 66db0fd Compare February 25, 2021 00:27
Copy link
Contributor Author

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, and @postamar)


pkg/sql/scrub.go, line 433 at r3 (raw file):

Previously, ajwerner wrote…

any interest in telling us which descriptor wasn't found with an errors.Wrapf?

Scrub is sort of deprecated so not sure it matters too much.

Sure! In fact I went a bit further and made use of ErrDescNotFound in ValidationDescGetter and friends.


pkg/sql/catalog/validate.go, line 139 at r3 (raw file):

Previously, ajwerner wrote…
type validationErrorAccumulatorImpl struct {
	wrapPrefix string
	errors     []error
}

minor nit, make a separate type for the ValidationErrors implementation.

type validationErrors struct {
    errors []error
}

func (*validationErrors) sealed() {}

type validationErrorAccumulator struct {
    validationErrors
    wrapPrefix string
}

Also, I feel like the pattern where an unexported struct which implements an exported interface of the same name is pretty clear that it's the impl.

Done.

@postamar postamar force-pushed the add-desc-validation-followup branch from 66db0fd to 43e6695 Compare February 25, 2021 15:55
@postamar
Copy link
Contributor Author

I rebased w/ linter error fix + a couple more validation test cases. I also added a new small commit which protects doctor validations from runtime errors like nil dereferences. I'm uncomfortable about the possibility of a customer running doctor to debug a cluster and doctor itself crashing 😬

@postamar postamar force-pushed the add-desc-validation-followup branch from 43e6695 to cc94a13 Compare February 25, 2021 20:31
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: just a couple of questions

Reviewed 14 of 69 files at r1, 2 of 5 files at r2, 4 of 23 files at r3, 23 of 23 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, and @postamar)


pkg/sql/catalog/dbdesc/database_desc.go, line 343 at r4 (raw file):

		report := func(err error) {
			vea.Report(errors.Wrapf(err, "schema mapping entry %q (%d)",
				errors.Safe(schemaName), errors.Safe(schemaInfo.ID)))

descpb.ID is always safe to format; no need to wrap values of that type in Safe.

func (ID) SafeValue() {}


pkg/sql/catalog/descs/collection.go, line 1453 at r4 (raw file):

}

// ValidateUncommittedDescriptors validates all uncommitted descriptors

maybe comment that this goes to the store for all of the descriptors?

When I was initially thinking about this stuff, I was thinking that we wouldn't go re-read them all but in practice this seems all fine and good given the new parallelism.


pkg/sql/catalog/tabledesc/table.go, line 179 at r4 (raw file):

}

// GetConstraintInfo returns a summary of all constraints on the table.

does it matter that now the constraint details won't have descriptors? Was anybody using that? If not, can we deprecate that field altogether and then remove it in the subsequent release?


pkg/sql/pgwire/testdata/pgtest/notice, line 58 at r4 (raw file):

CommandComplete
----
{"Severity":"NOTICE","SeverityUnlocalized":"","Code":"00000","Message":"the data for dropped indexes is reclaimed asynchronously","Detail":"","Hint":"The reclamation delay can be customized in the zone configuration for the table.","Position":0,"InternalPosition":0,"InternalQuery":"","Where":"","SchemaName":"","TableName":"","ColumnName":"","DataTypeName":"","ConstraintName":"","File":"drop_index.go","Line":519,"Routine":"dropIndexByName","UnknownFields":null}

:( every time this happens I cry a little inside

@postamar postamar force-pushed the add-desc-validation-followup branch from cc94a13 to dc62504 Compare March 1, 2021 13:57
Copy link
Contributor Author

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking at this.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @miretskiy, and @postamar)


pkg/sql/catalog/dbdesc/database_desc.go, line 343 at r4 (raw file):

Previously, ajwerner wrote…

descpb.ID is always safe to format; no need to wrap values of that type in Safe.

func (ID) SafeValue() {}

Done.


pkg/sql/catalog/descs/collection.go, line 1453 at r4 (raw file):

Previously, ajwerner wrote…

maybe comment that this goes to the store for all of the descriptors?

When I was initially thinking about this stuff, I was thinking that we wouldn't go re-read them all but in practice this seems all fine and good given the new parallelism.

Done.


pkg/sql/catalog/tabledesc/table.go, line 179 at r4 (raw file):

Previously, ajwerner wrote…

does it matter that now the constraint details won't have descriptors? Was anybody using that? If not, can we deprecate that field altogether and then remove it in the subsequent release?

Those do appear to be used in information_schema.go, pg_catalog.go and such places.


pkg/sql/pgwire/testdata/pgtest/notice, line 58 at r4 (raw file):

Previously, ajwerner wrote…

:( every time this happens I cry a little inside

If I had a dollar every time this happened I'd be able to buy myself a latte every month or two!

@postamar postamar force-pushed the add-desc-validation-followup branch from dc62504 to 523ed05 Compare March 1, 2021 14:39
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 24 of 24 files at r5.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, and @postamar)

Marius Posta added 2 commits March 1, 2021 10:18
Previously, descriptor validation suffered from a few shortcomings:
1. The methods to call were inconsistent across descriptor types, and
   the error messages they return are inconsistent as well.
2. Those methods made ad-hoc use of catalog.DescGetter to walk through
   the descriptor reference graph, and could sometimes walk surprisingly
   far. This is particularly true for type descriptors, which hold
   references to other types.
3. To complicate things further, DescGetter implementations which read
   descriptors from storage themselves perform validation as well.
   Although it is perfectly sensible to systematically validate
   descriptors when read, there is a circularity problem at hand which
   would benefit from being solved without melting the maintainer's
   brain in the process.
4. The validation methods return an error type, making it awkward to
   return multiple errors. Tools like doctor would be made more useful
   if they could report more than only the first encountered error.

Recently we introduced a change that added descriptor validation on
write. This change involves validating all uncommitted descriptors in
bulk and this in turn favours a bulk approach to reading all their
referenced descriptors from storage.

With this in mind, this commit adds a GetReferencedDescIDs method to
catalog.Descriptor which returns the IDs of all descriptors referenced
by the receiver. By calling this method on all descriptors we intend to
validate and by reading all referenced descriptors in one kv.Batch and
stuffing them in a catalog.MapDescGetter, we can now validate using this
in-memory DescGetter instead.

Turning validation into this multi-step process means it can no longer
be invoked as a method call on the target descriptor. This commit moves
the entry point to descriptor validation to the catalog.Validate
function. The descriptor objects themselves still define the validation
checks themselves but these are effectively made inaccessible from
outside the catalog package. The rationale behind this is to enforce
order in the sequence of checks performed and to enforce some uniformity
in the formatting of the error messages. The checks are grouped into
three tiers: internal consistency checks, cross-reference consistency
checks, and transactional consistency checks.

All this also helps reduce the number of call sites of descriptor
validations as well as increase the scope of the validations. This has
uncovered a few bugs related to schemas and temporary tables. This
effort has also helped identify validation checks which have since been
made redundant, either by other existing rules or by the new
validation-on-write behaviour. Finally, this has allowed us to strip
doctor of its own duplicated (and dodgy) validation logic and simply
report all validation errors instead.

Release justification: This commit is safe for this release because it
consists of bug fixes and changes involving descriptor validation. If
anything these changes will help uncover descriptor corruptions which
were previously unnoticed, both because we're extending the validation
suite and because the doctor tool produces richer output.

Release note (cli change): The doctor tool can now report multiple
descriptor validation failures per descriptor.
Previously, doctor could crash when validating corrupted descriptors.
The validation code is usually cautious enough to not dereference nils
or the like but this isn't enforced in any way.

This commit protects the validation calls in doctor with panic recovery,
allowing doctor to continue and validate the remaining descriptors.

Release justification: non-production code change.
Release note: None
@postamar postamar force-pushed the add-desc-validation-followup branch from 523ed05 to 4d9034c Compare March 1, 2021 15:19
@postamar
Copy link
Contributor Author

postamar commented Mar 1, 2021

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 1, 2021

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants