sql: Add support for drop table inside the new schema changer #65525

fqazi · 2021-05-20T18:31:23Z

sql: Initial support for DROP TABLE in new schema changer

Previously, the logic for DROP table was completely imperative,
which would make it hard to implement features like transactional
schema changes in the future. To address this, this patch
adds initial support for DROP TABLE with support for clearing
out type back references and foreign key. Basic structures for
tracking and adding secondary indexes are also introduced.

cockroach-teamcity · 2021-05-20T18:31:34Z

This change is

ajwerner

I think you picked up an extra commit by accident.

Reviewable status: complete! 0 of 0 LGTMs obtained

fqazi · 2021-05-26T19:26:13Z

Should be fixed accidentally messed up a rebase.

ajwerner

Here's a first pass, let me know what you think.

Reviewed 16 of 16 files at r1, 9 of 9 files at r2, 7 of 7 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @fqazi)

pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 479 at r1 (raw file):

		fks = table.TableDesc().InboundFKs
	}
	newFks := make([]descpb.ForeignKeyConstraint, 0, len(fks))

len(fks)-1?

pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 41 at r3 (raw file):

Quoted 10 lines of code…

// MutableDescGetter encapsulates the logic to retrieve descriptors.
// All retrieved descriptors are modified.
type MutableDescGetter interface {
	GetMutableTypeByID(ctx context.Context, id descpb.ID) (*typedesc.Mutable, error)
	GetImmutableDatabaseByID(ctx context.Context, id descpb.ID) (catalog.DatabaseDescriptor, error)
	GetMutableTableByID(ctx context.Context, id descpb.ID) (*tabledesc.Mutable, error)
	AddDrainedName(id descpb.ID, nameInfo descpb.NameInfo)
	SubmitDrainedNames(ctx context.Context, codec keys.SQLCodec, ba *kv.Batch) error
	RemoveObjectComments(ctx context.Context, id descpb.ID) error
}

This name is starting to feel pretty inappropriate. What if we have sub-interfaces:

type Catalog interface {
    DescriptorReader
    NamespaceWriter
    CommentWriter
}

pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 41 at r3 (raw file):

	GetMutableTableByID(ctx context.Context, id descpb.ID) (*tabledesc.Mutable, error)
	AddDrainedName(id descpb.ID, nameInfo descpb.NameInfo)
	SubmitDrainedNames(ctx context.Context, codec keys.SQLCodec, ba *kv.Batch) error

In retrospect I feel pretty bad about exposing *kv or codec here. This comment doesn't need to be dealt with here but I think we're walking down a very bad path. The hope was that all of the dependencies of scexec could be mocked. I think that's a really important principle. Today some of the catalog stuff makes that somewhat hard. What I do feel strongly about is that we try to keep the dependencies inside of its subpackages narrow. Importantly, I never want us to be talking about kv construct. We're going to at some point want to be able to run this entire subsystem based on descriptors that just sit in memory.

I'm actively working on knocking the *kv.Txn out of the various descriptor retrieval interfaces which should then make it straightforward to knock that out of the scexec package.

pkg/sql/schemachanger/scop/mutation.go, line 103 at r2 (raw file):

	TableID     descpb.ID
	ColumnID    descpb.ColumnID
	SequenceIDs []descpb.ID

I'd prefer if this were a separate op to remove the back-reference. I'd go so far as to ask whether it's possible to make every op act on at most one descriptor. I would like that very much.

I'll even go so far as to say that the descriptor retrieval we do at the top of each op is an eye-sore and my hope here has been to have the layer above actually fetch the descriptor for a given op.

pkg/sql/schemachanger/scpb/elements.go, line 246 at r1 (raw file):

// DescriptorSet implements the Element interface.
func (e *ForeignKey) DescriptorSet() []descpb.ID { return []descpb.ID{e.OriginID, e.ReferenceID} }

Rather than this, how do you feel about making a ForeignKeyBackreference an element on its own? Or, perhaps, OutboundForeignKey and InboundForeignKey? They should always end up being paired but each one is attached to just one descriptor. My sense is that keeping a 1:1 mapping between elements and descriptor IDs is valuable, so if we can avoid breaking that, I'd prefer it.

Perhaps we should put up some declarative correctness rules like if you've got one OutboundForeignKey then you should have a parallel InboundForeignKey in the same direction. FWIW I do not thing they have the same lifecycle in terms of operations, so this decomposition is likely to be nice for planning purposes.

pkg/sql/schemachanger/scpb/scpb.proto, line 118 at r2 (raw file):

  option (gogoproto.equal) = true;
  uint32 table_id = 1 [(gogoproto.customname) = "TableID", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb.ID"];
  uint32 owner_table_id = 2  [(gogoproto.customname) = "OwnerTableID", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb.ID"];

Would it be wild to have SequenceOwnership as an element?
As it stands, it's not obvious to me that. We'll be locking the table that's being upgraded in the back-reference. I think that's interesting and maybe fine. This brings up the whole "locking" thing that we've not formally specified yet. nx

fqazi

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)

pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 479 at r1 (raw file):

Previously, ajwerner wrote…

len(fks)-1?

Done.

pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 41 at r3 (raw file):

Previously, ajwerner wrote…

// MutableDescGetter encapsulates the logic to retrieve descriptors.
// All retrieved descriptors are modified.
type MutableDescGetter interface {
	GetMutableTypeByID(ctx context.Context, id descpb.ID) (*typedesc.Mutable, error)
	GetImmutableDatabaseByID(ctx context.Context, id descpb.ID) (catalog.DatabaseDescriptor, error)
	GetMutableTableByID(ctx context.Context, id descpb.ID) (*tabledesc.Mutable, error)
	AddDrainedName(id descpb.ID, nameInfo descpb.NameInfo)
	SubmitDrainedNames(ctx context.Context, codec keys.SQLCodec, ba *kv.Batch) error
	RemoveObjectComments(ctx context.Context, id descpb.ID) error
}

This name is starting to feel pretty inappropriate. What if we have sub-interfaces:

type Catalog interface {
    DescriptorReader
    NamespaceWriter
    CommentWriter
}

Renamed this, will rework the interfaces later.

pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 41 at r3 (raw file):

Previously, ajwerner wrote…

In retrospect I feel pretty bad about exposing *kv or codec here. This comment doesn't need to be dealt with here but I think we're walking down a very bad path. The hope was that all of the dependencies of scexec could be mocked. I think that's a really important principle. Today some of the catalog stuff makes that somewhat hard. What I do feel strongly about is that we try to keep the dependencies inside of its subpackages narrow. Importantly, I never want us to be talking about kv construct. We're going to at some point want to be able to run this entire subsystem based on descriptors that just sit in memory.

I'm actively working on knocking the *kv.Txn out of the various descriptor retrieval interfaces which should then make it straightforward to knock that out of the scexec package.

Currently there isn't a clean path here. Maybe this should be popped out of the interface side, and let submit be on the implementation only?

pkg/sql/schemachanger/scpb/scpb.proto, line 118 at r2 (raw file):

Previously, ajwerner wrote…

Would it be wild to have SequenceOwnership as an element?
As it stands, it's not obvious to me that. We'll be locking the table that's being upgraded in the back-reference. I think that's interesting and maybe fine. This brings up the whole "locking" thing that we've not formally specified yet. nx

Added one the way I have it setup is that drop SequenceOwner -> Sequence (DROPPING) being dropped and Table -> SequenceOwner. For alters, we will need to think a bit about how to handle that better.

Previously, the logic for DROP table was completely imperative, which would make it hard to implement features like transactional schema changes in the future. To address this, this patch adds initial support for DROP TABLE with support for clearing out type back references and foreign key. Basic structures for tracking and adding secondary indexes are also introduced. Release note: None

Previously, the drop table in the new schema change left behind comments for objects. This was inadequate because, entries would be left behind in the comments table. These changes enhance the infrastructure to add an internal executor and clean up logic at the same time. Release note: None

Previously, the implementation of drop table inside the new schema changer did not check for cascade flags or if the required authorization was there. This was inadequate because, only cascaded drops were supported, these changes added the appropriate checks. Release note: None

Previously, the new schema change didn't implement the code paths for dropping interleaved tables. This was inadequate because the existing logic does not deal with the them properly. To address this, this patch disables the new schema changer for interleaved tables. Release note: None

Previously, we had a monolithic MutDescGetter interface, which exposed different operations outside of jut descriptor ones. This was inadequate because in the longer term it would turn into a maintain once nightmare. This patch turns it into Catalog interface with child operations. Release note: None

ajwerner

Looking good! Just minor things

Reviewed 5 of 12 files at r5, 1 of 2 files at r9, 2 of 7 files at r11, 1 of 4 files at r13.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @fqazi)

pkg/sql/schemachanger/scbuild/sequence.go, line 28 at r14 (raw file):

// dropSequenceDesc builds targets and transformations using a descriptor.
func (b *buildContext) dropSequenceDesc(
	ctx context.Context, table catalog.TableDescriptor, cascade tree.DropBehavior,

It's confusing to me that this is called table. I know it is a TableDescriptor, but that feels historical. Can we call the variable sequence or seq to clarify that it's the sequence being dropped and not some table involved in a back-reference?

pkg/sql/schemachanger/scbuild/sequence.go, line 70 at r14 (raw file):

	sequenceOwnedBy := &scpb.SequenceOwnedBy{
		TableID:      table.GetID(),
		OwnerTableID: table.GetSequenceOpts().SequenceOwner.OwnerTableID}

This should only be set if the sequence owner is non-zero, right?

pkg/sql/schemachanger/scbuild/table.go, line 579 at r14 (raw file):

func (b *buildContext) dropTable(ctx context.Context, n *tree.DropTable) {
	// Find the table first.
	for _, name := range n.Names {

How would you feel about making a helper for the body of this loop. It's a lot of code to be indented and I find myself forgetting I'm in a loop.

More generally, I could see this logic chunked up a bit more into functions. It's hard to see the big picture when reading the code.

pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 41 at r3 (raw file):

Previously, fqazi (Faizan Qazi) wrote…

Currently there isn't a clean path here. Maybe this should be popped out of the interface side, and let submit be on the implementation only?

Do we call SubmitDrainedNames or SubmitAllJobs in this package? Seems like it gets called only in scexec. I think we can just remove those methods from this interface.

pkg/sql/schemachanger/scpb/scpb.proto, line 169 at r14 (raw file):

message SequenceOwnedBy {
  uint32 table_id = 1 [(gogoproto.customname) = "TableID", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb.ID"];

nit: sequence_id? I feel like the fact that it's a table is a crufty implementation detail.

ajwerner

Reviewed 2 of 16 files at r4, 2 of 4 files at r13, 6 of 9 files at r15.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)

pkg/sql/schemachanger/scbuild/testdata/drop_table, line 41 at r15 (raw file):

DROP TABLE defaultdb.shipments CASCADE;
----
- target:

These are hard to read. Let's do something about that (later).

Previous, drop sequence inside the new schema changer used to refer to descriptor ID's as table_id's when it would be cleaner to refer to sequence ID. Also clean up the code for drop table at the same time. This isn't a functional change and only cleans up the code to be more readable. These changes also removed things from schema change execution interfaces that shouldn't be shared across boundaries for schema change execution. Release note: None.

fqazi · 2021-06-07T20:29:01Z

bors r=ajwerner

craig · 2021-06-07T21:27:01Z

Build succeeded:

GitHub CI (Cockroach)

fqazi requested a review from a team May 20, 2021 18:31

fqazi changed the title ~~sql: Add support for drop table inside the new schema changer~~ sql: Add support for drop table inside the new schema changer (NOT READY FOR REVIEW) May 20, 2021

fqazi force-pushed the dropTableStart branch from 1d26448 to b63a9a4 Compare May 20, 2021 21:01

fqazi changed the title ~~sql: Add support for drop table inside the new schema changer (NOT READY FOR REVIEW)~~ sql: Add support for drop table inside the new schema changer May 20, 2021

fqazi marked this pull request as ready for review May 20, 2021 21:34

fqazi force-pushed the dropTableStart branch 5 times, most recently from 2d05121 to b9e44ed Compare May 26, 2021 18:25

ajwerner reviewed May 26, 2021

View reviewed changes

fqazi force-pushed the dropTableStart branch from b9e44ed to f396562 Compare May 26, 2021 19:25

fqazi requested a review from ajwerner May 26, 2021 19:25

fqazi force-pushed the dropTableStart branch 15 times, most recently from 2697f3f to 32276ab Compare June 2, 2021 12:49

fqazi force-pushed the dropTableStart branch 3 times, most recently from 1cb99e2 to 72c7d36 Compare June 3, 2021 15:07

ajwerner reviewed Jun 3, 2021

View reviewed changes

fqazi force-pushed the dropTableStart branch from 72c7d36 to 9d200e5 Compare June 3, 2021 17:47

fqazi requested a review from ajwerner June 3, 2021 17:49

fqazi commented Jun 3, 2021

View reviewed changes

fqazi added 5 commits June 3, 2021 20:44

fqazi force-pushed the dropTableStart branch from 9d200e5 to f5e633f Compare June 4, 2021 00:44

ajwerner reviewed Jun 7, 2021

View reviewed changes

fqazi requested a review from ajwerner June 7, 2021 19:00

ajwerner approved these changes Jun 7, 2021

View reviewed changes

fqazi force-pushed the dropTableStart branch from 4515923 to 33f36de Compare June 7, 2021 20:26

craig bot merged commit d775083 into cockroachdb:master Jun 7, 2021

fqazi deleted the dropTableStart branch June 8, 2021 13:44

fqazi mentioned this pull request Jun 8, 2021

sql: implement DROP TABLE in new schema changer #64680

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: Add support for drop table inside the new schema changer #65525

sql: Add support for drop table inside the new schema changer #65525

fqazi commented May 20, 2021

cockroach-teamcity commented May 20, 2021

ajwerner left a comment

fqazi commented May 26, 2021 •

edited

Loading

ajwerner left a comment

fqazi left a comment

ajwerner left a comment

ajwerner left a comment

fqazi commented Jun 7, 2021

craig bot commented Jun 7, 2021

sql: Add support for drop table inside the new schema changer #65525

sql: Add support for drop table inside the new schema changer #65525

Conversation

fqazi commented May 20, 2021

cockroach-teamcity commented May 20, 2021

ajwerner left a comment

Choose a reason for hiding this comment

fqazi commented May 26, 2021 • edited Loading

ajwerner left a comment

Choose a reason for hiding this comment

fqazi left a comment

Choose a reason for hiding this comment

ajwerner left a comment

Choose a reason for hiding this comment

ajwerner left a comment

Choose a reason for hiding this comment

fqazi commented Jun 7, 2021

craig bot commented Jun 7, 2021

fqazi commented May 26, 2021 •

edited

Loading