-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: Add support for drop table inside the new schema changer #65525
Conversation
2d05121
to
b9e44ed
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you picked up an extra commit by accident.
Reviewable status: complete! 0 of 0 LGTMs obtained
Should be fixed accidentally messed up a rebase. |
2697f3f
to
32276ab
Compare
1cb99e2
to
72c7d36
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a first pass, let me know what you think.
Reviewed 16 of 16 files at r1, 9 of 9 files at r2, 7 of 7 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @fqazi)
pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 479 at r1 (raw file):
fks = table.TableDesc().InboundFKs } newFks := make([]descpb.ForeignKeyConstraint, 0, len(fks))
len(fks)-1
?
pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 41 at r3 (raw file):
Quoted 10 lines of code…
// MutableDescGetter encapsulates the logic to retrieve descriptors. // All retrieved descriptors are modified. type MutableDescGetter interface { GetMutableTypeByID(ctx context.Context, id descpb.ID) (*typedesc.Mutable, error) GetImmutableDatabaseByID(ctx context.Context, id descpb.ID) (catalog.DatabaseDescriptor, error) GetMutableTableByID(ctx context.Context, id descpb.ID) (*tabledesc.Mutable, error) AddDrainedName(id descpb.ID, nameInfo descpb.NameInfo) SubmitDrainedNames(ctx context.Context, codec keys.SQLCodec, ba *kv.Batch) error RemoveObjectComments(ctx context.Context, id descpb.ID) error }
This name is starting to feel pretty inappropriate. What if we have sub-interfaces:
type Catalog interface {
DescriptorReader
NamespaceWriter
CommentWriter
}
pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 41 at r3 (raw file):
GetMutableTableByID(ctx context.Context, id descpb.ID) (*tabledesc.Mutable, error) AddDrainedName(id descpb.ID, nameInfo descpb.NameInfo) SubmitDrainedNames(ctx context.Context, codec keys.SQLCodec, ba *kv.Batch) error
In retrospect I feel pretty bad about exposing *kv
or codec
here. This comment doesn't need to be dealt with here but I think we're walking down a very bad path. The hope was that all of the dependencies of scexec
could be mocked. I think that's a really important principle. Today some of the catalog stuff makes that somewhat hard. What I do feel strongly about is that we try to keep the dependencies inside of its subpackages narrow. Importantly, I never want us to be talking about kv
construct. We're going to at some point want to be able to run this entire subsystem based on descriptors that just sit in memory.
I'm actively working on knocking the *kv.Txn
out of the various descriptor retrieval interfaces which should then make it straightforward to knock that out of the scexec
package.
pkg/sql/schemachanger/scop/mutation.go, line 103 at r2 (raw file):
TableID descpb.ID ColumnID descpb.ColumnID SequenceIDs []descpb.ID
I'd prefer if this were a separate op to remove the back-reference. I'd go so far as to ask whether it's possible to make every op act on at most one descriptor. I would like that very much.
I'll even go so far as to say that the descriptor retrieval we do at the top of each op is an eye-sore and my hope here has been to have the layer above actually fetch the descriptor for a given op.
pkg/sql/schemachanger/scpb/elements.go, line 246 at r1 (raw file):
// DescriptorSet implements the Element interface. func (e *ForeignKey) DescriptorSet() []descpb.ID { return []descpb.ID{e.OriginID, e.ReferenceID} }
Rather than this, how do you feel about making a ForeignKeyBackreference
an element on its own? Or, perhaps, OutboundForeignKey
and InboundForeignKey
? They should always end up being paired but each one is attached to just one descriptor. My sense is that keeping a 1:1 mapping between elements and descriptor IDs is valuable, so if we can avoid breaking that, I'd prefer it.
Perhaps we should put up some declarative correctness rules like if you've got one OutboundForeignKey
then you should have a parallel InboundForeignKey
in the same direction. FWIW I do not thing they have the same lifecycle in terms of operations, so this decomposition is likely to be nice for planning purposes.
pkg/sql/schemachanger/scpb/scpb.proto, line 118 at r2 (raw file):
option (gogoproto.equal) = true; uint32 table_id = 1 [(gogoproto.customname) = "TableID", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb.ID"]; uint32 owner_table_id = 2 [(gogoproto.customname) = "OwnerTableID", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb.ID"];
Would it be wild to have SequenceOwnership
as an element?
As it stands, it's not obvious to me that. We'll be locking the table that's being upgraded in the back-reference. I think that's interesting and maybe fine. This brings up the whole "locking" thing that we've not formally specified yet. nx
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 479 at r1 (raw file):
Previously, ajwerner wrote…
len(fks)-1
?
Done.
pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 41 at r3 (raw file):
Previously, ajwerner wrote…
// MutableDescGetter encapsulates the logic to retrieve descriptors. // All retrieved descriptors are modified. type MutableDescGetter interface { GetMutableTypeByID(ctx context.Context, id descpb.ID) (*typedesc.Mutable, error) GetImmutableDatabaseByID(ctx context.Context, id descpb.ID) (catalog.DatabaseDescriptor, error) GetMutableTableByID(ctx context.Context, id descpb.ID) (*tabledesc.Mutable, error) AddDrainedName(id descpb.ID, nameInfo descpb.NameInfo) SubmitDrainedNames(ctx context.Context, codec keys.SQLCodec, ba *kv.Batch) error RemoveObjectComments(ctx context.Context, id descpb.ID) error }
This name is starting to feel pretty inappropriate. What if we have sub-interfaces:
type Catalog interface { DescriptorReader NamespaceWriter CommentWriter }
Renamed this, will rework the interfaces later.
pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 41 at r3 (raw file):
Previously, ajwerner wrote…
In retrospect I feel pretty bad about exposing
*kv
orcodec
here. This comment doesn't need to be dealt with here but I think we're walking down a very bad path. The hope was that all of the dependencies ofscexec
could be mocked. I think that's a really important principle. Today some of the catalog stuff makes that somewhat hard. What I do feel strongly about is that we try to keep the dependencies inside of its subpackages narrow. Importantly, I never want us to be talking aboutkv
construct. We're going to at some point want to be able to run this entire subsystem based on descriptors that just sit in memory.I'm actively working on knocking the
*kv.Txn
out of the various descriptor retrieval interfaces which should then make it straightforward to knock that out of thescexec
package.
Currently there isn't a clean path here. Maybe this should be popped out of the interface side, and let submit be on the implementation only?
pkg/sql/schemachanger/scpb/scpb.proto, line 118 at r2 (raw file):
Previously, ajwerner wrote…
Would it be wild to have
SequenceOwnership
as an element?
As it stands, it's not obvious to me that. We'll be locking the table that's being upgraded in the back-reference. I think that's interesting and maybe fine. This brings up the whole "locking" thing that we've not formally specified yet. nx
Added one the way I have it setup is that drop SequenceOwner -> Sequence (DROPPING) being dropped and Table -> SequenceOwner. For alters, we will need to think a bit about how to handle that better.
Previously, the logic for DROP table was completely imperative, which would make it hard to implement features like transactional schema changes in the future. To address this, this patch adds initial support for DROP TABLE with support for clearing out type back references and foreign key. Basic structures for tracking and adding secondary indexes are also introduced. Release note: None
Previously, the drop table in the new schema change left behind comments for objects. This was inadequate because, entries would be left behind in the comments table. These changes enhance the infrastructure to add an internal executor and clean up logic at the same time. Release note: None
Previously, the implementation of drop table inside the new schema changer did not check for cascade flags or if the required authorization was there. This was inadequate because, only cascaded drops were supported, these changes added the appropriate checks. Release note: None
Previously, the new schema change didn't implement the code paths for dropping interleaved tables. This was inadequate because the existing logic does not deal with the them properly. To address this, this patch disables the new schema changer for interleaved tables. Release note: None
Previously, we had a monolithic MutDescGetter interface, which exposed different operations outside of jut descriptor ones. This was inadequate because in the longer term it would turn into a maintain once nightmare. This patch turns it into Catalog interface with child operations. Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! Just minor things
Reviewed 5 of 12 files at r5, 1 of 2 files at r9, 2 of 7 files at r11, 1 of 4 files at r13.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @fqazi)
pkg/sql/schemachanger/scbuild/sequence.go, line 28 at r14 (raw file):
// dropSequenceDesc builds targets and transformations using a descriptor. func (b *buildContext) dropSequenceDesc( ctx context.Context, table catalog.TableDescriptor, cascade tree.DropBehavior,
It's confusing to me that this is called table
. I know it is a TableDescriptor
, but that feels historical. Can we call the variable sequence
or seq
to clarify that it's the sequence being dropped and not some table involved in a back-reference?
pkg/sql/schemachanger/scbuild/sequence.go, line 70 at r14 (raw file):
sequenceOwnedBy := &scpb.SequenceOwnedBy{ TableID: table.GetID(), OwnerTableID: table.GetSequenceOpts().SequenceOwner.OwnerTableID}
This should only be set if the sequence owner is non-zero, right?
pkg/sql/schemachanger/scbuild/table.go, line 579 at r14 (raw file):
func (b *buildContext) dropTable(ctx context.Context, n *tree.DropTable) { // Find the table first. for _, name := range n.Names {
How would you feel about making a helper for the body of this loop. It's a lot of code to be indented and I find myself forgetting I'm in a loop.
More generally, I could see this logic chunked up a bit more into functions. It's hard to see the big picture when reading the code.
pkg/sql/schemachanger/scexec/scmutationexec/scmutationexec.go, line 41 at r3 (raw file):
Previously, fqazi (Faizan Qazi) wrote…
Currently there isn't a clean path here. Maybe this should be popped out of the interface side, and let submit be on the implementation only?
Do we call SubmitDrainedNames
or SubmitAllJobs
in this package? Seems like it gets called only in scexec
. I think we can just remove those methods from this interface.
pkg/sql/schemachanger/scpb/scpb.proto, line 169 at r14 (raw file):
message SequenceOwnedBy { uint32 table_id = 1 [(gogoproto.customname) = "TableID", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb.ID"];
nit: sequence_id
? I feel like the fact that it's a table is a crufty implementation detail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 16 files at r4, 2 of 4 files at r13, 6 of 9 files at r15.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)
pkg/sql/schemachanger/scbuild/testdata/drop_table, line 41 at r15 (raw file):
DROP TABLE defaultdb.shipments CASCADE; ---- - target:
These are hard to read. Let's do something about that (later).
Previous, drop sequence inside the new schema changer used to refer to descriptor ID's as table_id's when it would be cleaner to refer to sequence ID. Also clean up the code for drop table at the same time. This isn't a functional change and only cleans up the code to be more readable. These changes also removed things from schema change execution interfaces that shouldn't be shared across boundaries for schema change execution. Release note: None.
bors r=ajwerner |
Build succeeded: |
sql: Initial support for DROP TABLE in new schema changer
Previously, the logic for DROP table was completely imperative,
which would make it hard to implement features like transactional
schema changes in the future. To address this, this patch
adds initial support for DROP TABLE with support for clearing
out type back references and foreign key. Basic structures for
tracking and adding secondary indexes are also introduced.