-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: public schema long running migration #72662
sql: public schema long running migration #72662
Conversation
9e876fa
to
0022cbc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this! I have a few comments, nothing major I don't think.
0022cbc
to
90ac8dc
Compare
This looks good to me, good work! I'm hoping @ajwerner will also do a review pass, migrations are quite sensitive and are worth being careful about. |
a129835
to
c056510
Compare
Gonna add one more test using a backup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm generally good on this. Just minor things. This is a classic example of something that'd be nice to exercise against real store dirs from the previous version.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @RichardJCai)
pkg/migration/migrations/public_schema_migration.go, line 72 at r8 (raw file):
for _, dbID := range databaseIDs { fmt.Println("dbID:", dbID)
detritus
pkg/migration/migrations/public_schema_migration.go, line 86 at r8 (raw file):
return d.CollectionFactory.Txn(ctx, d.InternalExecutor, d.DB, func( ctx context.Context, txn *kv.Txn, descriptors *descs.Collection, ) error {
nit: buy yourself an extra layer of indentation by making the body of this a function all on its own that maybe takes just the codec, settings, txn, collection, dbID
? Generally good practice to scope down your deps when it's easy.
pkg/migration/migrations/public_schema_migration.go, line 129 at r8 (raw file):
if desc.DatabaseDesc().Schemas == nil { dbDesc.Schemas = map[string]descpb.DatabaseDescriptor_SchemaInfo{
while you're here, thoughts on just adding a method to the dbdesc.Mutable
which will construct the map if it doesn't exist and then add the entry? Seems bad that we have to do this.
pkg/migration/migrations/public_schema_migration.go, line 190 at r8 (raw file):
} for _, modified := range modifiedDescs { err := descriptors.WriteDescToBatch(
I'm not sure how real of a concern this is, but it seems to me like we could end up with a large batch here. It's unlikely to be larger than the maximum size of 16 MiB because, we'll generally folks don't have hundreds of thousands of descriptors in their database, but, it's still a concern that would suck to run into. Thoughts on passing in the *Txn
and then also having this return a batch, and then, when the batch reaches some size, send it and create a new one. You can test this by lowering the KV max command size to something quite small and then making some descriptors with some biggish strings. This is definitely an edge case concern.
We got another small issue I think, restoring a database doesn't actually add an entry for the public schema in the namespace table. Notice theres no Output:
cc @adityamaru |
Oof, bet that's going to lead to some odd bugs. |
I think we should make a migration for this as well? |
Definitely. |
Created an issue for the missing |
762b901
to
6ae9bda
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)
pkg/migration/migrations/public_schema_migration.go, line 136 at r5 (raw file):
Previously, postamar (Marius Posta) wrote…
nit: because
dbDesc
is mutable, you can actually dodbDesc.Schemas = ...
. If you'd rather not access the embedded struct's field directly,dbDesc.DatabaseDescriptor().Schemas = ...
is the way to go.
Done
pkg/migration/migrations/public_schema_migration.go, line 165 at r5 (raw file):
Previously, postamar (Marius Posta) wrote…
Yes this is what I meant, sorry for not being clear.
Done
pkg/migration/migrations/public_schema_migration.go, line 72 at r8 (raw file):
Previously, ajwerner wrote…
detritus
Done
pkg/migration/migrations/public_schema_migration.go, line 86 at r8 (raw file):
Previously, ajwerner wrote…
return d.CollectionFactory.Txn(ctx, d.InternalExecutor, d.DB, func( ctx context.Context, txn *kv.Txn, descriptors *descs.Collection, ) error {
nit: buy yourself an extra layer of indentation by making the body of this a function all on its own that maybe takes just the
codec, settings, txn, collection, dbID
? Generally good practice to scope down your deps when it's easy.
Done.
pkg/migration/migrations/public_schema_migration.go, line 129 at r8 (raw file):
Previously, ajwerner wrote…
if desc.DatabaseDesc().Schemas == nil { dbDesc.Schemas = map[string]descpb.DatabaseDescriptor_SchemaInfo{
while you're here, thoughts on just adding a method to the
dbdesc.Mutable
which will construct the map if it doesn't exist and then add the entry? Seems bad that we have to do this.
Did this in a separate PR since we have other places where we do this
#73942
pkg/migration/migrations/public_schema_migration.go, line 190 at r8 (raw file):
Previously, ajwerner wrote…
I'm not sure how real of a concern this is, but it seems to me like we could end up with a large batch here. It's unlikely to be larger than the maximum size of 16 MiB because, we'll generally folks don't have hundreds of thousands of descriptors in their database, but, it's still a concern that would suck to run into. Thoughts on passing in the
*Txn
and then also having this return a batch, and then, when the batch reaches some size, send it and create a new one. You can test this by lowering the KV max command size to something quite small and then making some descriptors with some biggish strings. This is definitely an edge case concern.
Addressed all but this one so far, trying to figure out exactly how to do this.
6ae9bda
to
bcce7d5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)
pkg/migration/migrations/public_schema_migration.go, line 190 at r8 (raw file):
Previously, RichardJCai (Richard Cai) wrote…
Addressed all but this one so far, trying to figure out exactly how to do this.
Added batching PTAL.
Release note: None
bcce7d5
to
2d93832
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one minor comment. Looking forward to stamping this. Good work!
Reviewed 210 of 213 files at r5, 4 of 4 files at r6, 1 of 3 files at r9, 4 of 4 files at r10, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @RichardJCai)
pkg/migration/migrations/migrations.go, line 62 at r10 (raw file):
), migration.NewTenantMigration( "seed system.span_configurations with configs for existing for existing tenants",
s/for existing for existing tenants/for existing tenants/
pkg/migration/migrations/public_schema_migration.go, line 101 at r10 (raw file):
if !found { return errors.Newf("expected to find database with id %d", dbID) }
Piggybacking on Andrew's comment: you don't need to check found
if you pass the Required
flag to the descriptors
API call above. Use tree.DatabaseLookupFlags{Required: true}
and if it's not found, err
will be set appropriately, meaning you can ignore found
. This will also bypass leasing and all that, so it's what you want to do here.
pkg/migration/migrations/public_schema_migration.go, line 175 at r10 (raw file):
batch := txn.NewBatch() for _, desc := range allDescriptors { b := desc.NewBuilder()
You'll want to move this further down, below the if-continue block which will eliminate most descriptors from further consideration. Under the hood, creating a descriptor builder does a protoutil.Clone
, which isn't cheap. We've ran into perf issues from this kind of thing before, which is why I'm being careful.
20171b1
to
634bc89
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, also made a minor change such that we don't run the test that creates 500 tables under race.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @postamar)
pkg/migration/migrations/migrations.go, line 62 at r10 (raw file):
Previously, postamar (Marius Posta) wrote…
s/for existing for existing tenants/for existing tenants/
Done.
pkg/migration/migrations/public_schema_migration.go, line 101 at r10 (raw file):
Previously, postamar (Marius Posta) wrote…
Piggybacking on Andrew's comment: you don't need to check
found
if you pass theRequired
flag to thedescriptors
API call above. Usetree.DatabaseLookupFlags{Required: true}
and if it's not found,err
will be set appropriately, meaning you can ignorefound
. This will also bypass leasing and all that, so it's what you want to do here.
Done.
pkg/migration/migrations/public_schema_migration.go, line 175 at r10 (raw file):
Previously, postamar (Marius Posta) wrote…
You'll want to move this further down, below the if-continue block which will eliminate most descriptors from further consideration. Under the hood, creating a descriptor builder does a
protoutil.Clone
, which isn't cheap. We've ran into perf issues from this kind of thing before, which is why I'm being careful.
Thanks good call
When restoring a database, a namespace entry for the public schema was not created. Release note: None
634bc89
to
f60150d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r11, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)
Thanks for the reviews 😃 bors r=postamar |
Build succeeded: |
sql: public schema long running migration
Release note: None