-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
upgrade: adding is_draining to system.sql_instance can fail #135737
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we write a test that tests this upgrade with a multiregion system DB?
) error { | ||
finalDescriptor := systemschema.SQLInstancesTable() | ||
// Replace the stored descriptor with the bootstrap descriptor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this comment outdated now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
50584f1
to
c36ef72
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rafiss This PR reintroduces TestMrSystemDatabaseUpgrade which does that for MR system databases. 24.3 has the same test but for some reason it didn't use Latest as the final version.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @stevendanna)
) error { | ||
finalDescriptor := systemschema.SQLInstancesTable() | ||
// Replace the stored descriptor with the bootstrap descriptor. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
looks like we need to teach the migration to add the column to the correct column family. |
c36ef72
to
ae3184d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rafiss Should be fixed now, made a bad assumption that it would just pick the only family. Re-ran locally to confirm
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss, @shubhamdhama, and @stevendanna)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for picking this up!
Left some observations but no blocking changes.
mutableDesc.Version = version | ||
return txn.Descriptors().WriteDesc(ctx, false, mutableDesc, txn.KV()) | ||
_, err := txn.Exec(ctx, "add-draining-column", txn.KV(), | ||
`ALTER TABLE system.sql_instances ADD COLUMN IF NOT EXISTS is_draining BOOL NULL FAMILY "primary"`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: We have a helper function called migrateTable that does a lot of work that I sort of assume the schema changer already does for us. I suppose we only need that when we want to be idempotent based on some property other than the existence of a column.
TIL: I'm rather surprised we had to specify family here since you wouldn't need to for an ordinary table. But it looks like we have a special case in allocateColumnFamilyIDs for the system db.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Its a bit safer to use that, so I switched the code over to it. The advantage of that one is that it waits for pending schema changes too (in case something crashes).
// Disable license enforcement for this test. | ||
for _, s := range cluster.Servers { | ||
s.ExecutorConfig().(sql.ExecutorConfig).LicenseEnforcer.Disable(ctx) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: I'm guessing this might be needed outside ccl/multiregionccl
package but not here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the upgrades in 24.3 validates if a valid license is set, so this bypasses this for the upgrade testing. The other option was generating a license in the unit test, but that seems overkill.
Also, thanks for taking over this. |
Previously, the logic to add the is_draining column to the system.sql_instance descriptor would copy the bootstrap descriptor directly. While this works fine for non-multiregion system databases, this approach breaks for multi-region system databases. This is because bootstrap descriptors do not have multi-region modifications applied on top. To address this, this change modifies the upgrade to use ALTER TABLE ADD COLUMN. Fixes: cockroachdb#135736 Release note: None
ae3184d
to
0509190
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss, @shubhamdhama, and @stevendanna)
mutableDesc.Version = version | ||
return txn.Descriptors().WriteDesc(ctx, false, mutableDesc, txn.KV()) | ||
_, err := txn.Exec(ctx, "add-draining-column", txn.KV(), | ||
`ALTER TABLE system.sql_instances ADD COLUMN IF NOT EXISTS is_draining BOOL NULL FAMILY "primary"`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Its a bit safer to use that, so I switched the code over to it. The advantage of that one is that it waits for pending schema changes too (in case something crashes).
// Disable license enforcement for this test. | ||
for _, s := range cluster.Servers { | ||
s.ExecutorConfig().(sql.ExecutorConfig).LicenseEnforcer.Disable(ctx) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the upgrades in 24.3 validates if a valid license is set, so this bypasses this for the upgrade testing. The other option was generating a license in the unit test, but that seems overkill.
@rafiss @stevendanna @shubhamdhama TFTR! bors r+ |
Based on the specified backports for this PR, I applied new labels to the following linked issue(s). Please adjust the labels as needed to match the branches actually affected by the issue(s), including adding any known older branches. Issue #135736: branch-release-24.3.0-rc. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error creating merge commit from 0509190 to blathers/backport-release-24.3-135737: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 24.3.x failed. See errors above. error creating merge commit from 0509190 to blathers/backport-release-24.3.0-rc-135737: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 24.3.0-rc failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Previously, the logic to add the is_draining column to the system.sql_instance descriptor would copy the bootstrap descriptor directly. While this works fine for non-multiregion system databases, this approach breaks for multi-region system databases. This is because bootstrap descriptors do not have multi-region modifications applied on top. To address this, this change modifies the upgrade to use ALTER TABLE ADD COLUMN.
Fixes: #135736
Release note: None