Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-20.1: combined fixes for ownership related bugs #51629

Conversation

arulajmani
Copy link
Collaborator

Backport:

Please see individual PRs for details.

/cc @cockroachdb/release

Previously, when a sequence that was owned by a column was being
dropped, we would not remove the sequence ID from the column descriptor
of the column that owned it. As a result, there was a bug where if the
sequence was dropped manually before the table, it would be impossible
to drop the table.

This patch addresses this problem by removing the ownership dependency
on sequence drops.

Fixes cockroachdb#50649

Release note (bug fix): there was a bug previously where if a user
created a sequence owned by a table's column and dropped the sequence,
it would become impossible to drop the table after. This is now fixed.
See attached issue for repro steps.
Previously, there was a bug in the `ALTER SEQUENCE ... OWNED BY ...`
command that could cause sequence IDs to appear on more than one
column descriptor's 'OwnsSequenceIDs' field. This was problematic as
the dropping logic requires a sequence ID only be present on one column
descriptor's `OwnsSequenceIDs' field.

The field would get corrupted whenever ownership was altered between
columns of the same table. This was because we would read a table
descriptor before executing the removal code, but the removal would
modify the same table descriptor. This would make the table descriptor
read earlier stale, as the old owner's column descriptor still
referenced the sequence ID. This would mean that when a new owner was
added to a different column, the schema change that was written would
have the sequence ID on two different column descriptor's
`OwnsSequenceIDs` field.

This patch fixes that by forcing the `addSequenceOwner` to resolve the
most recent version of the table descriptor instead of being passed it.

Fixes cockroachdb#50711

Release note (bug fix): Previously, if there was a table
`t(a int, b int)`, and a sequence `seq` that was first owned by `t.a`
and then altered to be owned by `t.b`, it would make the table `t`
impossible to drop. This is now fixed.
Previously, `DROP DATABASE CASCADE` would not work if the database
contained a sequence owned by a table in the database. This would
happen because of two separate reasons:
- If the sequence was dropped before the table, the table would try to
"double drop" the sequence as it owned it; this would result in an
error.
- If the table was dropped before the sequence, the sequence would try
to remove the ownership dependency from the table descriptor, which had
already been dropped; this would also result in an error.

This PR addresses both these issues separately. Sequences are no longer
double dropped when dropping tables. Additionally, no attempt is made
to remove the ownership dependency from the table descriptor if the
table descriptor has already been dropped.

Fixes cockroachdb#50712

Release note (bug fix): `DROP DATABASE CASCADE` now works as expected
even when the database has a sequence with an owner in it.
Previously, we would restore sequence descriptors/table descriptors as
is, without remapping the ownership dependency to the new IDs. This
meant sequence ownership was broken post restore. This PR fixes that.

When sequences with owners or tables that have columns that own
sequences are restored:
- If both the owned sequence and the owning table are being restored,
the ownership is remapped.
- If just the sequence is restored, the user receives an error to use
the `skip_missing_sequence_owners` flag. If the flag is provided, the
sequence is restored without any owner.
- If just the table is restored, the user receives an error to use the
`skip_missing_sequence_owners` flag. If the flag is provided, the table
is restored with the table column that previously owned the sequence no
longer owning that sequence.

Fixes cockroachdb#50781

Release note (enterprise change): Restore now has a new option
`skip_missing_sequence_owners` that must be supplied when restoring
only the table/sequence that was previously
a sequence owner/owned by a table.
Additionally, this fixes a bug where ownership relationships would not
be remapped after a restore.
@arulajmani arulajmani requested a review from solongordon July 21, 2020 00:36
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@arulajmani arulajmani force-pushed the backport20.1-50665-50720-50744-50744-50949-51253 branch 5 times, most recently from 4ea2c96 to fa306ec Compare July 23, 2020 19:31
Previously, when dropping a sequence or a table that had an ownership
relationship, we would lookup corresponding table descriptors to unlink
the relationship. In the case of tables, the owned sequence needed to
be dropped as well, so we would lookup the sequence descriptor. If the
corresponding descriptor was not found/had already been dropped, it
would result in an error -- thereby making it impossible to drop the
object.

This wasn't an issue, because you don't expect descriptors to be
dropped/not found if an ownership relationship still exists. However,
this integrity constraint was violated by a couple of sequence
ownership bugs. See cockroachdb#51170 for more details.

It should be possible to drop tables/sequences that have descriptors
in such invalid state. This PR adds support for this by swallowing
specific errors that users may find themselves in due to invalid
descriptors. It also adds tests to simulate these invalid states users
may find themselves in.

closes cockroachdb#51170

Release note (bug fix): Previously users who found themselves with
descriptors in an invalid state due to the ownership issues linked in
that contained them. This is now fixed.
@arulajmani arulajmani force-pushed the backport20.1-50665-50720-50744-50744-50949-51253 branch from fa306ec to 996de21 Compare July 23, 2020 19:32
@@ -1170,6 +1170,7 @@ DROP DATABASE db_50712 CASCADE

statement ok
DROP TABLE t_50712
<<<<<<< HEAD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Ideally you'd remove this line in this commit rather than a later commit in the PR.

@@ -159,6 +160,13 @@ func (p *planner) canRemoveOwnedSequencesImpl(
for _, sequenceID := range col.OwnsSequenceIds {
seqLookup, err := p.LookupTableByID(ctx, sequenceID)
if err != nil {
// Special case error swallowing for #50711 and #50781, which can cause a
// column to own sequences that have been dropped/do not exist.
if err.Error() == errTableDropped.Error() ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Why did you stop using errors.Is here?

@@ -1084,3 +1084,135 @@ DROP SEQUENCE seq_50649

statement ok
DROP TABLE t_50649

subtest regression_50712
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of confusing how tests from a different commit ended up in this one, but I don't know if it's worth the trouble to fix.

Copy link
Collaborator Author

@arulajmani arulajmani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani and @solongordon)


pkg/sql/drop_sequence.go, line 165 at r5 (raw file):

Previously, solongordon (Solon) wrote…

Nit: Why did you stop using errors.Is here?

errDropTables are wrapped as inactiveTableErrors. On master, inactiveTableError has Unwrap defined on it, which errors.Is uses underneath the hood to perform its checks. We don't have this on the release branch, so errors.Iswas failing here. AddingUnwrapwasn't as straight forward because of the other places that usedinactiveTableErrors, so I opted to switch to .Erorr()` here to minimize the changes required.


pkg/sql/logictest/testdata/logic_test/sequences, line 1088 at r2 (raw file):

Previously, solongordon (Solon) wrote…

Kind of confusing how tests from a different commit ended up in this one, but I don't know if it's worth the trouble to fix.

Yeah, I'm kinda sad about some tests from different commits. IIRC after the backport command finished, there were still some changes left to stage. I realized it at a point where it became really hard to track down individual changes and attribute them to their commits. I'm honestly not sure why this happened :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants