Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: support adding virtual columns #59401

Merged
merged 1 commit into from
Jan 28, 2021
Merged

Conversation

RaduBerinde
Copy link
Member

Support adding and removing virtual columns (when no indexes on those
columns are in play).

Note that because these columns don't need backfill in existing
indexes, they don't need special WriteOnly / DeleteOnly logic (other
than having the right visibility).

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that because these columns don't need backfill in existing
indexes, they don't need special WriteOnly / DeleteOnly logic (other
than having the right visibility).

This note is interesting in that nevertheless we do move these virtual columns through those states. As we've discussed elsewhere, this will be handy for cases where virtual computed columns are added which refer to other newly added physical columns. Have you confirmed that the column backfiller doesn't go and write these things to the primary index? I'm mostly thinking that you'll want to teach the schema changer about these columns here:

case *descpb.DescriptorMutation_Column:
if tabledesc.ColumnNeedsBackfill(m.GetColumn()) {
needColumnBackfill = true
}

It'd be good to write some tests involving adding these things inside of explicit transactions.

Reviewed 3 of 4 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @lucy-zhang)

@RaduBerinde
Copy link
Member Author

Thanks! Any idea how I can reliably test that it doesn't get backfilled? Is it possible to trace the schema change?

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to trace the schema change?

Not really, at least not in the way you might hope to. We could change that but it'd be in the guts of the jobs infra. I think the best bet is likely to use the schema changer testing knobs and just assert that some of the hooks don't fire. It's far from elegant, but hey, one day we'll have a better answer for you.

// SchemaChangerTestingKnobs for testing the schema change execution path
// through both the synchronous and asynchronous paths.
type SchemaChangerTestingKnobs struct {
// SchemaChangeJobNoOp returning true will cause the job to be a no-op.
SchemaChangeJobNoOp func() bool
// RunBeforePublishWriteAndDelete is called just before publishing the
// write+delete state for the schema change.
RunBeforePublishWriteAndDelete func()
// RunBeforeBackfill is called just before starting the backfill.
RunBeforeBackfill func() error
// RunAfterBackfill is called after completing a backfill.
RunAfterBackfill func(jobID int64) error
// RunBeforeQueryBackfill is called before a query based backfill.
RunBeforeQueryBackfill func() error
// RunBeforeIndexBackfill is called just before starting the index backfill, after
// fixing the index backfill scan timestamp.
RunBeforeIndexBackfill func()
// RunBeforeMaterializedViewRefreshCommit is called before committing a
// materialized view refresh.
RunBeforeMaterializedViewRefreshCommit func() error
// RunBeforePrimaryKeySwap is called just before the primary key swap is committed.
RunBeforePrimaryKeySwap func()
// RunBeforeComputedColumnSwap is called just before the computed column swap is committed.
RunBeforeComputedColumnSwap func()
// RunBeforeChildJobs is called just before child jobs are run to clean up
// dropped schema elements after a mutation.
RunBeforeChildJobs func()
// RunBeforeIndexValidation is called just before starting the index validation,
// after setting the job status to validating.
RunBeforeIndexValidation func() error
// RunBeforeConstraintValidation is called just before starting the checks validation,
// after setting the job status to validating.
RunBeforeConstraintValidation func() error
// RunBeforeMutationReversal runs at the beginning of maybeReverseMutations.
RunBeforeMutationReversal func(jobID int64) error
// RunAfterMutationReversal runs in OnFailOrCancel after the mutations have
// been reversed.
RunAfterMutationReversal func(jobID int64) error
// RunAtStartOfOnFailOrCancel runs at the start of the OnFailOrCancel hook.
RunBeforeOnFailOrCancel func(jobID int64) error
// RunAfterOnFailOrCancel runs after the OnFailOrCancel hook.
RunAfterOnFailOrCancel func(jobID int64) error
// RunBeforeResume runs at the start of the Resume hook.
RunBeforeResume func(jobID int64) error
// OldNamesDrainedNotification is called during a schema change,
// after all leases on the version of the descriptor with the old
// names are gone, and just before the mapping of the old names to the
// descriptor id are about to be deleted.
OldNamesDrainedNotification func()
// WriteCheckpointInterval is the interval after which a checkpoint is
// written.
WriteCheckpointInterval time.Duration
// BackfillChunkSize is to be used for all backfill chunked operations.
BackfillChunkSize int64
// AlwaysUpdateIndexBackfillDetails indicates whether the index backfill
// schema change job details should be updated everytime the coordinator
// receives an update from the backfill processor.
AlwaysUpdateIndexBackfillDetails bool
// AlwaysUpdateIndexBackfillProgress indicates whether the index backfill
// schema change job fraction completed should be updated everytime the
// coordinator receives an update from the backfill processor.
AlwaysUpdateIndexBackfillProgress bool
// TwoVersionLeaseViolation is called whenever a schema change transaction is
// unable to commit because it is violating the two version lease invariant.
TwoVersionLeaseViolation func()
}

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @lucy-zhang)

@RaduBerinde
Copy link
Member Author

It'd be good to write some tests involving adding these things inside of explicit transactions.

What behavior am I testing for? If we add a column, it's not visible inside the explicit transaction AFAICT. Am I just testing that if the transaction commits, then the changes are visible?

@RaduBerinde
Copy link
Member Author

Refactored ColumnNeedsBackfill. Still working on more tests.

@ajwerner
Copy link
Contributor

It'd be good to write some tests involving adding these things inside of explicit transactions.

What behavior am I testing for? If we add a column, it's not visible inside the explicit transaction AFAICT. Am I just testing that if the transaction commits, then the changes are visible?

The column visibility rules during DDLs in cockroach are totally wild. Namely, mutation columns are visible for DDLs. For example:

 CREATE TABLE foo (i INT PRIMARY KEY);
 BEGIN;
      ALTER TABLE foo ADD COLUMN j INT;
      CREATE INDEX ON foo (j, i);
      ALTER TABLE foo ADD COLUMN k INT AS (j*j + i) STORED;
      CREATE INDEX ON foo(k);
 COMMIT;

@RaduBerinde
Copy link
Member Author

Interesting. I can't test that right now since we don't yet support indexes on these columns (nor can virtual columns depend on other virtual columns), but we'll add a test with indexes when we can.

@ajwerner
Copy link
Contributor

Interesting. I can't test that right now since we don't yet support indexes on these columns (nor can virtual columns depend on other virtual columns), but we'll add a test with indexes when we can.

Do we support check constraints?

@RaduBerinde
Copy link
Member Author

Do we support check constraints?

No, it's an open task to either disallow or support (tracked in #57608). We'll probably disallow in 21.1 because we're running short on time.

@RaduBerinde
Copy link
Member Author

RFAL. Added a test that verifies no backfill happens.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: just the comment nit that might later cause confusion

Reviewed 1 of 2 files at r2, 4 of 4 files at r3, 1 of 1 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)


pkg/sql/catalog/tabledesc/structured.go, line 3788 at r3 (raw file):

		// Virtual columns can only be used as secondary index keys; as such they do
		// not need backfill in other indexes.
		// TODO(radu): revisit this if/when we allow STORING virtual columns.

Even if we allowed storing of virtual columns, this method is really about whether it needs a column backfill on the primary index and wouldn't be governed here.

Copy link
Member Author

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)


pkg/sql/catalog/tabledesc/structured.go, line 3788 at r3 (raw file):

Previously, ajwerner wrote…

Even if we allowed storing of virtual columns, this method is really about whether it needs a column backfill on the primary index and wouldn't be governed here.

I was thinking that the primary index could also STORE a virtual column. But then again that would also have to be a schema change and maybe they can't happen together (?)

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)


pkg/sql/catalog/tabledesc/structured.go, line 3788 at r3 (raw file):

Previously, RaduBerinde wrote…

I was thinking that the primary index could also STORE a virtual column. But then again that would also have to be a schema change and maybe they can't happen together (?)

How would that be different from STORED computed columns?

Support adding and removing virtual columns (when no indexes on those
columns are in play).

Note that because these columns don't need backfill in existing
indexes, they don't need special WriteOnly / DeleteOnly logic (other
than having the right visibility).

Release note: None
Copy link
Member Author

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)


pkg/sql/catalog/tabledesc/structured.go, line 3788 at r3 (raw file):

Previously, ajwerner wrote…

How would that be different from STORED computed columns?

Hmm good question, you're right. Updated comment.

@RaduBerinde
Copy link
Member Author

bors r+

@craig
Copy link
Contributor

craig bot commented Jan 28, 2021

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Jan 28, 2021

Build succeeded:

@craig craig bot merged commit 6510d87 into cockroachdb:master Jan 28, 2021
@RaduBerinde RaduBerinde deleted the add-virt-col branch February 3, 2021 02:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants