-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: support adding virtual columns #59401
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that because these columns don't need backfill in existing
indexes, they don't need special WriteOnly / DeleteOnly logic (other
than having the right visibility).
This note is interesting in that nevertheless we do move these virtual columns through those states. As we've discussed elsewhere, this will be handy for cases where virtual computed columns are added which refer to other newly added physical columns. Have you confirmed that the column backfiller doesn't go and write these things to the primary index? I'm mostly thinking that you'll want to teach the schema changer about these columns here:
Lines 217 to 220 in 97d14bf
case *descpb.DescriptorMutation_Column: | |
if tabledesc.ColumnNeedsBackfill(m.GetColumn()) { | |
needColumnBackfill = true | |
} |
It'd be good to write some tests involving adding these things inside of explicit transactions.
Reviewed 3 of 4 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @lucy-zhang)
Thanks! Any idea how I can reliably test that it doesn't get backfilled? Is it possible to trace the schema change? |
c398e2d
to
2bd686d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to trace the schema change?
Not really, at least not in the way you might hope to. We could change that but it'd be in the guts of the jobs infra. I think the best bet is likely to use the schema changer testing knobs and just assert that some of the hooks don't fire. It's far from elegant, but hey, one day we'll have a better answer for you.
cockroach/pkg/sql/schema_changer.go
Lines 1872 to 1959 in f5f71df
// SchemaChangerTestingKnobs for testing the schema change execution path | |
// through both the synchronous and asynchronous paths. | |
type SchemaChangerTestingKnobs struct { | |
// SchemaChangeJobNoOp returning true will cause the job to be a no-op. | |
SchemaChangeJobNoOp func() bool | |
// RunBeforePublishWriteAndDelete is called just before publishing the | |
// write+delete state for the schema change. | |
RunBeforePublishWriteAndDelete func() | |
// RunBeforeBackfill is called just before starting the backfill. | |
RunBeforeBackfill func() error | |
// RunAfterBackfill is called after completing a backfill. | |
RunAfterBackfill func(jobID int64) error | |
// RunBeforeQueryBackfill is called before a query based backfill. | |
RunBeforeQueryBackfill func() error | |
// RunBeforeIndexBackfill is called just before starting the index backfill, after | |
// fixing the index backfill scan timestamp. | |
RunBeforeIndexBackfill func() | |
// RunBeforeMaterializedViewRefreshCommit is called before committing a | |
// materialized view refresh. | |
RunBeforeMaterializedViewRefreshCommit func() error | |
// RunBeforePrimaryKeySwap is called just before the primary key swap is committed. | |
RunBeforePrimaryKeySwap func() | |
// RunBeforeComputedColumnSwap is called just before the computed column swap is committed. | |
RunBeforeComputedColumnSwap func() | |
// RunBeforeChildJobs is called just before child jobs are run to clean up | |
// dropped schema elements after a mutation. | |
RunBeforeChildJobs func() | |
// RunBeforeIndexValidation is called just before starting the index validation, | |
// after setting the job status to validating. | |
RunBeforeIndexValidation func() error | |
// RunBeforeConstraintValidation is called just before starting the checks validation, | |
// after setting the job status to validating. | |
RunBeforeConstraintValidation func() error | |
// RunBeforeMutationReversal runs at the beginning of maybeReverseMutations. | |
RunBeforeMutationReversal func(jobID int64) error | |
// RunAfterMutationReversal runs in OnFailOrCancel after the mutations have | |
// been reversed. | |
RunAfterMutationReversal func(jobID int64) error | |
// RunAtStartOfOnFailOrCancel runs at the start of the OnFailOrCancel hook. | |
RunBeforeOnFailOrCancel func(jobID int64) error | |
// RunAfterOnFailOrCancel runs after the OnFailOrCancel hook. | |
RunAfterOnFailOrCancel func(jobID int64) error | |
// RunBeforeResume runs at the start of the Resume hook. | |
RunBeforeResume func(jobID int64) error | |
// OldNamesDrainedNotification is called during a schema change, | |
// after all leases on the version of the descriptor with the old | |
// names are gone, and just before the mapping of the old names to the | |
// descriptor id are about to be deleted. | |
OldNamesDrainedNotification func() | |
// WriteCheckpointInterval is the interval after which a checkpoint is | |
// written. | |
WriteCheckpointInterval time.Duration | |
// BackfillChunkSize is to be used for all backfill chunked operations. | |
BackfillChunkSize int64 | |
// AlwaysUpdateIndexBackfillDetails indicates whether the index backfill | |
// schema change job details should be updated everytime the coordinator | |
// receives an update from the backfill processor. | |
AlwaysUpdateIndexBackfillDetails bool | |
// AlwaysUpdateIndexBackfillProgress indicates whether the index backfill | |
// schema change job fraction completed should be updated everytime the | |
// coordinator receives an update from the backfill processor. | |
AlwaysUpdateIndexBackfillProgress bool | |
// TwoVersionLeaseViolation is called whenever a schema change transaction is | |
// unable to commit because it is violating the two version lease invariant. | |
TwoVersionLeaseViolation func() | |
} |
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @lucy-zhang)
What behavior am I testing for? If we add a column, it's not visible inside the explicit transaction AFAICT. Am I just testing that if the transaction commits, then the changes are visible? |
2bd686d
to
402dc8d
Compare
Refactored |
The column visibility rules during DDLs in cockroach are totally wild. Namely, mutation columns are visible for DDLs. For example:
|
Interesting. I can't test that right now since we don't yet support indexes on these columns (nor can virtual columns depend on other virtual columns), but we'll add a test with indexes when we can. |
Do we support check constraints? |
No, it's an open task to either disallow or support (tracked in #57608). We'll probably disallow in 21.1 because we're running short on time. |
402dc8d
to
c332cc0
Compare
RFAL. Added a test that verifies no backfill happens. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just the comment nit that might later cause confusion
Reviewed 1 of 2 files at r2, 4 of 4 files at r3, 1 of 1 files at r4.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)
pkg/sql/catalog/tabledesc/structured.go, line 3788 at r3 (raw file):
// Virtual columns can only be used as secondary index keys; as such they do // not need backfill in other indexes. // TODO(radu): revisit this if/when we allow STORING virtual columns.
Even if we allowed storing of virtual columns, this method is really about whether it needs a column backfill on the primary index and wouldn't be governed here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)
pkg/sql/catalog/tabledesc/structured.go, line 3788 at r3 (raw file):
Previously, ajwerner wrote…
Even if we allowed storing of virtual columns, this method is really about whether it needs a column backfill on the primary index and wouldn't be governed here.
I was thinking that the primary index could also STORE a virtual column. But then again that would also have to be a schema change and maybe they can't happen together (?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)
pkg/sql/catalog/tabledesc/structured.go, line 3788 at r3 (raw file):
Previously, RaduBerinde wrote…
I was thinking that the primary index could also STORE a virtual column. But then again that would also have to be a schema change and maybe they can't happen together (?)
How would that be different from STORED
computed columns?
Support adding and removing virtual columns (when no indexes on those columns are in play). Note that because these columns don't need backfill in existing indexes, they don't need special WriteOnly / DeleteOnly logic (other than having the right visibility). Release note: None
c332cc0
to
c2f6361
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)
pkg/sql/catalog/tabledesc/structured.go, line 3788 at r3 (raw file):
Previously, ajwerner wrote…
How would that be different from
STORED
computed columns?
Hmm good question, you're right. Updated comment.
bors r+ |
Build failed (retrying...): |
Build succeeded: |
Support adding and removing virtual columns (when no indexes on those
columns are in play).
Note that because these columns don't need backfill in existing
indexes, they don't need special WriteOnly / DeleteOnly logic (other
than having the right visibility).
Release note: None