schemachanger,*: introduce new schema changer #58979

ajwerner · 2021-01-13T22:40:55Z

This PR introduces the new schema changer, for which a partially written RFC (including a "guide-level explanation") can be found in #59288. It contains a somewhat complete implementation for a subset of ALTER TABLE ... ADD COLUMN. Like the existing implementation, these schema changes are planned and partially executed in the user transaction, but most of the work happens in an asynchronous job that runs after the transaction has committed. We use the index backfiller to build new columns, as described in #47989.

See the RFC and individual commits for details.

Missing things that will be in later PRs:

Making execution changes to support concurrent writes for columns added using the index backfiller. We currently don't have a way to write column values which are not part of the primary index. See sql: enable adding columns which are not written to the current primary index #59149. (sql: deal with computed and default column generation in index backfi… #58990 takes care of the backfill part).
Ensuring compatible coexistence with the old schema changer; in particular, polling for queued mutations to finish before any schema changes can start, using child transactions within the user transaction (kv,storage: add support for child transactions #56588). The new schema changer assumes it has exclusive rights to perform schema changes on the table, which anticipates the transactional schema changes of the future.
Reverting schema changes when failed or canceled.
Properly checkpointing and reading job progress for the backfill. Currently there's a placeholder implementation that doesn't interact with the job at all.

cockroach-teamcity · 2021-01-13T22:41:03Z

This change is

ajwerner · 2021-01-28T05:14:03Z

Update: I pushed a revision which reworks the way that the schema changer connects to the connExecutor. It also reworks the setting to enable the schema changer into something more like we'd ship.

postamar

I looked at this changeset in detail, that being said my review is necessarily superficial and focusing mostly on the implementation outlined in the RFC rather than the design because I'm not very knowledgeable about the existing logic replaced by this new schema changer. I suspect that this will be the case (to varying degrees) for anyone but @lucy-zhang and @ajwerner and that's fine; anyway the RFC itself has already been the subject of plenty of rich discussion.

I found the new code easy to follow and nicely modularized. As a consequence it allows for some powerful tests like for instance the logic tests for scbuild, which I really like. I feel this is worth merging as it is and I hope this actually gets used soon in some meaningful way to uncover bugs and so that it doesn't suffer from bitrot.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jordanlewis, and @lucy-zhang)

pkg/sql/schema_change_plan_node.go, line 46 at r17 (raw file):

// schemaChangePlanNode is the planNode utilized by the new schema changer to
//

nit: missing comment lines.

pkg/sql/schemachanger/scpb/scpb.proto, line 62 at r17 (raw file):

  uint32 table_id = 1 [(gogoproto.customname) = "TableID", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb.ID"];
  cockroach.sql.sqlbase.IndexDescriptor index = 2 [(gogoproto.nullable) = false];
  uint32 other_primary_index = 3 [(gogoproto.customname) = "OtherPrimaryIndex", (gogoproto.casttype) = "github.com/cockroachdb/cockroach/pkg/sql/catalog/descpb.IndexID"];

nit: shouldn't this read other_primary_index_id out of consistency? The same question applies to replacement_for below.

pkg/sql/schemachanger/scplan/rules.go, line 60 at r17 (raw file):

}

var rules = map[scpb.Element]targetRules{

I'm worried that laying these rules out as a map like this is going to age poorly. The literal is already one huge block with many nesting levels and with time it's only going to grow. I'm finding that it's already difficult to read and a lot of this stuff is highly sensitive, so the potential for dumb mistakes due to confusion seems really high to me. Consider what the diff of a future patch is going to look like from the point of view of the reviewer in github or reviewable: it's going to be really hard to tell whether the lines being changed are for PrimaryIndex or for SecondaryIndex, for example.

As a result I feel it's worth breaking this down into one block per key at the top level, at least. This would mitigate the potential for confusion a great deal because review tools highlight the line where a block begins (in a function definition this usually is the function name & signature). Going further, at the top level the keys of this rules map are used to dispatch on Element subtypes using reflection; perhaps this could instead be implemented with interfaces somehow?

In forward and backward I like that you list the states from ABSENT to VALIDATED in forward and backwards order respectively. Perhaps this could also be enforced in declarative.go? I wouldn't be shy about being overly generous with such "compile-time" checks. Speaking of which I'm surprised that you rely on panics there instead of having a rules_test.go of sorts that tests the validity of the rules map.

Anyway tl;dr readibility here is worth investing into some more IMHO.

pkg/sql/sessiondata/session_data.go, line 409 at r17 (raw file):

// NewSchemaChangerMode controls if and when the new schema changer is in use.
// Since 2.1, we run everything through the DistSQL infrastructure,

nit: did you mean 20.1 instead of 2.1?

Release note: None

`scpb` defines the building blocks of the schema changer's plan-related state: `Element`, `Target`, `State`, `Node`. We'll need protobuf serialization for the job record. Release note: None

`scop` defines `Op`s, which are specifications of low-level operations in schema change execution. Ops (grouped into stages) are the output of schema change planning. This package also generates interface definitions for operation visitors. This facilitates handling every `Op` implementation without missing cases. Release note: None

`scgraph` defines the `Graph`, which is constructed in the process of planning schema changes and encodes all dependencies needed to eventually construct ops and stages. Release note: None

…ments `scbuild` contains the `Builder`, which takes AST nodes of DDL statements and converts them to specifications of schema elements to be added, modified, or removed. The set of nodes built up from one or more invocations of the builder becomes the input to subsequent planning steps: building a graph, and generating ops and stages. This package reimplements much of the logic in sql/alter_table.go and its friends, but adapted for generating declarative specifications of schema elements instead of directly mutating table descriptors. This commit also introduces data-driven tests for builder output. Release note: None

`scplan` generates schema change plans from a set of initial states for targets. It does this by constructing a graph, computing all edges in the graph (either op edges or dep edges) representing ordering constraints, and then generating a sequence of stages that respects the edges' constraints. This package contains rules for generating both op edges and dep edges, specified declaratively in terms of predicates on pairs of nodes. Release note: None

`scexec` contains a framework, the `Executor, `for executing schema change operations which are specified in terms of low-level `Op`s. `scmutationexec` contains implementations of `Op`s which mutate (specific elements on) descriptors. These correspond to moving indexes, columns, etc., through the familiar non-public states. Release note: None

This commit introduces `sql.IndexBackfillPlanner`, which is a wrapper that holds dependencies for the index backfiller so that `scexec` can call it via an interface. Release note: None

This commit introduces a `NewSchemaChange` job to run the async portion of schema changes. The job reads the serialized schema change targets where the in-transaction phases left off, plans the post-commit phase of the schema change, and executes the ops. There is an incomplete implementation of backfill progress tracking in the `badJobTracker`. Release note: None

This commit adds support for execution for the statement and pre-commit phases of the schema changer to the `connExecutor`. Also adds a session/cluster setting `sql.defaults.experimental_new_schema_changer.mode`, which is `'off'` default. The setting takes on the following three values: * `'off'` - never use the new schema changer * `'on'` - use the new schema changer in supported, implicit transaction * `'unsafe_always'` - always try to use the new schema changer, return errors where not supported This code also hooks up the planning logic for the above settings. As of this commit, only `ALTER TABLE ... ADD COLUMN` is supported, and even then, only for a subset of its behaviors. Release note: None

The context in the evalContext after statement execution is somehow canceled which is highly problematic for the post statement phase. Instead, pass through the context from the current frame. Release note: None

Release note: None

This commit implements the `EXPLAIN (DDL)` and `EXPLAIN (DDL, DEPS)` statements, both of which generate the graph of a provided DDL statement and produce a URL with an encoded graphviz graph, similar to `EXPLAIN (DISTSQL)`. Just `(DDL)` produces the stages; `(DDL, DEPS)` provides the dep edges. Release note: None

thoszhang

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jordanlewis, and @postamar)

pkg/sql/schema_change_plan_node.go, line 46 at r17 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: missing comment lines.

Done.

pkg/sql/schemachanger/scpb/scpb.proto, line 62 at r17 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: shouldn't this read other_primary_index_id out of consistency? The same question applies to replacement_for below.

Yes. Also, replacement_for shouldn't exist. Fixed.

pkg/sql/schemachanger/scplan/rules.go, line 60 at r17 (raw file):

Previously, postamar (Marius Posta) wrote…

I'm worried that laying these rules out as a map like this is going to age poorly. The literal is already one huge block with many nesting levels and with time it's only going to grow. I'm finding that it's already difficult to read and a lot of this stuff is highly sensitive, so the potential for dumb mistakes due to confusion seems really high to me. Consider what the diff of a future patch is going to look like from the point of view of the reviewer in github or reviewable: it's going to be really hard to tell whether the lines being changed are for PrimaryIndex or for SecondaryIndex, for example.

As a result I feel it's worth breaking this down into one block per key at the top level, at least. This would mitigate the potential for confusion a great deal because review tools highlight the line where a block begins (in a function definition this usually is the function name & signature). Going further, at the top level the keys of this rules map are used to dispatch on Element subtypes using reflection; perhaps this could instead be implemented with interfaces somehow?

In forward and backward I like that you list the states from ABSENT to VALIDATED in forward and backwards order respectively. Perhaps this could also be enforced in declarative.go? I wouldn't be shy about being overly generous with such "compile-time" checks. Speaking of which I'm surprised that you rely on panics there instead of having a rules_test.go of sorts that tests the validity of the rules map.

Anyway tl;dr readibility here is worth investing into some more IMHO.

I agree that this could use some improvements in readability. The route @ajwerner and I were thinking of going down was to have some textual declarative specification of these rules and generate code based on that, maybe resembling the optgen stuff (or at least sharing some infrastructure). But that doesn't address your point about review tools knowing how to parse code and in fact makes it worse.

I'm going to leave this commit unchanged because I suspect the route forward will become clearer once we use this part of the code more.

pkg/sql/sessiondata/session_data.go, line 409 at r17 (raw file):

Previously, postamar (Marius Posta) wrote…

nit: did you mean 20.1 instead of 2.1?

This is copy-paste cruft, removed.

thoszhang · 2021-02-02T23:18:01Z

CI is green so I'm going to bors this now. For anyone watching, @ajwerner and I have been working on this together for a while and reviewing each other's code, and we feel confident enough to merge this with the setting off by default.

bors r+

craig · 2021-02-02T23:50:07Z

Build succeeded:

GitHub CI (Cockroach)

ajwerner requested a review from thoszhang January 13, 2021 22:41

thoszhang force-pushed the ajwerner/schema-changer-prototype-squash branch 2 times, most recently from df7c786 to d4454f2 Compare January 14, 2021 04:17

ajwerner force-pushed the ajwerner/schema-changer-prototype-squash branch 3 times, most recently from 0ff926e to 4a6a5a4 Compare January 16, 2021 00:13

thoszhang force-pushed the ajwerner/schema-changer-prototype-squash branch 15 times, most recently from 94d886b to 8103001 Compare January 25, 2021 23:52

thoszhang changed the title ~~[DNM][WIP] *: new schema changer rebase~~ schemachanger,*: introduce new schema changer Jan 25, 2021

thoszhang marked this pull request as ready for review January 26, 2021 00:09

thoszhang requested review from a team as code owners January 26, 2021 00:09

thoszhang requested review from postamar, jordanlewis and a team January 26, 2021 00:10

thoszhang force-pushed the ajwerner/schema-changer-prototype-squash branch from 8103001 to 5760371 Compare January 26, 2021 00:22

thoszhang force-pushed the ajwerner/schema-changer-prototype-squash branch 3 times, most recently from 89d9b57 to ae51dc1 Compare January 27, 2021 18:01

ajwerner force-pushed the ajwerner/schema-changer-prototype-squash branch from ae51dc1 to 799ba67 Compare January 28, 2021 05:12

thoszhang force-pushed the ajwerner/schema-changer-prototype-squash branch from e68a189 to f218222 Compare January 29, 2021 22:04

postamar suggested changes Feb 2, 2021

View reviewed changes

postamar mentioned this pull request Feb 2, 2021

rfc: declarative, state-based schema changer #59288

Merged

sql/catalog: add a DescriptorIDSet in the image of ColumnIDSet

de5f8e7

Release note: None

thoszhang force-pushed the ajwerner/schema-changer-prototype-squash branch from 6ace9f9 to 2a8a161 Compare February 2, 2021 22:04

ajwerner added 14 commits February 2, 2021 17:20

schemachanger/scpb: introduce protobufs for schemachanger state

a1decc6

`scpb` defines the building blocks of the schema changer's plan-related state: `Element`, `Target`, `State`, `Node`. We'll need protobuf serialization for the job record. Release note: None

schemachanger/scgraph: add library for schema change plan graphs

697d609

`scgraph` defines the `Graph`, which is constructed in the process of planning schema changes and encodes all dependencies needed to eventually construct ops and stages. Release note: None

sql: add an adapter to the index backfiller for use in scexec

cd76842

This commit introduces `sql.IndexBackfillPlanner`, which is a wrapper that holds dependencies for the index backfiller so that `scexec` can call it via an interface. Release note: None

sql: explicitly pass a context to QueueJob

249b43e

The context in the evalContext after statement execution is somehow canceled which is highly problematic for the post statement phase. Instead, pass through the context from the current frame. Release note: None

sql/rowexec: move code used only for testing to test file

f157c66

Release note: None

sql/logictest: add new_schema_changer test file

7542652

Release note: None

schemachanger/scgraphviz: library to generate graphviz graphs

b09e19a

Release note: None

thoszhang force-pushed the ajwerner/schema-changer-prototype-squash branch from 2a8a161 to a719fda Compare February 2, 2021 22:23

thoszhang reviewed Feb 2, 2021

View reviewed changes

craig bot merged commit 1696ce3 into cockroachdb:master Feb 2, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

schemachanger,*: introduce new schema changer #58979

schemachanger,*: introduce new schema changer #58979

ajwerner commented Jan 13, 2021 •

edited by thoszhang

Loading

cockroach-teamcity commented Jan 13, 2021

ajwerner commented Jan 28, 2021

postamar left a comment

thoszhang left a comment

thoszhang commented Feb 2, 2021

craig bot commented Feb 2, 2021

schemachanger,*: introduce new schema changer #58979

schemachanger,*: introduce new schema changer #58979

Conversation

ajwerner commented Jan 13, 2021 • edited by thoszhang Loading

cockroach-teamcity commented Jan 13, 2021

ajwerner commented Jan 28, 2021

postamar left a comment

Choose a reason for hiding this comment

thoszhang left a comment

Choose a reason for hiding this comment

thoszhang commented Feb 2, 2021

craig bot commented Feb 2, 2021

ajwerner commented Jan 13, 2021 •

edited by thoszhang

Loading