-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rfc: declarative, state-based schema changer #59288
Conversation
458d61f
to
393d3fa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. I think it's worth soliciting some more eyes even without the reference level description, even even it ends up just being an FYI / skim for them.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @lucy-zhang)
docs/RFCS/20210115_new_schema_changer.md, line 55 at r1 (raw file):
- Phase: Context in which the execution of an op occurs. For today's schema changes, the possible phases are: - statement execution - pre-commit (of the user transaction)
we use "user transaction" casually in conversation but it might not be obvious to a reader. suggest adding a bullet defining "user transaction" in this terminology section.
docs/RFCS/20210115_new_schema_changer.md, line 135 at r1 (raw file):
Link to Figure 2: [Output of `EXPLAIN (DDL)`](https://cockroachdb.github.io/scplan/viz.html#H4sIAAAAAAAA/+xYUW/iOBB+hl9R5fHUbokDFPYgUkqoFB1Hqy3b1ak6IZO4xWrq5GJn93qn/e8nh4SY4oDTBB2V+sjMZPx58s3MFzz8GMFweXLyb7NB48Xqh+vHlKFoTnVultj7ib3hwwXyhxpl8BGdtLRfuY3orfvU3vpsjyfj2Xh+PZ38of2ZuvXMrUvdIHODzzdfLyfOKPU0G42fUiy6IQGjZ2DaW2CsqT3/9sV5fW5nC1ZBYLc0wAsJQJAB7KkC7OcAL63Rb1fOZDK2My9olUUFdAkqI0UFgCIqYOSo7qyJY1szAVS7NKiOBFQ7A9XNQW2kI+AiR/HK08sh7LpIMSIZ1TspIqNVhMjQixAZYAuRCg5DxvJuhqNdiKNTiEPgsXV5O57O9pZiAwGD0SNiNENw/xAQRvE/aKj1tdNVzGDA4MJHJ64PKR1qQcxQpJkDFpkD5mXWJYIeN/9C3XDxaRT48TMZnDPPHJwngUmwuZkp+bHOZLrCQ/tisbeOA9vHEPiM1v6FxB/7Ps+3jmFRjLbD2EuI1OA8wGfsv6xjHcKuBIuYMxCRt7b9P7DHluuIbluIOE8ONXdaNvBMxTKEEX6GkQRRksPJUXXArhOamalpntIlDNFQIwFBGR3bNRHoZoXWIR76uySNcP6MGuNsHCGX4YDQgqe+Qz9/xrodqbyBVWrHU8qpq2fk71QpJ1TJWaKNCPoxTzk0f0ISHsUE/xXva6nvKKI4INJTC1EGbImibUa8KlwaTFkQoVGZ6ithENIqv4KFSuJa26/z0X7vtP0kTCZK4/v/aDvJrKiTxql0ETXSSqAQQTqKYocIwnHD3pMr1ybP32wQ48zMU1q2rZ1S9uKjoeZBukReEkzaZ2aevyioc2bmh9lfrm+kUd0zk39XVe1R6gbhp9/hE7I8D3mroWQj6kY4ZEFkIx8xdE1k6mNnO45K6C/H3kWG6R79NVXTX7N3q7+uSuivWdI5dvnOSSh1wSml10yppNvfzCNHfQOMDrcBUglg17cBSm5/pQ0wJm7gYfK4QXXJLijRb3uF2q2gkNTKo7Qubg+lkL6qbLi7t2y4aq3X460H7rM5HzCWzXm9xV3Vv4VUBr1FvG8RrjDwd1OrWol0nReiuiotHE8Vbp9kOejlAb98V0oQLir0ntTFpYTer6dkyR0vofv0gH3/6OrD5RBoSYvAewvIe6vPXdX/qErqsxotyU3voI89yNYzpM46iSJaiJTM+WoVBXzugLasbICTEXTr68TX0+gmXvjYPboJBLjGBxc1TyDxhb7p4h8aSV0j1al9jlVLAL4QQK8+mtpREIabRK1nY36Q9lDCXuJV+4I8WlLzBW9IFzzg362GLnXx3W+A2luhho/aQ+5ug+9uQ7q7Da6ijY7Uxde6UeNaT+5pLSgi7Mgq1PzZ/C8AAP//g35tdeYgAAA=) Op edges are again represented by arrows with solid black lines. Each set of op edges in a stage correspond exactly to the set of ops to be executed in that stage. Targets which do not undergo any change within a stage, because they are "waiting" for dependencies to be satisfied in a later stage, have arrows with dotted lines within that stage. Note that the generated stages are compatible with each dependency edge indicated in the output of `EXPLAIN (DDL, DEPS)` (Figure 1).
I'm a little confused even after reading this - is explain(ddl) a subset of explain(ddl,deps)? or a different thing entirely?
I ended up reading this RFC draft after having read the code in #58979 and what is explained here corresponds neatly and clearly to my understanding of the code 👍 |
393d3fa
to
2b03f0f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed some more detailed notes I have.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis)
docs/RFCS/20210115_new_schema_changer.md, line 55 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
we use "user transaction" casually in conversation but it might not be obvious to a reader. suggest adding a bullet defining "user transaction" in this terminology section.
Done.
docs/RFCS/20210115_new_schema_changer.md, line 135 at r1 (raw file):
Previously, jordanlewis (Jordan Lewis) wrote…
I'm a little confused even after reading this - is explain(ddl) a subset of explain(ddl,deps)? or a different thing entirely?
I added a few words. Does that help? Neither is a subset of the other.
I just tagged a bunch of SQL and SQL-adjacent people. This RFC is not quite finished but we thought it'd be good to have some eyes on it. There is an accompanying PR: #58979 |
58979: schemachanger,*: introduce new schema changer r=lucy-zhang a=ajwerner This PR introduces the new schema changer, for which a partially written RFC (including a "guide-level explanation") can be found in #59288. It contains a somewhat complete implementation for a subset of `ALTER TABLE ... ADD COLUMN`. Like the existing implementation, these schema changes are planned and partially executed in the user transaction, but most of the work happens in an asynchronous job that runs after the transaction has committed. We use the index backfiller to build new columns, as described in #47989. See the RFC and individual commits for details. Missing things that will be in later PRs: - Making execution changes to support concurrent writes for columns added using the index backfiller. We currently don't have a way to write column values which are not part of the primary index. See #59149. (#58990 takes care of the backfill part). - Ensuring compatible coexistence with the old schema changer; in particular, polling for queued mutations to finish before any schema changes can start, using child transactions within the user transaction (#56588). The new schema changer assumes it has exclusive rights to perform schema changes on the table, which anticipates the transactional schema changes of the future. - Reverting schema changes when failed or canceled. - Properly checkpointing and reading job progress for the backfill. Currently there's a placeholder implementation that doesn't interact with the job at all. Co-authored-by: Andrew Werner <[email protected]>
This is really cool! Very nice work! I suggest adding a testing mode where a mock "executor" simply records information about each step and then dumps the list out as text. Then you can have datadriven tests for the "planning" part. |
Tangential: every time I see "declarative schema changer" I imagine being able to declare the state of my schema (i.e. here is the way I want my table to look) and have the system figure out the series of table/index/column/constraint adds/drops rather than having to specify them imperatively. |
Not quite, however, the builder concepts below the builder are quite amenable to the idea. Something like:
as equivalent to:
Fun idea. Don't know what getting this sort of idea even out there for standardization would look like but changing sql syntax is far from our goal. |
Never mind, I looked at the PR and you already do this (scbld tests). |
Standardization would be way off. Perhaps a tool that CRDB provides: |
I've recently stumbled across a tool that diffs two postgres schemas (on a live PG instance instead of two .sql files) and outputs the alter statements: https://databaseci.com/docs/migra |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest adding a testing mode where a mock "executor" simply records information about each step and then dumps the list out as text. Then you can have datadriven tests for the "planning" part.
Hmm, I feel like I'm missing something but about how this is distinctly different from just dumping out the ops in a datadriven test? Like why get the "executor" involved at all? It's possible that we're saying the same thing.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @dt, @jordanlewis, @otan, @postamar, @RaduBerinde, @rafiss, and @yuzefovich)
Took a long-ish skim -- this work looks very exciting and much nicer to work with, looking forward to it! The state transition are pretty awesome and seem useful for debugging too! |
83c333b
to
7f3f5fc
Compare
0d3301d
to
9a06e30
Compare
I filled out the "Reference-level explanation." This is finished as a first draft, and is ready for another look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @arulajmani, @dt, @jordanlewis, @lucy-zhang, @postamar, @RaduBerinde, @rafiss, and @yuzefovich)
docs/RFCS/20210115_new_schema_changer.md, line 20 at r3 (raw file):
Schema changes in CRDB have always been guided by the state-based abstraction, but their current implementation has serious deficiencies, some of which make the implementation unsuitable for the impending introduction of [transactional schema changes](https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/20200909_transactional_schema_changes.md). The current schema changer has hard-coded states and transitions that best accomodate a single column or index change, and is fundamentally limited in its ability to support other schema changes and their combinations. This has become untenable as the number of supported DDL statements has grown. Problems with the schema changer which threaten the correctness and stability of schema changes, both current and future, include:
I could see a point about coupling and modularity. Most code today has to live in the SQL package make it hard to leverage in testing.
docs/RFCS/20210115_new_schema_changer.md, line 90 at r3 (raw file):
```go scpb.Node{ Target: &scpb.Column{
I think this is missing a direction.
docs/RFCS/20210115_new_schema_changer.md, line 303 at r3 (raw file):
- More clarity is needed on the semantics of locking vs. non-locking updates in the context of concurrent schema changes. - A related question is [how to deal with `DROP TABLE`](https://github.com/cockroachdb/cockroach/issues/61256), and its relative `TRUNCATE TABLE`, in the presence of ongoing schema changes.
one particular challenge here is that the schema change state is only present on the job which makes applying it especially awkward.
9565d67
to
fc39ffc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @arulajmani, @dt, @jordanlewis, @postamar, @RaduBerinde, @rafiss, and @yuzefovich)
docs/RFCS/20210115_new_schema_changer.md, line 20 at r3 (raw file):
Previously, ajwerner wrote…
I could see a point about coupling and modularity. Most code today has to live in the SQL package make it hard to leverage in testing.
Done. I folded this into the "difficult of testing" point.
docs/RFCS/20210115_new_schema_changer.md, line 90 at r3 (raw file):
Previously, ajwerner wrote…
I think this is missing a direction.
I kind of screwed this up overall, but it should be fixed.
docs/RFCS/20210115_new_schema_changer.md, line 303 at r3 (raw file):
Previously, ajwerner wrote…
one particular challenge here is that the schema change state is only present on the job which makes applying it especially awkward.
Done. I think that's actually an important larger point, that the source of truth for the schema change state is now in the job, so I added a sentence elsewhere.
fc39ffc
to
ee8a5f4
Compare
Another point that I don't really see represented here because it's not really represented in code is this idea that there are a number of optimizations we may want to perform to the set of operations we need to traverse. One challenge there is that some of these optimizations may need to know about elements in the schema that are not being changed. Some of these optimizations seem rather important IIRC. |
Release justification: Non-production code change Release note: None
ee8a5f4
to
32f3e6c
Compare
Good catch. I added a subsection about this. |
Would anyone like to take more time to read this? Feel free to say yes, because feedback is always appreciated. But if not, I'll plan to merge this sometime tomorrow (pending final sign-off from @jordanlewis). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for pushing this over the finish line. This will be useful as a guide, reference and artifact of our decisionmaking for years to come.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @arulajmani, @dt, @jordanlewis, @postamar, @RaduBerinde, @rafiss, and @yuzefovich)
TFTRs bors r+ |
Build succeeded: |
I would really appreciate this feature. I've requested similar before. https://forum.cockroachlabs.com/t/declarative-schema-migrations/3334 |
https://github.com/lucy-zhang/cockroach/blob/schema-changer-rfc/docs/RFCS/20210115_new_schema_changer.md
Release note: none