docs: add a proposal for multi-schema change #34037

Merged
merged 32 commits into from
Jun 15, 2022
Changes from 2 commits
Commits
9e486a0
docs: add a proposal doc for multi-schema change
tangenta Apr 15, 2022
facd66d
remove the cascading exception
tangenta Apr 15, 2022
95803cf
address comment
tangenta Apr 18, 2022
e2ae63f
polish the words with grammarly
tangenta Apr 18, 2022
a1f67df
address comments
tangenta Apr 18, 2022
561d8dc
remove the feature part
tangenta Apr 18, 2022
7f68a9f
correct non-reorg modify column states
tangenta Apr 18, 2022
a2ee2c2
add upgrade compatibility
tangenta Apr 22, 2022
5e30b50
Update docs/design/2022-04-15-multi-schema-change.md
tangenta May 5, 2022
35b4a70
Update docs/design/2022-04-15-multi-schema-change.md
tangenta May 5, 2022
120133f
Update docs/design/2022-04-15-multi-schema-change.md
tangenta May 5, 2022
33fdb87
Update docs/design/2022-04-15-multi-schema-change.md
tangenta May 5, 2022
1181041
Update docs/design/2022-04-15-multi-schema-change.md
tangenta May 5, 2022
676a3bb
Update docs/design/2022-04-15-multi-schema-change.md
tangenta May 5, 2022
1187118
Update docs/design/2022-04-15-multi-schema-change.md
tangenta May 5, 2022
6d8bf5e
add some details
tangenta May 6, 2022
c292f49
polish the words
tangenta May 6, 2022
b9172d4
add future work part
tangenta May 10, 2022
003d0e0
Update docs/design/2022-04-15-multi-schema-change.md
tangenta Jun 6, 2022
89777fb
Update docs/design/2022-04-15-multi-schema-change.md
tangenta Jun 6, 2022
80215d9
add goal/non-goals, data structure and job execution explaination
tangenta Jun 7, 2022
98a80ba
polish the words
tangenta Jun 7, 2022
730f4a2
Update docs/design/2022-04-15-multi-schema-change.md
tangenta Jun 8, 2022
ed0def9
Update docs/design/2022-04-15-multi-schema-change.md
tangenta Jun 8, 2022
b75aeab
Update docs/design/2022-04-15-multi-schema-change.md
tangenta Jun 8, 2022
416dd53
Merge branch 'master' into msc-design
tangenta Jun 8, 2022
9978ec5
Update 2022-04-15-multi-schema-change.md
tangenta Jun 8, 2022
699950f
Update 2022-04-15-multi-schema-change.md
tangenta Jun 8, 2022
34bb8ae
Merge branch 'master' into msc-design
ti-chi-bot Jun 15, 2022
918b144
Merge branch 'master' into msc-design
ti-chi-bot Jun 15, 2022
2224121
Merge branch 'master' into msc-design
ti-chi-bot Jun 15, 2022
976f582
Merge branch 'master' into msc-design
ti-chi-bot Jun 15, 2022
83 changes: 76 additions & 7 deletions docs/design/2022-04-15-multi-schema-change.md
Expand Up @@ -25,19 +25,56 @@ When users attempt to migrate data from MySQL-like databases, they may expend ad

Above all, the lack of this capability can be a blocking issue for those who wish to use TiDB.

### Goal

- Support the commonly used MySQL-compatible Multi-Schema Change operations, including `ADD/DROP COLUMN`, `ADD/DROP INDEX`, `MODIFY COLUMN`, `RENAME COLUMN`, etc.

### Non-goals

- Support TiDB-specific Multi-Schema Change like `ADD TIFLASH REPLICA`, `ADD PARTITION`, `ALTER PARTITION`, etc.
- Resolve the 'schema is changed' error when DDL and DML are executed concurrently.
- Be 100% compatible with MySQL. MySQL may reorder the execution of schema changes, which makes the behavior counter-intuitive.

## Proposal

### Data Structure

The implementation is based on the [online DDL architecture](https://github.com/pingcap/tidb/blob/e0c461a84cf4ad55c7b51c3f9db7f7b9ba51bb62/docs/design/2018-10-08-online-DDL.md). Similar to the existing [Job](https://github.com/pingcap/tidb/blob/6bd54bea8a9ec25c8d65fcf1157c5ee7a141ab0b/parser/model/ddl.go/#L262) structure, we introduce a new structure "SubJob":

- "Job": A job is generally the internal representation of one DDL statement.
- "SubJob": A sub-job is a representation of one DDL schema change. A job may contain zero (when multi-schema change is not applicable) or more sub-jobs.

```go
// Job represents a DDL action.
type Job struct {
	Type  ActionType `json:"type"`
	State JobState   `json:"state"`
	// ...
	MultiSchemaInfo *MultiSchemaInfo `json:"multi_schema_info"`
}

// MultiSchemaInfo contains information for multi-schema change.
type MultiSchemaInfo struct {
	// ...
	SubJobs []SubJob `json:"sub_jobs"`
}

// SubJob represents one schema change in a multi-schema change DDL.
type SubJob struct {
	Type  ActionType `json:"type"`
	State JobState   `json:"state"`
	// ...
}
```

The field `Type`, of type `ActionType`, stands for the type of DDL. For example, `ADD COLUMN` is mapped to `ActionAddColumn`; `MODIFY COLUMN` is mapped to `ActionModifyColumn`.

The Multi-Schema Change DDL jobs have the type `ActionMultiSchemaChange`. In the current worker model, there is a dedicated code path (`onMultiSchemaChange()`) to run these jobs. Only Multi-Schema Change jobs can have sub-jobs.

For example, the DDL statement

```SQL
ALTER TABLE t ADD COLUMN b INT, MODIFY COLUMN a CHAR(255);
ALTER TABLE t ADD COLUMN b INT, MODIFY COLUMN a CHAR(10);
```

can be modeled as a job like
Expand All @@ -60,11 +97,43 @@ job := &Job {
}
```
Member
Please also add a description on 'add columns with indexes' and 'drop columns covered by indexes'.

Contributor Author
We don't support 'add columns with indexes', and 'drop columns covered by indexes' is not special in multi-schema change.


In this way, we pack multiple schema changes into one job. Like any other job, it is enqueued in the DDL job queue kept in storage and waits for an appropriate worker to pick it up and process it.
In this way, we pack multiple schema changes into one job. Like any other job, it is enqueued in the DDL job queue in persistent storage and waits for an appropriate worker to pick it up and process it.

Normally, the worker executes the sub-jobs one by one serially as if they were plain jobs. However, in abnormal cases, things become complex.
### Job/Sub-job Execution

As shown in the code above, there is a field `State` in both `Job` and `SubJob`. All the possible states and the changes are listed here:

```
              ┌-->--- Done ->------------------┐
              |                                |
None -> Running -> Rollingback -> RollbackDone -> Synced
           |                                   |
           └-->--- Cancelling -> Cancelled -->-┘
```

We can divide these states into four types:

| States | Normal | Abnormal |
|-----------------|-------------------|-----------------------------|
| **Uncompleted** | `None`, `Running` | `Rollingback`, `Cancelling` |
| **Completed** | `Done` | `RollbackDone`, `Cancelled` |
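
The classification in the table can be expressed as two small predicates. The following is a minimal sketch, assuming a hypothetical `JobState` enumeration; the constant names and values are illustrative and are not taken from TiDB's source. `Synced` appears in the diagram but not in the table, so both predicates treat it as outside the four categories.

```go
package main

import "fmt"

// JobState is a hypothetical enumeration of the states in the diagram.
// The names and numeric values are illustrative only.
type JobState int

const (
	StateNone JobState = iota
	StateRunning
	StateRollingback
	StateRollbackDone
	StateDone
	StateCancelling
	StateCancelled
	StateSynced
)

// isAbnormal reports whether s is in the "Abnormal" column of the table.
func isAbnormal(s JobState) bool {
	switch s {
	case StateRollingback, StateCancelling, StateRollbackDone, StateCancelled:
		return true
	}
	return false
}

// isCompleted reports whether s is in the "Completed" row of the table.
func isCompleted(s JobState) bool {
	switch s {
	case StateDone, StateRollbackDone, StateCancelled:
		return true
	}
	return false
}

func main() {
	fmt.Println(isAbnormal(StateRunning), isCompleted(StateRunning))     // false false
	fmt.Println(isAbnormal(StateCancelled), isCompleted(StateCancelled)) // true true
}
```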

Since a `Job` is executed by a single DDL worker, its sub-jobs are executed in a single thread. The general principle for selecting a sub-job is as follows:

- For the normal state, the first uncompleted sub-job is selected in ascending order, i.e., the sub-job with the smaller order number is executed first.
- For the abnormal state, the first uncompleted sub-job is selected in descending order, i.e., the sub-job with the larger order number is executed first.

When one of the sub-jobs becomes abnormal, the parent job and all the other sub-jobs are changed to an abnormal state.
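
The selection rule can be sketched as follows. This is an illustration under stated assumptions rather than the actual worker code: `pickSubJob` and the `Completed` field are hypothetical names, and the position of a sub-job in the slice stands in for its order number.

```go
package main

import "fmt"

// SubJob is a simplified stand-in; Completed abstracts over the
// "Completed" states (Done, RollbackDone, Cancelled).
type SubJob struct {
	Name      string
	Completed bool
}

// pickSubJob returns the index of the sub-job to run next, or -1 if every
// sub-job is completed. In the normal case it scans in ascending order;
// in the abnormal case it scans in descending order.
func pickSubJob(subJobs []SubJob, abnormal bool) int {
	if !abnormal {
		for i := 0; i < len(subJobs); i++ {
			if !subJobs[i].Completed {
				return i
			}
		}
		return -1
	}
	for i := len(subJobs) - 1; i >= 0; i-- {
		if !subJobs[i].Completed {
			return i
		}
	}
	return -1
}

func main() {
	subJobs := []SubJob{
		{Name: "first", Completed: true},
		{Name: "second"},
		{Name: "third"},
	}
	fmt.Println(pickSubJob(subJobs, false)) // 1: first uncompleted, ascending
	fmt.Println(pickSubJob(subJobs, true))  // 2: first uncompleted, descending
}
```

Scanning in reverse during rollback means the most recently started change is undone first, mirroring the order in which the changes were applied.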

### Schema Object State Management

To ensure atomic execution of Multi-Schema Change, we need to carefully manage the states of the changing schema objects. Let's take the above SQL as an example:

```SQL
ALTER TABLE t ADD COLUMN b INT, MODIFY COLUMN a CHAR(10);
```

To ensure atomic execution of a Multi-Schema Change, we need to carefully manage the states of the changing schema objects. Let's take the above SQL as an example: If the second sub-job `MODIFY COLUMN a CHAR (255)` fails for some reason, the first sub-job should be able to roll back its changes (roll back the added column `b`).
If the second sub-job `MODIFY COLUMN a CHAR (10)` fails for some reason (e.g., a row has more than ten characters in column `a`), the first sub-job should be able to roll back its changes (roll back the added column `b`).

This requirement means that we cannot simply publish the schema object when a sub-job is finished. Instead, it should remain in a state invisible to users, waiting for the other sub-jobs to complete, eventually publishing all at once when it is confirmed that all sub-jobs have succeeded. This method is similar to 2PC: the "commit" cannot be started until the "prewrites" are completed.

Expand All @@ -79,11 +148,11 @@ Here is the table of schema states that can occur in different DDLs. Note that t
| Non-reorg Modify Column | Public (before meta change) | Public (after meta change) |
| Reorg Modify Column | None, Delete-Only, Write-Only, Write-Reorg | Public |

To achieve this behavior, we introduce a flag named "non-revertible" in the sub-job. This flag is set when a schema object has reached the last revertible state. When all sub-jobs are non-revertible, all associated schema objects change to the next state in one transaction. After that, the sub-jobs are executed serially to do the rest.
To achieve this behavior, we introduce a flag named "non-revertible" in the sub-job. This flag is set when a schema object has reached the last revertible state. A sub-job with this flag is considered temporarily completed, so that the worker can select the next sub-job. When all sub-jobs are non-revertible, all associated schema objects change to the next state in one transaction. After that, the sub-jobs are executed serially to do the rest.
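
The "publish all at once" step can be sketched as follows. This is a simplified illustration, not the actual implementation: the `SubJob` fields and the `advanceAll` helper are hypothetical, schema states are plain strings, and a loop stands in for the single transaction.

```go
package main

import "fmt"

// SubJob is a simplified stand-in for the proposal's sub-job.
// All field names here are hypothetical.
type SubJob struct {
	Name          string
	SchemaState   string // current state of the schema object
	NextState     string // state to enter once every sub-job is ready
	NonRevertible bool   // set when the last revertible state is reached
}

// allNonRevertible reports whether every sub-job has reached its last
// revertible state.
func allNonRevertible(subJobs []SubJob) bool {
	for _, sj := range subJobs {
		if !sj.NonRevertible {
			return false
		}
	}
	return true
}

// advanceAll moves every schema object to its next state, but only when all
// sub-jobs are non-revertible. A real implementation would apply the changes
// in one transaction; the loop below only models the all-or-nothing check.
func advanceAll(subJobs []SubJob) bool {
	if !allNonRevertible(subJobs) {
		return false
	}
	for i := range subJobs {
		subJobs[i].SchemaState = subJobs[i].NextState
	}
	return true
}

func main() {
	subJobs := []SubJob{
		{Name: "ADD COLUMN b", SchemaState: "write-reorg", NextState: "public", NonRevertible: true},
		{Name: "MODIFY COLUMN a", SchemaState: "write-reorg", NextState: "public"},
	}
	fmt.Println(advanceAll(subJobs)) // false: the second sub-job is still revertible
	subJobs[1].NonRevertible = true
	fmt.Println(advanceAll(subJobs), subJobs[0].SchemaState, subJobs[1].SchemaState)
}
```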

On the other hand, if there is an error returned by any sub-job before all of them become non-revertible, the entire job is placed into the `rollingback` state. For the executed sub-jobs, we set them to `cancelling`; for the unexecuted sub-jobs, we set them to `cancelled`.
On the other hand, if there is an error returned by any sub-job before all of them become non-revertible, the entire job is placed into the `Rollingback` state. For the executed sub-jobs, we set them to `Cancelling`; for the unexecuted sub-jobs, we set them to `Cancelled`.
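
The state changes on failure can be sketched as follows. This is a minimal illustration, assuming that a sub-job still in the `None` state counts as unexecuted; the `startRollback` helper and the string-based `JobState` type are hypothetical.

```go
package main

import "fmt"

// JobState is modeled as a string for readability; this is an assumption
// made for the sketch, not TiDB's representation.
type JobState string

const (
	StateNone        JobState = "None"
	StateRunning     JobState = "Running"
	StateRollingback JobState = "Rollingback"
	StateCancelling  JobState = "Cancelling"
	StateCancelled   JobState = "Cancelled"
)

type SubJob struct {
	Name  string
	State JobState
}

type Job struct {
	State   JobState
	SubJobs []SubJob
}

// startRollback puts the job into Rollingback. Sub-jobs that have not been
// executed (assumed here to be those still in None) become Cancelled; all
// other sub-jobs become Cancelling so their changes can be undone.
func startRollback(job *Job) {
	job.State = StateRollingback
	for i := range job.SubJobs {
		if job.SubJobs[i].State == StateNone {
			job.SubJobs[i].State = StateCancelled
		} else {
			job.SubJobs[i].State = StateCancelling
		}
	}
}

func main() {
	job := &Job{
		State: StateRunning,
		SubJobs: []SubJob{
			{Name: "ADD COLUMN b", State: StateRunning},
			{Name: "MODIFY COLUMN a", State: StateRunning},
			{Name: "a later change", State: StateNone},
		},
	}
	startRollback(job)
	fmt.Println(job.State)
	for _, sj := range job.SubJobs {
		fmt.Println(sj.Name, sj.State)
	}
}
```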

Finally, we consider the extreme case: an error occurs while all the sub-jobs are non-revertible. In this situation, we tend to assume that the error can be resolved in a trivial way, e.g., by retrying. This behavior is consistent with the current DDL implementation. Let's take `DROP COLUMN` as an example: Once the column enters the "Write-Only" state, there is no way to abort this job.
Finally, we consider the extreme case: an error occurs while all the sub-jobs are non-revertible. In general, there are two kinds of errors: logical errors (such as a unique constraint violation or out-of-range data) and physical errors (such as an unavailable network or unusable storage). In this situation, the error is guaranteed to be a physical one, so we tend to assume that it can be resolved in a trivial way, e.g., by retrying. This behavior is consistent with the current DDL implementation. Take `DROP COLUMN` as an example: once the column enters the "Write-Only" state, there is no way to abort the job.

## Compatibility
