Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddl: allow altering the auto_increment id 'back' to a small value #25400

Closed
bb7133 opened this issue Jun 15, 2021 · 8 comments · Fixed by #25868
Closed

ddl: allow altering the auto_increment id 'back' to a small value #25400

bb7133 opened this issue Jun 15, 2021 · 8 comments · Fixed by #25868
Labels
type/feature-request Categorizes issue or PR as related to a new feature.

Comments

@bb7133
Copy link
Member

bb7133 commented Jun 15, 2021

Feature Request

Is your feature request related to a problem? Please describe:
For now, the auto ID in TiDB grows only and can never bet set 'back':

tidb> create table t(a bigint key auto_increment);
Query OK, 0 rows affected (0.02 sec)

tidb> insert into t values (), ();
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

tidb> select * from t;
+---+
| a |
+---+
| 1 |
| 2 |
+---+
2 rows in set (0.00 sec)

tidb> insert into t values (9223372036854775807);
Query OK, 1 row affected (0.00 sec)

tidb> insert into t values ();
ERROR 1467 (HY000): Failed to read auto-increment value from storage engine
tidb> delete from t where a=9223372036854775807;
Query OK, 1 row affected (0.01 sec)

tidb> alter table t auto_increment=3;
Query OK, 0 rows affected (0.03 sec)

tidb> insert into t values ();                                                                                                          
ERROR 1467 (HY000): Failed to read auto-increment value from storage engine

Although the ALTER TABLE statement returns success, the auto ID is not rebased back to 3, see #5472 (comment).

If the auto ID is rebased to a very large value because of either mistake or bug, it is almost impossible to set it back, this can be a headache for the users.

Describe the feature you'd like:
We can provide a system variable named like 'tidb_enable_unsafe_auto_id_rebase' to allow 'forcing' the auto ID to a smaller value.

Describe alternatives you've considered:
For now, we do not have the alternative, the only way is recreating the table and importing the data.

Teachability, Documentation, Adoption, Migration Strategy:
See #25355 for the hacking PR.

@bb7133 bb7133 added the type/feature-request Categorizes issue or PR as related to a new feature. label Jun 15, 2021
@tangenta
Copy link
Contributor

I prefer to add a new syntax. Since this operation is unlikely to be done in business application code, I don't think a new session variable is needed:

ALTER TABLE `t` FORCE AUTO_INCREMENT = 3;
ALTER TABLE `t` /*T![force_auto_inc] FORCE */ AUTO_INCREMENT = 3;

There is another question: if the table contains an auto_id value that larger than the target value, how should TiDB handle it?

MyISAM and InnoDB handle it differently:

Note that you cannot reset the counter to a value less than or equal to any that have already been used. For MyISAM, if the value is less than or equal to the maximum value currently in the AUTO_INCREMENT column, the value is reset to the current maximum plus one. For InnoDB, if the value is less than the current maximum value in the column, no error occurs and the current sequence value is not changed.

@bb7133
Copy link
Member Author

bb7133 commented Jun 18, 2021

I prefer to add a new syntax. Since this operation is unlikely to be done in business application code, I don't think a new session variable is needed:

ALTER TABLE `t` FORCE AUTO_INCREMENT = 3;
ALTER TABLE `t` /*T![force_auto_inc] FORCE */ AUTO_INCREMENT = 3;

Sounds reasonable!

There is another question: if the table contains an auto_id value that larger than the target value, how should TiDB handle it?

I do not think we need to worry about it since it is a 'force rebase'. If there's a conflict because of the larger values, so be it.

@kennytm
Copy link
Contributor

kennytm commented Jun 30, 2021

Does this option exist for sequences too?

@tangenta
Copy link
Contributor

tangenta commented Jun 30, 2021

Does this option exist for sequences too?

@kennytm Nope. I think we can drop and recreate the sequence to achieve the same purpose at a low cost.

I do not think we need to worry about it since it is a 'force rebase'. If there's a conflict because of the larger values, so be it.

@bb7133 In the previous implementation, TiDB does not handle the duplicated _tidb_row_id correctly, because the "conflict" rows are overwritten:

drop table if exists t;
create table t (a int);
insert into t values (1), (2), (3);
select a, _tidb_rowid from t;
+------+-------------+
| a    | _tidb_rowid |
+------+-------------+
|    1 |           1 |
|    2 |           2 |
|    3 |           3 |
+------+-------------+
alter table t force auto_increment = 1;
insert into t values (10000);
select a, _tidb_rowid from t;
+-------+-------------+
| a     | _tidb_rowid |
+-------+-------------+
| 10000 |           1 |
|     2 |           2 |
|     3 |           3 |
+-------+-------------+

Is it appropriate to simply return the "duplicate entry" error?

ERROR 1062 (23000): Duplicate entry '1' for key '_tidb_rowid'

@tangenta
Copy link
Contributor

tangenta commented Jul 1, 2021

After investigation, I found that detecting _tidb_rowid conflicts may decrease the INSERT performance because we need to Get the key first.

Since FORCE already indicates the operation can be dangerous(INSERT is actually an UPDATE), any user should stop the write workload first. Let's emphasize it in the documentation and ignore this problem.

@kennytm
Copy link
Contributor

kennytm commented Jul 1, 2021

Maybe FORCE AUTO_INCREMENT (when SHARD_ROW_ID_BITS == 0) can do SELECT MAX(_tidb_rowid) FROM tbl, and warn if the forced new auto_inc is smaller than the SELECT result.

@bb7133
Copy link
Member Author

bb7133 commented Jul 12, 2021

Maybe FORCE AUTO_INCREMENT (when SHARD_ROW_ID_BITS == 0) can do SELECT MAX(_tidb_rowid) FROM tbl, and warn if the forced new auto_inc is smaller than the SELECT result.

@kennytm Personally, I prefer to skip SELECT MAX... for the following reasons:

  • SELECT can be slow and expensive for a large table.
  • Since DML and DDL do not block each other in TiDB, the SELECT result may not be accurate.
  • This syntax is not recommended for normal cases, we will explicitly document its risk in our official website.

@kennytm
Copy link
Contributor

kennytm commented Jul 13, 2021

SELECT can be slow and expensive for a large table.

except for partitioned table the SELECT MAX(_tidb_rowid) is equivalent to finding the last key in the range ["t{tid}_r", "t{tid}_s") which should be O(~1).

(I'm ignoring the cost of walking one step in the LSM tree here, which at worst is logarithmic.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/feature-request Categorizes issue or PR as related to a new feature.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants