Added support for new DDL actions for remove and add table partitioning #7822

Merged
merged 15 commits into from
Sep 11, 2023
16 changes: 16 additions & 0 deletions dbms/src/TiDB/Schema/SchemaBuilder.cpp
@@ -242,6 +242,22 @@ void SchemaBuilder<Getter, NameMapper>::applyDiff(const SchemaDiff & diff)
         applyPartitionDiff(diff.schema_id, diff.table_id);
         break;
     }
+    case SchemaActionType::ActionAlterTablePartitioning:
+    case SchemaActionType::ActionRemovePartitioning:
+    {
+        if (diff.table_id == diff.old_table_id)
Contributor

I'm curious when diff.table_id == diff.old_table_id and when diff.table_id != diff.old_table_id. Could you please give an example here?

Contributor Author

It depends on the state change: only in the last transition (StateDeleteReorganization -> StatePublic) is the table ID changed to the new one.

Example (added custom logs):
tidb.log

[2023/08/01 23:11:59.568 +02:00] [WARN] [ddl_worker.go:1431] [updateSchemaVersion] [category=ddl] [TableId=96] [SchemaState="delete only"]
[2023/08/01 23:11:59.637 +02:00] [WARN] [ddl_worker.go:1431] [updateSchemaVersion] [category=ddl] [TableId=96] [SchemaState="delete only"]
[2023/08/01 23:11:59.702 +02:00] [WARN] [ddl_worker.go:1431] [updateSchemaVersion] [category=ddl] [TableId=96] [SchemaState="write only"]
[2023/08/01 23:11:59.798 +02:00] [WARN] [ddl_worker.go:1431] [updateSchemaVersion] [category=ddl] [TableId=96] [SchemaState="write reorganization"]
[2023/08/01 23:11:59.865 +02:00] [WARN] [ddl_worker.go:1431] [updateSchemaVersion] [category=ddl] [TableId=96] [SchemaState="delete reorganization"]
[2023/08/01 23:11:59.865 +02:00] [WARN] [ddl_worker.go:1441] [StateDeleteReorganization] [category=ddl] [TableId=105] [OldTableID=96]

tiflash.log

[2023/08/01 23:11:59.818 +02:00] [WARN] [SchemaBuilder.cpp:238] ["ActionRemovePartitioning table_id 96 old 96"] [source="keyspace=4294967295"] [thread_id=60]
[2023/08/01 23:11:59.819 +02:00] [WARN] [SchemaBuilder.cpp:238] ["ActionRemovePartitioning table_id 96 old 96"] [source="keyspace=4294967295"] [thread_id=60]
[2023/08/01 23:11:59.820 +02:00] [WARN] [SchemaBuilder.cpp:238] ["ActionRemovePartitioning table_id 96 old 96"] [source="keyspace=4294967295"] [thread_id=60]
[2023/08/01 23:11:59.821 +02:00] [WARN] [SchemaBuilder.cpp:238] ["ActionRemovePartitioning table_id 96 old 96"] [source="keyspace=4294967295"] [thread_id=60]
[2023/08/01 23:12:55.086 +02:00] [WARN] [SchemaBuilder.cpp:238] ["ActionRemovePartitioning table_id 105 old 96"] [source="keyspace=4294967295"] [thread_id=61]
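
Read against the logs above, the two IDs are enough to tell an intermediate state from the final one. The following is only a minimal sketch of that decision, assuming a hypothetical `Diff` struct and `classify` helper; neither is a real TiFlash type, and the real `SchemaDiff` carries more fields.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two SchemaDiff fields discussed in this thread.
struct Diff
{
    std::int64_t table_id;
    std::int64_t old_table_id;
};

// Hypothetical label for what the schema builder would do with a diff.
enum class Apply
{
    PartitionDiff, // intermediate states: same table ID, partitions change internally
    DropAndCreate, // final state: the table has been given a new ID
};

// Equal IDs mean an intermediate state; differing IDs mean the final transition.
constexpr Apply classify(Diff diff)
{
    return diff.table_id == diff.old_table_id ? Apply::PartitionDiff : Apply::DropAndCreate;
}

// The ID pairs seen in the tiflash.log excerpt.
static_assert(classify(Diff{96, 96}) == Apply::PartitionDiff, "intermediate states keep the table ID");
static_assert(classify(Diff{105, 96}) == Apply::DropAndCreate, "the final state uses the new table ID");
```

The checks run at compile time, so the sketch needs no `main` to exercise the two cases taken from the logs.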

Contributor

So in the first three stages, does partition.definitions change?

Contributor Author

In the first (StateNone -> StateDeleteOnly), it adds the new partitions to TableInfo.Partition.AddingDefinitions and the removed ones to TableInfo.Partition.DroppingDefinitions.

In the second (StateDeleteOnly -> StateWriteOnly), it only waits for the partitions to be available in TiFlash (if the table should have any TiFlash replicas).

In the third (StateWriteOnly -> StateWriteReorganization), there are no changes or checks.

In the fourth (StateWriteReorganization -> StateDeleteReorganization), there are no metadata changes, but this is where all the data copying and index re-creation is done.

In the fifth (StateDeleteReorganization -> StatePublic), it depends on the actual command (reorganize, remove partitioning or alter table partition by):

  • Reorganize: it modifies TableInfo.Partition.Definitions by removing the DroppingDefinitions and adding the AddingDefinitions, and then resets AddingDefinitions and DroppingDefinitions.
  • Remove partitioning: it changes the table ID to the one used for AddingDefinitions. It really drops the old partitioned table metadata, changes the TableInfo by setting the table ID to that of the single AddingDefinitions entry, removes TableInfo.Partition, and then creates the table (metadata) with the updated TableInfo.
  • Alter Table Partition By: it changes the table ID and replaces TableInfo.Partition with the new partitioning type, expression and definitions. It likewise first drops the table (metadata), changes TableInfo.Partition and the table ID, and then (re)creates the table with the updated TableInfo.

Notice that the data cleanup is done later, asynchronously, and that the old table data must also remain accessible during the schema version the DDL got when transitioning to StatePublic, since there may be clients still on the schema version of StateDeleteReorganization that need to read and (double) write the data.
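
The transitions described above can be condensed into a small model. This is a rough sketch only: the `State` and `Command` enums and the `tableIdChanges` helper are illustrative names invented here, not TiDB or TiFlash code, and the rule that reorganize keeps the table ID is inferred from the description rather than stated outright.

```cpp
#include <cassert>

// Illustrative names for the schema states mentioned in this thread.
enum class State
{
    None,
    DeleteOnly,
    WriteOnly,
    WriteReorganization,
    DeleteReorganization,
    Public,
};

// The three commands that share this state machine.
enum class Command
{
    ReorganizePartition,
    RemovePartitioning,
    AlterTablePartitioning,
};

// True only for the final transition of the two commands that swap in a new table ID.
// Assumption: reorganize edits the partition definitions in place and keeps the table ID.
constexpr bool tableIdChanges(Command cmd, State from, State to)
{
    return from == State::DeleteReorganization && to == State::Public
        && cmd != Command::ReorganizePartition;
}

static_assert(tableIdChanges(Command::RemovePartitioning, State::DeleteReorganization, State::Public),
              "remove partitioning gets a new table ID in the last step");
static_assert(!tableIdChanges(Command::AlterTablePartitioning, State::WriteOnly, State::WriteReorganization),
              "earlier steps keep the table ID");
```

Under this model, a consumer that sees an unchanged table ID only needs to refresh the partition list, while a changed ID means the old table metadata is dropped and a new table is created.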

Contributor Author

After some additional testing I figured out that I had missed dropping the table (metadata) in TiFlash as well, not just creating a new table :)

+        {
+            /// Only internal additions of new partitions
+            applyPartitionDiff(diff.schema_id, diff.table_id);
+        }
+        else
+        {
+            /// The new non-partitioned table will have a new id
+            applyDropTable(diff.schema_id, diff.old_table_id);
+            applyCreateTable(diff.schema_id, diff.table_id);
+        }
+        break;
+    }
     case SchemaActionType::ExchangeTablePartition:
     {
         applyExchangeTablePartition(diff);
7 changes: 6 additions & 1 deletion dbms/src/TiDB/Schema/SchemaGetter.h
@@ -92,12 +92,17 @@ enum class SchemaActionType : Int8
     ActionReorganizePartition = 64,
     ActionAlterTTLInfo = 65,
     ActionAlterTTLRemove = 67,
+    ActionCreateResourceGroup = 68,
+    ActionAlterResourceGroup = 69,
+    ActionDropResourceGroup = 70,
+    ActionAlterTablePartitioning = 71,
+    ActionRemovePartitioning = 72,
 
 
     // If we support new type from TiDB.
     // MaxRecognizedType also needs to be changed.
     // It should always be equal to the maximum supported type + 1
-    MaxRecognizedType = 68,
+    MaxRecognizedType = 73,
 };
 
 struct AffectedOption
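
The "maximum supported type + 1" rule in the enum comment is easy to forget when new actions are added. One possible guard, shown here purely as a sketch on a trimmed, illustrative copy of the enum (with `std::int8_t` standing in for `Int8` and a hypothetical `toInt` helper), is a compile-time check:

```cpp
#include <cassert>
#include <cstdint>

// Trimmed, illustrative copy of the enum tail; not the full TiFlash definition.
enum class SchemaActionType : std::int8_t
{
    ActionAlterTablePartitioning = 71,
    ActionRemovePartitioning = 72,
    MaxRecognizedType = 73,
};

// Hypothetical helper: widen to int so the arithmetic below is unambiguous.
constexpr int toInt(SchemaActionType t)
{
    return static_cast<int>(t);
}

// Fails to compile if a new action is appended without bumping MaxRecognizedType.
// The check must name the last action explicitly, so it still needs updating by hand.
static_assert(toInt(SchemaActionType::MaxRecognizedType) == toInt(SchemaActionType::ActionRemovePartitioning) + 1,
              "MaxRecognizedType must be the maximum supported type + 1");
```

This does not remove the manual step, but it turns a stale `MaxRecognizedType` into a build failure instead of a silently ignored DDL action.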
194 changes: 194 additions & 0 deletions tests/fullstack-test2/ddl/alter_partition_by.test
@@ -0,0 +1,194 @@
# Copyright 2023 PingCAP, Ltd.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


## partition_table --> partition_table

mysql> drop table if exists test.t, test.t2;
mysql> create table test.t (a int primary key, b varchar(255), c int, key (b), key (c,b)) partition by range (a) (partition p0 values less than (1000000), partition p1M values less than (2000000));
mysql> analyze table test.t;
mysql> alter table test.t set tiflash replica 1;

mysql> insert into test.t values (1,"1",-1);
mysql> insert into test.t select a+1,a+1,-(a+1) from test.t;
mysql> insert into test.t select a+2,a+2,-(a+2) from test.t;
mysql> insert into test.t select a+500000,a+500000,-(a+500000) from test.t;
mysql> insert into test.t select a+1000000,a+1000000,-(a+1000000) from test.t;

func> wait_table test t

# check table info in tiflash
>> select tidb_database,tidb_name from system.tables where tidb_database = 'test' and tidb_name = 't' and is_tombstone = 0
┌─tidb_database─┬─tidb_name─┐
│ test │ t │
└───────────────┴───────────┘

mysql> select /*+ READ_FROM_STORAGE(TIKV[t]) */ count(*) from test.t partition (p0);
+----------+
| count(*) |
+----------+
| 8 |
+----------+

mysql> show warnings;
mysql> select /*+ READ_FROM_STORAGE(TIFLASH[t]) */ count(*) from test.t partition (p0);
+----------+
| count(*) |
+----------+
| 8 |
+----------+

mysql> show warnings;
mysql> select /*+ READ_FROM_STORAGE(TIKV[t]) */ count(*) from test.t partition (p1M);
+----------+
| count(*) |
+----------+
| 8 |
+----------+

mysql> select /*+ READ_FROM_STORAGE(TIFLASH[t]) */ count(*) from test.t partition (p1M);
+----------+
| count(*) |
+----------+
| 8 |
+----------+

mysql> show warnings;

mysql> alter table test.t partition by range (a) (partition p0 values less than (500000), partition p500k values less than (1000000), partition p1M values less than (2000000));

mysql> select /*+ READ_FROM_STORAGE(TIFLASH[t]) */ count(*) from test.t partition (p0);
+----------+
| count(*) |
+----------+
| 4 |
+----------+

mysql> show warnings;

mysql> select /*+ READ_FROM_STORAGE(TIFLASH[t]) */ count(*) from test.t partition (p500k);
+----------+
| count(*) |
+----------+
| 4 |
+----------+

mysql> show warnings;

mysql> select /*+ READ_FROM_STORAGE(TIKV[t]) */ count(*) from test.t partition (p0);
+----------+
| count(*) |
+----------+
| 4 |
+----------+

mysql> select /*+ READ_FROM_STORAGE(TIKV[t]) */ count(*) from test.t partition (p500k);
+----------+
| count(*) |
+----------+
| 4 |
+----------+

mysql> show warnings;

mysql> select /*+ READ_FROM_STORAGE(TIKV[t]) */ count(*) from test.t partition (p1M);
+----------+
| count(*) |
+----------+
| 8 |
+----------+

mysql> select /*+ READ_FROM_STORAGE(TIFLASH[t]) */ count(*) from test.t partition (p1M);
+----------+
| count(*) |
+----------+
| 8 |
+----------+

## non-partitioned table --> partitioned table
mysql> create table test.t2 (a int primary key, b varchar(255), c int, key (b), key (c,b));
mysql> alter table test.t2 set tiflash replica 1;
mysql> insert into test.t2 select * from test.t;
func> wait_table test t2

mysql> drop table test.t;

>> select tidb_database,tidb_name from system.tables where tidb_database = 'test' and tidb_name = 't2' and is_tombstone = 0
┌─tidb_database─┬─tidb_name─┐
│ test │ t2 │
└───────────────┴───────────┘

mysql> alter table test.t2 partition by hash (a) partitions 3;
mysql> analyze table test.t2;

mysql> explain format='brief' select /*+ READ_FROM_STORAGE(TIFLASH[t2]) */ count(*) from test.t2 partition (p0);
id estRows task access object operator info
StreamAgg 1.00 root funcs:count(Column#16)->Column#4
└─IndexReader 1.00 root partition:p0 index:StreamAgg
└─StreamAgg 1.00 cop[tikv] funcs:count(1)->Column#16
└─IndexFullScan 16.00 cop[tikv] table:t2, index:b(b) keep order:false

mysql> select /*+ READ_FROM_STORAGE(TIFLASH[t2]) */ count(*) from test.t2 partition (p0);
+----------+
| count(*) |
+----------+
| 5 |
+----------+

mysql> show warnings;

mysql> select /*+ READ_FROM_STORAGE(TIFLASH[t2]) */ count(*) from test.t2 partition (p1);
+----------+
| count(*) |
+----------+
| 6 |
+----------+

mysql> show warnings;

mysql> select /*+ READ_FROM_STORAGE(TIKV[t2]) */ count(*) from test.t2 partition (p0);
+----------+
| count(*) |
+----------+
| 5 |
+----------+

mysql> select /*+ READ_FROM_STORAGE(TIKV[t2]) */ count(*) from test.t2 partition (p1);
+----------+
| count(*) |
+----------+
| 6 |
+----------+

mysql> show warnings;


mysql> select /*+ READ_FROM_STORAGE(TIKV[t2]) */ count(*) from test.t2 partition (p2);
+----------+
| count(*) |
+----------+
| 5 |
+----------+

mysql> select /*+ READ_FROM_STORAGE(TIFLASH[t2]) */ count(*) from test.t2 partition (p2);
+----------+
| count(*) |
+----------+
| 5 |
+----------+

mysql> show warnings;

mysql> drop table test.t2;

89 changes: 89 additions & 0 deletions tests/fullstack-test2/ddl/remove_partitioning.test
@@ -0,0 +1,89 @@
# Copyright 2023 PingCAP, Ltd.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


## partition_table --> non-partitioned table

mysql> drop table if exists test.t;
mysql> create table test.t (a int primary key, b varchar(255), c int, key (b), key (c,b)) partition by range (a) (partition p0 values less than (1000000), partition p1M values less than (2000000));
mysql> analyze table test.t;
mysql> alter table test.t set tiflash replica 1;

mysql> insert into test.t values (1,"1",-1);
mysql> insert into test.t select a+1,a+1,-(a+1) from test.t;
mysql> insert into test.t select a+2,a+2,-(a+2) from test.t;
mysql> insert into test.t select a+500000,a+500000,-(a+500000) from test.t;
mysql> insert into test.t select a+1000000,a+1000000,-(a+1000000) from test.t;

func> wait_table test t

# check table info in tiflash
>> select tidb_database,tidb_name from system.tables where tidb_database = 'test' and tidb_name = 't' and is_tombstone = 0
┌─tidb_database─┬─tidb_name─┐
│ test │ t │
└───────────────┴───────────┘

mysql> select /*+ READ_FROM_STORAGE(TIKV[t]) */ count(*) from test.t partition (p0);
+----------+
| count(*) |
+----------+
| 8 |
+----------+

mysql> show warnings;
mysql> select /*+ READ_FROM_STORAGE(TIFLASH[t]) */ count(*) from test.t partition (p0);
+----------+
| count(*) |
+----------+
| 8 |
+----------+

mysql> show warnings;
mysql> select /*+ READ_FROM_STORAGE(TIKV[t]) */ count(*) from test.t partition (p1M);
+----------+
| count(*) |
+----------+
| 8 |
+----------+

mysql> select /*+ READ_FROM_STORAGE(TIFLASH[t]) */ count(*) from test.t partition (p1M);
+----------+
| count(*) |
+----------+
| 8 |
+----------+

mysql> show warnings;

mysql> alter table test.t remove partitioning;

mysql> select /*+ READ_FROM_STORAGE(TIFLASH[t]) */ count(*) from test.t;
+----------+
| count(*) |
+----------+
| 16 |
+----------+

mysql> show warnings;

mysql> select /*+ READ_FROM_STORAGE(TIKV[t]) */ count(*) from test.t;
+----------+
| count(*) |
+----------+
| 16 |
+----------+

mysql> show warnings;

mysql> drop table test.t;
12 changes: 9 additions & 3 deletions tests/run-test.sh
@@ -35,9 +35,15 @@ function get_elapse_s()
     # Another way, the time part may start with 0, which means
     # it will be regarded as oct format, use "10#" to ensure
     # calculateing with decimal
-    if [ "$end_nanos" -lt "$start_nanos" ];then
-        end_s=$(( 10#$end_s - 1 ))
-        end_nanos=$(( 10#$end_nanos + 10**9 ))
+    if [ "$end_nanos" = "N" -a "N" = "$start_nanos" ];then
+        # MacOS does not support '%N' output_fmt in date...
+        end_nanos=0
+        start_nanos=0
+    else
+        if [ "$end_nanos" -lt "$start_nanos" ];then
+            end_s=$(( 10#$end_s - 1 ))
+            end_nanos=$(( 10#$end_nanos + 10**9 ))
+        fi
     fi
 
     elapse_s=$(( 10#$end_s - 10#$start_s )).`printf "%03d\n" $(( (10#$end_nanos - 10#$start_nanos)/10**6 ))`