Provide a DM Upstream Simulator for Testing Incremental Replication from Upstream #4835

dsdashun · 2022-03-10T04:48:48Z

Is your feature request related to a problem?

Currently, when testing incremental replication in DM, we need the upstream to generate binlog streams continuously, so we need a tool that continuously simulates upstream workloads. The current solutions are not very convenient (see the alternatives discussed below).

Describe the feature you'd like

DM can provide a simulator with the following features, which address the problems described in the alternatives section:

It can continuously apply meaningful modifications to upstream tables, given the table schemas.
It makes it easy to define a specific workload.
It can apply a batch of DDL changes to several sharded tables with a single command.
After a table schema change, the simulation uses the latest table structure.
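The first and last features above can be sketched roughly as follows. This is only an illustration, not the design of the proposed simulator: the `TableSim` class, its methods, the `members` table, and the convention that `id` is the primary key are all hypothetical. The idea is that the generator remembers which primary keys exist, so every UPDATE or DELETE matches a real row, and it refreshes its in-memory schema when a DDL is applied.

```python
import random
from dataclasses import dataclass, field

# Hypothetical mapping from the sketch's type names to SQL column types.
SQL_TYPES = {"int": "INT", "varchar": "VARCHAR(64)"}


@dataclass
class TableSim:
    """Hypothetical per-table state: the schema plus the keys known to exist."""
    name: str
    columns: dict  # column name -> "int" or "varchar"; "id" is assumed to be the primary key
    live_ids: list = field(default_factory=list)
    next_id: int = 1
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def _value(self, col_type):
        # Render a random SQL literal for the two supported column types.
        if col_type == "int":
            return str(self.rng.randint(0, 1000))
        return "'" + "".join(self.rng.choice("abcdef") for _ in range(6)) + "'"

    def insert(self):
        row_id = self.next_id
        self.next_id += 1
        self.live_ids.append(row_id)
        cols = [c for c in self.columns if c != "id"]
        values = [str(row_id)] + [self._value(self.columns[c]) for c in cols]
        return (f"INSERT INTO {self.name} ({', '.join(['id'] + cols)}) "
                f"VALUES ({', '.join(values)})")

    def update(self):
        # Pick a key that is known to exist, so the WHERE clause matches a row.
        row_id = self.rng.choice(self.live_ids)
        col = self.rng.choice([c for c in self.columns if c != "id"])
        return (f"UPDATE {self.name} SET {col} = {self._value(self.columns[col])} "
                f"WHERE id = {row_id}")

    def delete(self):
        row_id = self.live_ids.pop(self.rng.randrange(len(self.live_ids)))
        return f"DELETE FROM {self.name} WHERE id = {row_id}"

    def add_column(self, col, col_type):
        # Refresh the in-memory schema so later DMLs use the new structure.
        self.columns[col] = col_type
        return f"ALTER TABLE {self.name} ADD COLUMN {col} {SQL_TYPES[col_type]}"


sim = TableSim("members", {"id": "int", "name": "varchar", "age": "int"})
stmts = [sim.insert(), sim.update(), sim.add_column("score", "int"), sim.insert()]
for s in stmts:
    print(s)
```

The second INSERT includes the new `score` column without any code change, which is the behavior the last feature asks for. A real simulator would presumably read the schema from the upstream instead of keeping a hand-written dictionary.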

Describe alternatives you've considered

Usually, we simulate upstream workloads either with benchmark tools such as sysbench or with random SQL generators such as sql-smith. However, these are inconvenient in several cases:

With benchmark tools, the table schemas are predefined. If we need binlog streams from upstream clusters with specific table schemas, we have to modify the tool's code.
With a random SQL generator, we can define the table schemas ourselves. However, when the generated SQL is executed on the upstream, usually no data is actually modified, because the filter clauses are generated purely at random. So it cannot provide a stable stream of binlogs from the upstream.
Some workloads are hard to simulate with existing tools. For example, to simulate a transaction that inserts a row into table A, then updates several records in table B, and finally deletes the inserted row, we have to write our own code, because the existing tools can hardly do this.
If the upstream clusters contain several sharded tables, there is no way to apply DDLs to the whole set of sharded tables with one command.
There is no way to simulate DMLs that follow an online table schema change. For example, we first let the upstream generate binlog streams against the current table structure, then run a DDL on the upstream to change the table structure. After that, we expect the upstream to simulate DMLs using the latest table structure on the fly, without modifying any code.
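To make two of the cases above concrete, here is a minimal sketch of the insert/update/delete transaction and of applying one DDL to a set of sharded tables. The function names, the table names, and the `note`, `counter`, and `remark` columns are assumptions made up for this illustration; the code only builds SQL strings and does not connect to any database.

```python
def insert_update_delete_txn(table_a, table_b, new_id, b_ids):
    """Build one transaction: insert into A, update rows in B, delete the inserted row."""
    stmts = ["BEGIN"]
    # The `note` and `counter` columns are hypothetical.
    stmts.append(f"INSERT INTO {table_a} (id, note) VALUES ({new_id}, 'tmp')")
    for b_id in b_ids:
        stmts.append(f"UPDATE {table_b} SET counter = counter + 1 WHERE id = {b_id}")
    stmts.append(f"DELETE FROM {table_a} WHERE id = {new_id}")
    stmts.append("COMMIT")
    return stmts


def batch_ddl(shards, ddl_template):
    """Expand one DDL template into one statement per sharded table."""
    return [ddl_template.format(table=shard) for shard in shards]


txn = insert_update_delete_txn("table_a", "table_b", new_id=42, b_ids=[1, 2, 3])
shards = [f"shard_db_{i}.orders" for i in range(1, 4)]
ddls = batch_ddl(shards, "ALTER TABLE {table} ADD COLUMN remark VARCHAR(64)")

for stmt in txn + ddls:
    print(stmt)
```

In a real simulator, the statements would be executed against the upstream MySQL instances, and the list of shards would come from configuration rather than being hard-coded.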

Teachability, Documentation, Adoption, Migration Strategy

No response

ref #4835

dsdashun added the area/dm (Issues or PRs related to DM) and type/feature (Issues about a new feature) labels Mar 10, 2022

dsdashun mentioned this issue Mar 10, 2022

simulator(dm): implement some core components #4838

Merged

ti-chi-bot pushed a commit that referenced this issue Apr 25, 2022

simulator(dm): implement some core components (#4838)

06f54d0

ref #4835
