distributed_table
materialization doesn't guarantee idempotency (with respect to table structure)
#333
Labels
bug
Describe the bug and steps to reproduce:
In dbt, the default behavior of the
table
materialization is to drop and recreate the table. If I run the same dbt model twice, the second run drops the table and recreates it. Unfortunately, this is not the case for the
distributed_table
materialization. Here is how I tested it:
1. Model SQL file
2. First dbt run
Even though the model query above is simple, many queries run in the background. I have attached some of the important steps that happened in the background:
3. Result of the first run
The result of
SHOW CREATE TABLE dp_rankingdb_dbt_dev.jw_local
is what I expected and intended:
Of course, the result of
SHOW CREATE TABLE dp_rankingdb_dbt_dev.jw
is also what I expected.
4. Second dbt run
This time, the queries executed in the background were slightly different:
It ran successfully, but the result of
SHOW CREATE TABLE dp_rankingdb_dbt_dev.jw_local
is not what I expected:
As you can see here, the path of the table has changed to
dp_rankingdb_dbt_dev.jw_local__dbt_backup
!! I found out that this happened because of the
EXCHANGE TABLES ..
query above.
Note: If you manually run (not via dbt run)
SHOW CREATE TABLE dp_rankingdb_dbt_dev.jw_local__dbt_backup
before
drop table if exists dp_rankingdb_dbt_dev.jw_local__dbt_backup ON CLUSTER "dp" SYNC
, you will see that the ReplicatedReplacingMergeTree path is
jw_local
instead of
jw_local__dbt_backup
. They have literally been exchanged!
You might ask, so what?
... This leads to a problem in the next run (the 3rd dbt run) because
dbt run
will invoke the query below again in the next run:
and it raised an error:
This happened because
drop table if exists dp_rankingdb_dbt_dev.jw_local__dbt_backup ON CLUSTER "dp" SYNC
deletes
jw_local
instead of
jw_local__dbt_backup
(because they are interchanged!), so the path of
jw_local__dbt_backup
still exists when dbt tries to create
dp_rankingdb_dbt_dev.jw_local__dbt_backup
again...
I checked the code in
dbt/include/clickhouse/macros/materializations/distributed_table.sql
and found out that this is somewhat related to backup table creation, but I think this kind of behavior is not well aligned with the default behavior of materializations in dbt.
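To make the sequence above easier to follow, here is a toy Python model of it. It is only a sketch, not the adapter's code and not ClickHouse itself. It assumes what the runs above suggest: the replica path is fixed from the table name at creation time, EXCHANGE TABLES swaps the names but not the paths, and each later run does "create backup, exchange, drop backup". The ToyCluster class, the dbt_run function, and the /clickhouse/tables/ path layout are hypothetical names used only for illustration.

```python
class ToyCluster:
    """Toy stand-in for a cluster: tracks table names and replica paths."""

    def __init__(self):
        self.tables = {}         # table name -> replica path
        self.registered = set()  # replica paths currently registered

    def create(self, name):
        # Assumption: the path is derived from the name once, at creation.
        path = f"/clickhouse/tables/{name}"  # hypothetical layout
        if path in self.registered:
            raise RuntimeError(f"replica path {path} already exists")
        self.registered.add(path)
        self.tables[name] = path

    def exchange(self, a, b):
        # Names are swapped; each table keeps the path it was created with.
        self.tables[a], self.tables[b] = self.tables[b], self.tables[a]

    def drop_if_exists(self, name):
        # Dropping by name releases whatever path that name currently holds.
        path = self.tables.pop(name, None)
        if path is not None:
            self.registered.discard(path)


def dbt_run(cluster, local="jw_local"):
    """Simplified version of the per-run sequence described in this issue."""
    backup = f"{local}__dbt_backup"
    if local not in cluster.tables:
        cluster.create(local)      # first run: plain create
        return
    cluster.create(backup)         # later runs: build the new table as backup
    cluster.exchange(backup, local)
    cluster.drop_if_exists(backup)


cluster = ToyCluster()
dbt_run(cluster)  # run 1: jw_local holds the jw_local path
dbt_run(cluster)  # run 2: jw_local now holds the jw_local__dbt_backup path

third_run_error = None
try:
    dbt_run(cluster)  # run 3: creating the backup table collides
except RuntimeError as exc:
    third_run_error = exc

print(cluster.tables)
print(third_run_error)
```

In this model the third run fails at the create step, because the path that the new backup table would use is still held by the table that is now named jw_local.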
Configuration
Environment
ClickHouse server
CREATE TABLE
statements for tables involved: