Feature description
I'd like to keep my DDL in dbt, and use dbt conveniences (such as loops) to create tables. But I also want this to integrate nicely with the dbt flow (particularly with materialised tables, etc.).
One example problem I've faced a number of times: I use AWS's Data Migration Service (DMS) to replicate data from PostgreSQL into Redshift. DMS doesn't create particularly good tables by default, so usually you'd create them before starting the DMS task. With sharded tables and databases, that can mean a lot of tables! It'd be great to define a table creation "template" in dbt, and then use it in an arbitrary Jinja/Python loop.
Weird idea (note, coming from a dbt beginner):
project/tables/history.sql:

create table {{ schema }}.{{ table }} (
    id integer primary key,
    title varchar(100) not null encode zstd,
    date timestamp with time zone not null encode zstd
);
Then, in a sensible place for table creation, I'd make a little loop to create tables from the history template, with names like history_1 and history_2 (two shards).
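A rough sketch of what I imagine that loop rendering, with the history template inlined for illustration (the mechanism for pulling the template into the loop is the hypothetical part):

-- sketch only: schema would come from the surrounding context
{% for shard in [1, 2] %}
    create table {{ schema }}.history_{{ shard }} (
        id integer primary key,
        title varchar(100) not null encode zstd,
        date timestamp with time zone not null encode zstd
    );
{% endfor %}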
Perhaps for materialisations, one of the config parameters might be which table template to use?
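For illustration, a model file might then start with something like this (table_template is an entirely made-up parameter, just to show the idea; materialized is the existing dbt config):

-- table_template is hypothetical, purely to illustrate the suggestion
{{ config(materialized='table', table_template='history') }}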
Who will this benefit?
Redshift users (and perhaps users of other backends that support similar DDL extensions) who want to create tables appropriately optimised for their data.
hey @jdub - I think "source" table creation falls under the auspices of the E and L in ELT, and I'm not certain that creating source tables fits into the dbt flow! dbt is built around repeatability: every dbt user should be able to build the whole project from scratch. I'm not sure how that notion would translate into one-time operations like building source tables.
So, what I can imagine is dbt just running arbitrary jinja-sql scripts for you with some sort of context (think: vars, env_vars, a profile, etc). We're tracking that kind of thing over here: #582
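Roughly, I'm picturing something you could write like this (just a sketch of the shape, not a committed design; where the file lives and how it gets invoked are exactly what #582 is tracking):

-- a standalone jinja-sql script, run outside the model DAG
{% for shard in range(1, var('history_shards', 2) + 1) %}
    create table {{ env_var('DMS_SCHEMA') }}.history_{{ shard }} (
        id integer primary key,
        title varchar(100) not null encode zstd,
        date timestamp with time zone not null encode zstd
    );
{% endfor %}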
I'm very ok with the idea of folks using dbt to do exactly what you're describing, but I'd hesitate to make it a first-class part of the tool. Lmk what you think about that!
This might be related to #901 and #1016.