Custom table DDL integrated with the dbt flow #1017

jdub · 2018-09-20T02:06:05Z

Feature description

I'd like to keep my DDL in dbt, and use dbt conveniences (loops) to create tables. But, I also want this to integrate nicely with the dbt flow (particularly with materialised tables, etc).

One example problem I've faced a number of times: I use AWS's Data Migration Service (DMS) to replicate data from PostgreSQL into Redshift. DMS doesn't create particularly good tables by default, so usually you'd create them before starting the DMS task. With sharded tables and databases, that can mean a lot of tables! It'd be great to define a table creation "template" in dbt, and then use it in an arbitrary Jinja/Python loop.

Weird idea (note, coming from a dbt beginner):

project/tables/history.sql =

create table {{ schema }}.{{ table }} (
  id integer primary key,
  title varchar(100) not null encode zstd,
  date timestamp with time zone not null encode zstd
);

Then in a sensible place for table creation, I'd make a little loop to create tables with the history template with names like, history_1, history_2 (two shards).

Perhaps for materialisations, one of the config parameters might be which table template to use?

Who will this benefit?

Redshift users (perhaps other backends if they support similar extensions in DDL) who want to create tables appropriately optimised for their data.

This might be related to #901 and #1016.

The text was updated successfully, but these errors were encountered:

drewbanin · 2018-09-20T13:27:03Z

hey @jdub - I think "source" table creation falls under the auspices of the E and L in ELT, and I'm not certain that creating source tables does fit into the dbt flow! dbt is built around repeatability and every dbt user can build the whole project from scratch. I'm not sure how that notion would translate into one-time operations like building source tables.

So, what I can imagine is dbt just running arbitrary jinja-sql scripts for you with some sort of context (think: vars, env_vars, a profile, etc). We're tracking that kind of thing over here: #582

I'm very ok with the idea of folks using dbt to do exactly what you're describing, but I'd hesitate to make it a first-class part of the tool. Lmk what you think about that!

jdub · 2018-09-21T10:09:21Z

Yeah, I think the operation concept is a good fit – I'm going to close this ticket in favour of it.

jdub closed this as completed Sep 21, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Custom table DDL integrated with the dbt flow #1017

Custom table DDL integrated with the dbt flow #1017

jdub commented Sep 20, 2018

drewbanin commented Sep 20, 2018

jdub commented Sep 21, 2018

Custom table DDL integrated with the dbt flow #1017

Custom table DDL integrated with the dbt flow #1017

Comments

jdub commented Sep 20, 2018

Feature description

Who will this benefit?

drewbanin commented Sep 20, 2018

jdub commented Sep 21, 2018