Rework sources to be dbt models rather than manually created #188
Conversation
models/sources/sources.sql (outdated)
cast(null as {{ type_string() }}) as name,
cast(null as {{ type_string() }}) as identifier,
cast(null as {{ type_string() }}) as loaded_at_field,
cast(null as {{ type_json() }}) as freshness
This was an ARRAY type before for Snowflake, which currently leads to errors when I run:
dbt run -m dbt_artifacts
after installing this branch into a project that is already using dbt_artifacts:
10:11:48 Database Error in model sources (models/sources/sources.sql)
10:11:48 002023 (22000): SQL compilation error:
10:11:48 Expression type does not match column data type, expecting ARRAY but got OBJECT for column FRESHNESS
10:11:48 compiled SQL at target/run/dbt_artifacts/models/sources/sources.sql
Hmm, this will be tricky to solve in a way that's compatible with both Snowflake and BigQuery, I think, given that one uses JSON and the other ARRAY:
dbt_artifacts/macros/create_sources_table_if_not_exists.sql
Lines 36 to 66 in 9146d71
{% macro snowflake__get_create_sources_table_if_not_exists_statement(database_name, schema_name, table_name) -%}
    create table {{database_name}}.{{schema_name}}.{{table_name}} (
        command_invocation_id STRING,
        node_id STRING,
        run_started_at TIMESTAMP_TZ,
        database STRING,
        schema STRING,
        source_name STRING,
        loader STRING,
        name STRING,
        identifier STRING,
        loaded_at_field STRING,
        freshness ARRAY
    )
{%- endmacro %}

{% macro bigquery__get_create_sources_table_if_not_exists_statement(database_name, schema_name, table_name) -%}
    create table {{database_name}}.{{schema_name}}.{{table_name}} (
        command_invocation_id STRING,
        node_id STRING,
        run_started_at TIMESTAMP,
        database STRING,
        schema STRING,
        source_name STRING,
        loader STRING,
        name STRING,
        identifier STRING,
        loaded_at_field STRING,
        freshness JSON
    )
{%- endmacro %}
One option would be to create a specific type helper for just this column, but that seems a bit suboptimal IMO. Do you have any ideas @NiallRees?
Maybe we could offer a one-time migration macro for the source tables and cut a new major version with this new method?
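For illustration, such a one-time migration could be invoked with dbt run-operation. The sketch below is hypothetical (the macro name, table location and replacement type are assumptions, and dropping the column discards previously captured freshness values):

{% macro migrate_sources_freshness_column() %}
    {# Hypothetical one-off migration for Snowflake: replace the old ARRAY-typed
       freshness column with the type the new model expects. Adjust database and
       schema to wherever dbt_artifacts writes its sources table. #}
    {% set sources_table = target.database ~ '.' ~ target.schema ~ '.sources' %}
    {% do run_query('alter table ' ~ sources_table ~ ' drop column freshness') %}
    {% do run_query('alter table ' ~ sources_table ~ ' add column freshness ' ~ type_json()) %}
{% endmacro %}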
Think I'd be inclined to use a specific type helper (just doing an if snowflake else in the model vs a macro) rather than adding more complexity to the migration process.
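For reference, the inline conditional being suggested might look roughly like this in the model (a sketch only; the final expression in the PR may differ):

{# Keep the ARRAY type on Snowflake for compatibility with existing tables,
   and fall back to the generic JSON type helper elsewhere. #}
{% if target.type == 'snowflake' %}
    cast(null as ARRAY) as freshness
{% else %}
    cast(null as {{ type_json() }}) as freshness
{% endif %}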
@NiallRees done and ready for another test.
Tested successfully on Snowflake, BigQuery and Databricks when tables already existed.
Closes #153, #190 and unblocks #175, #177
This PR replaces our manual schema generation macros with incremental models that use the on_schema_change configuration, designed to make schema evolution more painless. Of note, it bumps require-dbt-version to >= 1.2.0. Additionally, this removes the dependency on dbt_utils by using a combination of the new cross-database macros in core >= 1.2.0 and a copy/paste of the surrogate_key macro.
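To illustrate the new approach, an incremental model header using on_schema_change looks roughly like this (a minimal sketch; the exact settings and columns in the PR's models may differ):

{{
    config(
        materialized='incremental',
        on_schema_change='append_new_columns'
    )
}}

-- Placeholder column list in the style of the sources model above.
select
    cast(null as {{ type_string() }}) as command_invocation_id,
    cast(null as {{ type_string() }}) as node_id

With on_schema_change set to append_new_columns, dbt adds any new columns to the existing table on a later run instead of requiring a manual migration of the artifact tables.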