[CT-2670] [Bug] Undocumented contract behavior/feature causing unexpected results #7824
Thank you for opening this thorough issue @ttusing! Agreed, the documentation under Model Contracts doesn't mention the breaking-change detection functionality. We have documented it in the reference documentation for the contract config, but we should also highlight it where we explain model contracts conceptually, since it's a big piece of the overall motivation. I've opened an issue for that here: dbt-labs/docs.getdbt.com#3496
Possible Workaround(s)

With what's currently available in dbt, a safer workaround would be to:

Is there a better way?

To streamline this workflow, we could consider a new configuration on the In general I do think something like
Precision and scale aren't part of the contract -- for data types that have precision (and scale), just the high-level data type is part of the contract. There is some documentation on that in the reference docs:
Thanks! This is really helpful.

One feature idea to help this workflow, which might be more in line and which I think I've seen discussed elsewhere (or might already be true): when using model versions, have the latest model version materialize in the database under the usual model name instead of a model version suffix. In that case, we could increment the model version with changes.

With regards to precision/scale, when using

For some product feedback reference: we are looking at using contracts for our staging models. The primary motivation is that we'd like to ensure that input into our project has stable data types, and to enforce that the current columns in the model YAML match the SQL/database.

I think we are going to pause on implementing this since it is taking a bit more effort than expected/budgeted. I am definitely going to keep an eye on this and contribute as I have time ✨. Contracts are a very powerful feature.
Great conversation so far :)
@MichelleArk I agree! @ttusing In your original post, you said:
Given your most recent reply, is it fair to say that you'd be more willing to use model versioning if we made it easier (automatic) to always have the latest version deployed into the unversioned/unsuffixed namespace? (That's this issue: #7442. There is a documented workaround in the meantime that achieves the same behavior using a custom macro +
I take your point to be: if I specify

models:
  - name: my_model
    config:
      contract:
        enforced: true
    columns:
      - name: my_decimal_column
        data_type: integer

and the model is

select 1.1 as my_decimal_column

After quickly looking into this: Snowflake's cursor reports back both data types as FIXED:

ipdb> sql
'select * from (\n select 1.1 as my_decimal_column\n ) as __dbt_sbq\n where false\n limit 0\n'
ipdb> cursor.description
[ResultMetadata(name='MY_DECIMAL_COLUMN', type_code=0, display_size=None, internal_size=None, precision=2, scale=1, is_nullable=False)]
ipdb> columns
[SnowflakeColumn(column='MY_DECIMAL_COLUMN', dtype='FIXED', char_size=None, numeric_precision=None, numeric_scale=None)]
...
ipdb> sql
'select * from (\n select\n cast(null as integer) as my_decimal_column\n ) as __dbt_sbq\n where false\n limit 0\n'
ipdb> cursor.description
[ResultMetadata(name='MY_DECIMAL_COLUMN', type_code=0, display_size=None, internal_size=None, precision=38, scale=0, is_nullable=True)]
ipdb> columns
[SnowflakeColumn(column='MY_DECIMAL_COLUMN', dtype='FIXED', char_size=None, numeric_precision=None, numeric_scale=None)]

Is there some way for us to reliably detect that difference, without splitting other hairs? By contrast, other data platforms give us back genuinely different data type codes/names, e.g.
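The detection gap above can be illustrated with a small, self-contained sketch (plain Python, not dbt internals; `ColumnMeta` and its fields are stand-ins for the adapter's column metadata, not real dbt classes). Comparing only the coarse dtype treats the two FIXED columns from the transcript as equal, while a comparison that also reads the cursor's precision/scale metadata distinguishes them:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMeta:
    # Illustrative stand-in for adapter column metadata (not dbt's SnowflakeColumn).
    name: str
    dtype: str       # coarse type code, e.g. "FIXED"
    precision: int
    scale: int


def same_high_level_type(a: ColumnMeta, b: ColumnMeta) -> bool:
    """Coarse comparison: only the high-level dtype, ignoring precision/scale."""
    return a.dtype == b.dtype


def same_exact_type(a: ColumnMeta, b: ColumnMeta) -> bool:
    """Stricter comparison that also checks precision and scale."""
    return (a.dtype, a.precision, a.scale) == (b.dtype, b.precision, b.scale)


# `select 1.1 as my_decimal_column` -> FIXED with precision=2, scale=1 (per the transcript)
model_col = ColumnMeta("MY_DECIMAL_COLUMN", "FIXED", precision=2, scale=1)
# `cast(null as integer)` -> FIXED with precision=38, scale=0
contract_col = ColumnMeta("MY_DECIMAL_COLUMN", "FIXED", precision=38, scale=0)

print(same_high_level_type(model_col, contract_col))  # True: looks like a match
print(same_exact_type(model_col, contract_col))       # False: precision/scale differ
```

The "splitting other hairs" concern is exactly the second function: once precision/scale are compared, two columns that most users consider the same type (e.g. number(38,0) vs an explicit integer) would also start to differ.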
Yeah, we want to keep table names stable. We also have tooling that assumes that dbt model names match Snowflake table names.
I looked into this a little more. I think it would be useful to update the documentation to explain this behavior a bit more. When specifying the Snowflake data type for numerics (and I suppose varchars), you can specify it in the contract using the expected database syntax, i.e.

tl;dr: is specifying precision and scale in this data_type property like this a supported feature? If so, I think updating the documentation on how to use it, and on the consequences of not specifying it, would be helpful to users. It was not entirely obvious to me that dbt would cast my numeric to the default Snowflake numeric precision and scale when I specified that the field was
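As a sketch of the syntax being described (assuming the Snowflake adapter passes the data_type string through to the cast; the model and column names are illustrative):

```yaml
models:
  - name: my_model
    config:
      contract:
        enforced: true
    columns:
      - name: my_decimal_column
        # precision and scale spelled out in database-native syntax,
        # rather than a bare "number" (which would get Snowflake's
        # defaults of precision 38, scale 0)
        data_type: number(38, 2)
```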
I opened #8028 without realising that this one already existed. Worth noting that there's an easier workaround from jerco here: #8028 (comment) In short, you can bump the version but not use any other of the versioning constructs. To quote the issue:
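A minimal sketch of that "bump the version, skip the rest" workaround, assuming the standard model-versioning keys; exact relation-naming behavior may vary, and the model name is illustrative:

```yaml
models:
  - name: my_model
    latest_version: 2      # bumped when the contract changes
    config:
      contract:
        enforced: true
    versions:
      - v: 2
        defined_in: my_model   # keep the existing file; no separate v2 copy
```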
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue, or else it will be closed in 7 days.

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.
Is this a new bug in dbt-core?
Current Behavior
I am submitting this as a bug and looking for advice on where to follow up.
There seems to be an undocumented feature when using contracts. When deferring to a previous run, changing contracted datatypes or disabling a contract throws an error and requires the user to use model versioning.
In my workflow, we have dbt Cloud running CI and needed to change a data type. We got the following error unexpectedly:
This feature is not described anywhere. In order to address this, we turned off deferring to previous runs and are disabling our implemented contracts until behavior better matches documentation and we can decide if the feature is appropriate for us.
We don't want to use model versioning, and ideally this specific behavior of changed contracts throwing errors could be turned off in the project config. I am unsure what a good workflow would be for using contracts that need to change occasionally in a traditional dbt Cloud CI setup.
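To make the request concrete, a project-level switch might look like the following. This is hypothetical: no such key exists in dbt today, and the flag name is invented purely for illustration:

```yaml
# dbt_project.yml -- HYPOTHETICAL: this flag is not a real dbt config
models:
  my_project:
    staging:
      +contract:
        enforced: true
        # invented key: allow contracted data types to change
        # without requiring model versioning
        raise_on_breaking_change: false
```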
I also believe more documentation could include guidance on adding precision and scale for Snowflake datatypes.
Expected Behavior
Contract enforcement behavior matches the documentation page. Perhaps this could be added to the section "dbt will do these things differently".
https://docs.getdbt.com/docs/collaborate/govern/model-contracts
Steps To Reproduce
Relevant log output
No response
Environment
Which database adapter are you using with dbt?
snowflake
Additional Context
No response