generate_model_yaml gets the data_type through these macros:

```sql
{% macro data_type_format_model(column) -%}
  {{ return(adapter.dispatch('data_type_format_model', 'codegen')(column)) }}
{%- endmacro %}

{# format a column data type for a model #}
{% macro default__data_type_format_model(column) %}
  {% set formatted = codegen.format_column(column) %}
  {{ return(formatted['data_type'] | lower) }}
{% endmacro %}
```
Describe the bug
BigQuery supports arrays as REPEATED fields
Running generate_model_yaml for a model with a repeated field of a certain datatype will result in a schema with data_type: datatype rather than data_type: array<datatype>
Running the model again with contract.enforced=true will show the error:
Repeated records should have data_type: array
Steps to reproduce
create a model_with_repeated_field.sql:
run it
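The model file itself is not shown above. As a hypothetical stand-in (the column names id and tags are invented for illustration), any BigQuery model that selects an array should produce a REPEATED field, for example:

```sql
-- model_with_repeated_field.sql
-- Hypothetical reproduction model; the column names are illustrative only.
select
    1 as id,
    ['a', 'b', 'c'] as tags  -- ARRAY<STRING>, which BigQuery stores as a REPEATED STRING field
```

With a model like this, generate_model_yaml would be expected to report data_type: string for tags rather than data_type: array<string>, matching the behavior described above.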
Expected results
Actual results
Screenshots and log output
Running the model with the following yml:
System information
Which database are you using dbt with?
The output of dbt --version:
The operating system you're using: macOS Sequoia Version 15.1
The output of python --version: Python 3.10.15
Additional context
generate_model_yaml gets the data_type using data_type_format_model, which in turn calls codegen.format_column.
format_column is vendored in macros/vendored/dbt_core/format_column.sql, but it doesn't handle the specific case of repeated fields.
I am tempted to create a default__format_column and a bigquery__format_column to handle BigQuery specifically, moving it all into helpers.sql while removing the vendored format_column.
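A rough sketch of what this first option could look like is below. It is not the final implementation. It assumes the default body mirrors the vendored format_column (a dict with name, data_type, and formatted keys) and that BigQuery column objects expose a mode attribute, as the column.mode reference in this issue suggests. Nested RECORD/STRUCT fields are not handled here.

```sql
{# Dispatching wrapper so adapters can override the formatting #}
{% macro format_column(column) -%}
  {{ return(adapter.dispatch('format_column', 'codegen')(column)) }}
{%- endmacro %}

{# Assumed to match the behavior of the currently vendored macro #}
{% macro default__format_column(column) -%}
  {% set data_type = column.dtype %}
  {% set formatted = column.column.lower() ~ " " ~ data_type %}
  {{ return({'name': column.name, 'data_type': data_type, 'formatted': formatted}) }}
{%- endmacro %}

{# BigQuery: wrap the element type in array<...> when the field is REPEATED #}
{% macro bigquery__format_column(column) -%}
  {# Assumption: column.mode is one of NULLABLE, REQUIRED, REPEATED #}
  {% set is_repeated = (column.mode | upper) == 'REPEATED' %}
  {% set data_type = 'array<' ~ column.dtype ~ '>' if is_repeated else column.dtype %}
  {% set formatted = column.column.lower() ~ " " ~ data_type %}
  {{ return({'name': column.name, 'data_type': data_type, 'formatted': formatted}) }}
{%- endmacro %}
```

The appeal of this variant is that data_type_format_model and data_type_format_source would not need to change, since the array wrapping happens before they lowercase the type.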
Alternatively, bigquery__format_column could return a 'mode': column.mode field, and the adapter-specific datatype conversion would be taken care of in bigquery__data_type_format_model and bigquery__data_type_format_source.
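The alternative could be sketched roughly as follows. It is untested and rests on the same assumptions: the returned dict keeps the name, data_type, and formatted keys of the vendored macro, and column.mode is available on BigQuery columns. Only the model variant is shown; bigquery__data_type_format_source would presumably follow the same pattern.

```sql
{# BigQuery: pass the column mode through and leave conversion to the callers #}
{% macro bigquery__format_column(column) -%}
  {% set formatted = column.column.lower() ~ " " ~ column.dtype %}
  {{ return({
      'name': column.name,
      'data_type': column.dtype,
      'mode': column.mode,
      'formatted': formatted
  }) }}
{%- endmacro %}

{# BigQuery override of the model formatter: wrap REPEATED fields in array<...> #}
{% macro bigquery__data_type_format_model(column) -%}
  {% set formatted = codegen.bigquery__format_column(column) %}
  {% set data_type = formatted['data_type'] | lower %}
  {% if (formatted['mode'] | upper) == 'REPEATED' %}
    {{ return('array<' ~ data_type ~ '>') }}
  {% endif %}
  {{ return(data_type) }}
{%- endmacro %}
```

Compared with the first option, this keeps format_column free of type rewriting, at the cost of repeating the REPEATED check in each BigQuery-specific formatter.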
Are you interested in contributing the fix?
Yes. I would like some input on whether the proposed solution in Additional context is reasonable.