implement type_boolean macro #5875
@@ -59,7 +59,7 @@ The TIMESTAMP_* variation associated with TIMESTAMP is specified by the TIMESTAM
 {{ return(api.Column.translate_type("float")) }}
 {% endmacro %}

-{# numeric ------------------------------------------------ #}
+{# numeric ------------------------------------------------- #}
every other comment had this many dashes 🙃
@jpmmcneill this is looking good to me! 🤩 Here are the next steps for this PR:
I'll add some further instructions on your PR for dbt-bigquery on how to point at this PR branch for the purposes of automated CI tests in GitHub. (We'll want the adapters to be using your PR branch rather than the Once all of the PRs are passing the tests in GitHub Actions, we will merge dbt-core first. Then we will run CI once more for each adapter (pointing back at
seeds__expected_csv = """boolean_col
True
""".lstrip()

models__actual_sql = """
select cast('True' as {{ type_boolean() }}) as boolean_col
"""
Probably want to cover both boolean options at the very least.
-seeds__expected_csv = """boolean_col
-True
-""".lstrip()
-models__actual_sql = """
-select cast('True' as {{ type_boolean() }}) as boolean_col
-"""
+seeds__expected_csv = """boolean_col
+True
+False
+""".lstrip()
+models__actual_sql = """
+select cast('True' as {{ type_boolean() }}) as boolean_col
+union all
+select cast('False' as {{ type_boolean() }}) as boolean_col
+"""
Even better
But test cases covering the truth tables for conjunction, disjunction, and negation would be even better. That way, we are verifying that everything is acting like booleans.
Something like the following untested code:
# Every combination of the two boolean inputs.
seeds__boolean_permutations_csv = """
x,y
False,False
True,False
False,True
True,True
""".lstrip()
# Expected truth table for AND, OR, and NOT over those inputs.
seeds__expected_csv = """
x,y,conjunction,disjunction,negation_x
False,False,False,False,True
True,False,False,True,False
False,True,False,True,True
True,True,True,True,False
""".lstrip()
models__actual_sql = """
select
    x,
    y,
    x and y as conjunction,
    x or y as disjunction,
    not x as negation_x
from {{ ref("boolean_permutations") }}
"""
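One way to sanity-check the expected seed above is to derive it from Python's own boolean operators rather than typing it by hand. This is only a minimal sketch: `build_truth_table_csv` is a hypothetical helper (not part of dbt or its test suite), and it assumes the same column names and row order as the seeds in this comment.

```python
import itertools


def build_truth_table_csv() -> str:
    """Hypothetical helper: render the AND/OR/NOT truth table as CSV text."""
    rows = ["x,y,conjunction,disjunction,negation_x"]
    # y is the outer loop so that x varies fastest, matching the seed order.
    for y, x in itertools.product([False, True], repeat=2):
        values = (x, y, x and y, x or y, not x)
        rows.append(",".join(str(value) for value in values))
    return "\n".join(rows) + "\n"


if __name__ == "__main__":
    print(build_truth_table_csv(), end="")
```

Comparing this string with seeds__expected_csv would catch a mistyped row before the seed ever reaches a warehouse.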
Best? 🤷
Even though BOOLEAN is a data type described in the SQL standard, some databases don't have it (looking at you, SQL Server!).
If we are feeling extra magnanimous, we could change all the True/False values in the seeds to be 1/0 instead.
I'm hoping it wouldn't be necessary, but we could update the models__actual_sql definition so that x/y values are replaced with the following instead:
cast(x as {{ type_boolean() }})
cast(y as {{ type_boolean() }})
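If the seeds did need to move to 1/0, the conversion could be scripted instead of edited by hand. The following is a rough sketch under that assumption; `booleans_to_ints` is a hypothetical helper (nothing like it exists in this PR), and it only rewrites cells that are exactly True or False.

```python
import csv
import io


def booleans_to_ints(seed_csv: str) -> str:
    """Hypothetical helper: replace True/False cells in CSV text with 1/0."""
    mapping = {"True": "1", "False": "0"}
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    for index, row in enumerate(csv.reader(io.StringIO(seed_csv))):
        if index == 0:
            # Leave the header row untouched.
            writer.writerow(row)
        else:
            writer.writerow([mapping.get(cell, cell) for cell in row])
    return output.getvalue()
```

For example, passing the boolean_permutations seed text through this function would yield the same rows with 0 and 1 in place of False and True.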
Update:
The "Even better" example I gave above didn't actually test the type_boolean() macro at all! 😅
Something like this should fix that situation:
models__actual_sql = """
select
    cast(x as {{ type_boolean() }}) as x_bool,
    cast(y as {{ type_boolean() }}) as y_bool,
    cast(x as {{ type_boolean() }}) and cast(y as {{ type_boolean() }}) as conjunction,
    cast(x as {{ type_boolean() }}) or cast(y as {{ type_boolean() }}) as disjunction,
    not cast(x as {{ type_boolean() }}) as negation_x
from {{ ref("boolean_permutations") }}
"""
Cheers @dbeatty10. I'll take a second pass at this later today :)
@jpmmcneill I just realized that your simple pytest for type_boolean() was exactly in line with the rest of the type_x macros.
So I'm approving as-is.
After this PR is merged, then the next steps will be to restore the original dev-requirements.txt files in each of your adapter PRs.
Hey @dbeatty10 - thanks. I personally completely agree with the sentiment around the current level of testing :). Do you agree with me that an issue that basically scopes "improve the current test coverage for data types" would be welcome?
Formally logging where expectations differ from reality is an important mechanism for us. A new issue describing the current level of testing and how that compares to what your expectations were as a contributor (here) and dbt package maintainer (here) would be great! From there, someone from dbt Labs (maybe me!) will triage the issue submission and give feedback, determine urgency, compare to current roadmap and capacity, etc.
Brill, will do. Thanks @dbeatty10 🐐
resolves (partially) #5739
Description
dbt-core component required for #5739.
Other PRs
Checklist
changie new to create a changelog entry