Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

case insensitive column names for unit test YML #9467

Closed
2 tasks done
Tracked by #8283
graciegoheen opened this issue Jan 26, 2024 · 2 comments · Fixed by dbt-labs/dbt-adapters#55
Closed
2 tasks done
Tracked by #8283

case insensitive column names for unit test YML #9467

graciegoheen opened this issue Jan 26, 2024 · 2 comments · Fixed by dbt-labs/dbt-adapters#55
Assignees
Labels
bug Something isn't working case_sensitivity Issues related to dbt's case-sensitivity behavior High Severity bug with significant impact that should be resolved in a reasonable timeframe pre-release Bug not yet in a stable release unit tests Issues related to built-in dbt unit testing functionality
Milestone

Comments

@graciegoheen
Copy link
Contributor

graciegoheen commented Jan 26, 2024

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

For snowflake, this unit test definition results in a SQL parsing error:

unit_tests:
  - name: test_valid_email_address # this is the unique name of the test
    description: my favorite unit test
    model: dim_wizards # name of the model I'm unit testing
    given: # the mock data for your inputs
      - input: ref('stg_wizards')
        rows:
          - {wizard_id: "1", email: [email protected],     email_top_level_domain: example.com}
          - {wizard_id: "2", email: [email protected],     email_top_level_domain: unknown.com}
          - {wizard_id: "3", email: badgmail.com,         email_top_level_domain: gmail.com}
          - {wizard_id: "4", email: missingdot@gmailcom,  email_top_level_domain: gmail.com}
      - input: ref('top_level_email_domains')
        rows:
          - {tld: example.com}
          - {tld: gmail.com}
      - input: ref('stg_worlds')
        rows: []
    expect: # the expected output given the inputs above
      rows:
        - {wizard_id: "1", is_valid_email_address: true}
        - {wizard_id: "2", is_valid_email_address: false}
        - {wizard_id: "3", is_valid_email_address: false}
        - {wizard_id: "4", is_valid_email_address: false}
22:42:12 On unit_test.coalesce_ci_demo_2023.dim_wizards.test_valid_email_address: create or replace temporary table DEVELOPMENT.dbt_ggoheen.test_valid_email_address__dbt_tmp
         as
        (select * from (
        with  __dbt__cte__stg_wizards as (

-- Fixture for stg_wizards
select try_cast(null as NUMBER) AS WIZARD_ID, try_cast(null as VARCHAR) AS WIZARD_NAME, try_cast(null as VARCHAR) AS EMAIL, try_cast(null as VARCHAR) AS EMAIL_TOP_LEVEL_DOMAIN, try_cast(null as VARCHAR) AS PHONE_NUMBER, try_cast(null as NUMBER) AS WORLD_ID, 
    try_cast('1' as )
 AS wizard_id, 
    try_cast('[email protected]' as )
 AS email, 
    try_cast('example.com' as )
 AS email_top_level_domain
union all
select try_cast(null as NUMBER) AS WIZARD_ID, try_cast(null as VARCHAR) AS WIZARD_NAME, try_cast(null as VARCHAR) AS EMAIL, try_cast(null as VARCHAR) AS EMAIL_TOP_LEVEL_DOMAIN, try_cast(null as VARCHAR) AS PHONE_NUMBER, try_cast(null as NUMBER) AS WORLD_ID, 
    try_cast('2' as )
 AS wizard_id, 
    try_cast('[email protected]' as )
 AS email, 
    try_cast('unknown.com' as )
 AS email_top_level_domain
union all
select try_cast(null as NUMBER) AS WIZARD_ID, try_cast(null as VARCHAR) AS WIZARD_NAME, try_cast(null as VARCHAR) AS EMAIL, try_cast(null as VARCHAR) AS EMAIL_TOP_LEVEL_DOMAIN, try_cast(null as VARCHAR) AS PHONE_NUMBER, try_cast(null as NUMBER) AS WORLD_ID, 
    try_cast('3' as )
 AS wizard_id, 
    try_cast('badgmail.com' as )
 AS email, 
    try_cast('gmail.com' as )
 AS email_top_level_domain
union all
select try_cast(null as NUMBER) AS WIZARD_ID, try_cast(null as VARCHAR) AS WIZARD_NAME, try_cast(null as VARCHAR) AS EMAIL, try_cast(null as VARCHAR) AS EMAIL_TOP_LEVEL_DOMAIN, try_cast(null as VARCHAR) AS PHONE_NUMBER, try_cast(null as NUMBER) AS WORLD_ID, 
    try_cast('4' as )
 AS wizard_id, 
    try_cast('missingdot@gmailcom' as )
 AS email, 
    try_cast('gmail.com' as )
 AS email_top_level_domain
),  __dbt__cte__stg_worlds as (

-- Fixture for stg_worlds
select try_cast(null as NUMBER) AS WORLD_ID, try_cast(null as VARCHAR) AS WORLD_NAME
    limit 0
),  __dbt__cte__top_level_email_domains as (

-- Fixture for top_level_email_domains
select try_cast(null as VARCHAR) AS TLD, 
    try_cast('example.com' as )
 AS tld
union all
select try_cast(null as VARCHAR) AS TLD, 
    try_cast('gmail.com' as )
 AS tld
), wizards as (

    select * from __dbt__cte__stg_wizards

),

worlds as (

    select * from __dbt__cte__stg_worlds

),

accepted_email_domains as (

    select * from __dbt__cte__top_level_email_domains

),

check_valid_emails as (

    select  
        wizards.wizard_id,
        wizards.wizard_name,
        wizards.email,
        wizards.phone_number,
        wizards.world_id,

		coalesce (regexp_like(
            wizards.email, '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'
        )
        = true
        and accepted_email_domains.tld is not null,
        false) as is_valid_email_address

    from wizards
    left join accepted_email_domains
        on wizards.email_top_level_domain = lower(accepted_email_domains.tld)

)

select
    check_valid_emails.wizard_id,
    check_valid_emails.wizard_name,
    check_valid_emails.email,
    check_valid_emails.is_valid_email_address,
    check_valid_emails.phone_number,
    worlds.world_name
from check_valid_emails
left join worlds
    on check_valid_emails.world_id = worlds.world_id
    ) as __dbt_sbq
    where false
    limit 0

        );
22:42:12 Snowflake adapter: Snowflake query id: 01b1efd2-0804-0b33-000d-37832a1436ee
22:42:12 Snowflake adapter: Snowflake error: 001003 (42000): SQL compilation error:
syntax error line 8 at position 20 unexpected ')'.
syntax error line 10 at position 35 unexpected ')'.
syntax error line 12 at position 30 unexpected ')'.
syntax error line 16 at position 20 unexpected ')'.
syntax error line 18 at position 35 unexpected ')'.
syntax error line 20 at position 30 unexpected ')'.
syntax error line 24 at position 20 unexpected ')'.
syntax error line 26 at position 31 unexpected ')'.
syntax error line 28 at position 28 unexpected ')'.
syntax error line 32 at position 20 unexpected ')'.
syntax error line 34 at position 38 unexpected ')'.
syntax error line 36 at position 28 unexpected ')'.
syntax error line 47 at position 30 unexpected ')'.
syntax error line 51 at position 28 unexpected ')'.
22:42:12 Timing info for unit_test.coalesce_ci_demo_2023.dim_wizards.test_valid_email_address (execute): 22:42:11.707437 => 22:42:12.835004
22:42:12 On unit_test.coalesce_ci_demo_2023.dim_wizards.test_valid_email_address: Close
22:42:12 Runtime Error in unit_test test_valid_email_address (models/marts/unit_tests.yml)
  During unit test execution of dim_wizards::test_valid_email_address, dbt could not build the 'actual' result for comparison against 'expected' given the unit test definition:
   Database Error
    001003 (42000): SQL compilation error:
    syntax error line 8 at position 20 unexpected ')'.
    syntax error line 10 at position 35 unexpected ')'.
    syntax error line 12 at position 30 unexpected ')'.
    syntax error line 16 at position 20 unexpected ')'.
    syntax error line 18 at position 35 unexpected ')'.
    syntax error line 20 at position 30 unexpected ')'.
    syntax error line 24 at position 20 unexpected ')'.
    syntax error line 26 at position 31 unexpected ')'.
    syntax error line 28 at position 28 unexpected ')'.
    syntax error line 32 at position 20 unexpected ')'.
    syntax error line 34 at position 38 unexpected ')'.
    syntax error line 36 at position 28 unexpected ')'.
    syntax error line 47 at position 30 unexpected ')'.
    syntax error line 51 at position 28 unexpected ')'.
22:42:12 1 of 1 ERROR dim_wizards::test_valid_email_address ............................. [ERROR in 1.28s]
22:42:12 Finished running node unit_test.coalesce_ci_demo_2023.dim_wizards.test_valid_email_address

If I use all UPPERCASE for my field names, this error goes away.

Expected Behavior

I should be able to use lowercase filed names in my unit test definition

Which database adapter are you using with dbt?

snowflake

Additional Context

No response

@graciegoheen graciegoheen added bug Something isn't working High Severity bug with significant impact that should be resolved in a reasonable timeframe labels Jan 26, 2024
@MichelleArk
Copy link
Contributor

MichelleArk commented Jan 26, 2024

@jtcohen6 found this when we were doing some initial spiking for complex type support in snowflake, and I've started folding it into the complex type work here: #9102.

Given the severity of this issue, I'll split it out into a separate fix since it applies to all unit test execution in snowflake, regardless of complex type usage.

The fix for this will be in dbt-adapter in the unit test materialization macros, but should be tested generically across all 1p adapters. For example, a baseline adapter test that just checks for primative type support, similar to what we have in core here would have caught this and can be part of the acceptance criteria here.

@dbeatty10
Copy link
Contributor

@MichelleArk is the underlying root cause relating to dbt-snowflake's default column quoting?

And presumably the fix is to utilize adapter.quote in all the right places?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working case_sensitivity Issues related to dbt's case-sensitivity behavior High Severity bug with significant impact that should be resolved in a reasonable timeframe pre-release Bug not yet in a stable release unit tests Issues related to built-in dbt unit testing functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants