[CT-3497] I want to add a description/label to each of the rows in my unit test to explicitly call out the edge cases I'm testing for #9283

graciegoheen · 2023-12-13T20:15:56Z

Is this your first time submitting a feature request?

I have read the expectations for open source contributors
I have searched the existing issues, and I could not find an existing issue for this feature
I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

When creating a unit test in my project:

unit_tests:
  - name: a # this is the unique name of the test
    model: dim_wizards # name of the model I'm unit testing
    given: # the mock data for your inputs
      - input: ref('stg_wizards')
        rows:
          - {wizard_id: 1, email: [email protected],     email_top_level_domain: example.com}
          - {wizard_id: 2, email: [email protected],     email_top_level_domain: unknown.com}
          - {wizard_id: 3, email: badgmail.com,         email_top_level_domain: gmail.com}
          - {wizard_id: 4, email: missingdot@gmailcom,  email_top_level_domain: gmail.com}
      - input: ref('top_level_email_domains')
        rows:
          - {tld: example.com}
          - {tld: gmail.com}
      - input: ref('stg_worlds')
        rows:
          - {world_id: 1}
    expect: # the expected output given the inputs above
      rows:
        - {wizard_id: 1, is_valid_email_address: true}
        - {wizard_id: 2, is_valid_email_address: false}
        - {wizard_id: 3, is_valid_email_address: false}
        - {wizard_id: 4, is_valid_email_address: false}

I want to optionally add a description/label to each of my input rows to explain which edge case it covers. Something like:

      - input: ref('stg_wizards')
        rows:
          - {wizard_id: 1, email: [email protected],     email_top_level_domain: example.com}
             description: valid email
          - {wizard_id: 2, email: [email protected],     email_top_level_domain: unknown.com}
             description: incorrect email domain
          - {wizard_id: 3, email: badgmail.com,         email_top_level_domain: gmail.com}
             description: no @ symbol
          - {wizard_id: 4, email: missingdot@gmailcom,  email_top_level_domain: gmail.com}
             description: no period

The spec needs more product/DX refinement. We should be able to add descriptions/labels regardless of which format: is used.

Describe alternatives you've considered

I could just put a large block of text in the description: field of the unit test.

Who will this benefit?

No response

Are you interested in contributing this feature?

No response

Anything else?

No response

dbeatty10 · 2023-12-13T20:43:54Z

Good idea about describing each test case 🤩

Adding a description as an additional sub-item of each row might be tricky.

With the features that exist today, here are several ways to describe individual test cases (I haven't actually tested any of them to confirm they work):

1. One description to rule them all
2. YAML comments
3. Individual unit tests

How do you feel about the pros/cons of each? (Can't say I had the most fun writing out 3. 😂)

One description to rule them all

unit_tests:
  - name: a # this is the unique name of the test
    description: |
      There are four test cases:
      1. valid email
      2. incorrect email domain
      3. no @ symbol
      4. no period
    model: dim_wizards # name of the model I'm unit testing
    given: # the mock data for your inputs
      - input: ref('stg_wizards')
        rows:
          ....

YAML comments

      - input: ref('stg_wizards')
        rows:
          # valid email
          - {wizard_id: 1, email: [email protected],     email_top_level_domain: example.com}
          # incorrect email domain
          - {wizard_id: 2, email: [email protected],     email_top_level_domain: unknown.com}
          # no @ symbol
          - {wizard_id: 3, email: badgmail.com,         email_top_level_domain: gmail.com}
          # no period
          - {wizard_id: 4, email: missingdot@gmailcom,  email_top_level_domain: gmail.com}

Individual unit tests

unit_tests:

  - name: a_valid_email
    description: valid email
    model: dim_wizards
    given:
      - input: ref('stg_wizards')
        rows:
          - {wizard_id: 1, email: [email protected],     email_top_level_domain: example.com}
      - input: ref('top_level_email_domains')
        rows:
          - {tld: example.com}
          - {tld: gmail.com}
      - input: ref('stg_worlds')
        rows:
          - {world_id: 1}
    expect:
      rows:
        - {wizard_id: 1, is_valid_email_address: true}

  - name: a_incorrect_email_domain
    description: incorrect email domain
    model: dim_wizards
    given:
      - input: ref('stg_wizards')
        rows:
          - {wizard_id: 2, email: [email protected],     email_top_level_domain: unknown.com}
      - input: ref('top_level_email_domains')
        rows:
          - {tld: example.com}
          - {tld: gmail.com}
      - input: ref('stg_worlds')
        rows:
          - {world_id: 1}
    expect:
      rows:
        - {wizard_id: 2, is_valid_email_address: false}

  - name: a_no_at_symbol
    description: no @ symbol
    model: dim_wizards
    given:
      - input: ref('stg_wizards')
        rows:
          - {wizard_id: 3, email: badgmail.com,         email_top_level_domain: gmail.com}
      - input: ref('top_level_email_domains')
        rows:
          - {tld: example.com}
          - {tld: gmail.com}
      - input: ref('stg_worlds')
        rows:
          - {world_id: 1}
    expect:
      rows:
        - {wizard_id: 3, is_valid_email_address: false}

  - name: a_no_period
    description: no period
    model: dim_wizards
    given:
      - input: ref('stg_wizards')
        rows:
          - {wizard_id: 4, email: missingdot@gmailcom,  email_top_level_domain: gmail.com}
      - input: ref('top_level_email_domains')
        rows:
          - {tld: example.com}
          - {tld: gmail.com}
      - input: ref('stg_worlds')
        rows:
          - {world_id: 1}
    expect:
      rows:
        - {wizard_id: 4, is_valid_email_address: false}
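
The repetition in option 3 is mechanical: every test restates the same two supporting inputs and differs only in one stg_wizards row and one expected row. As a rough illustration of that cost (not something dbt provides), here is a minimal Python sketch that builds the per-case definitions from a single list of cases. `SHARED_INPUTS`, `CASES`, and `build_unit_tests` are hypothetical names, only the two cases whose emails are fully shown in the thread are included, and it assumes the JSON it prints would be accepted wherever the hand-written YAML is.

```python
import json
from typing import Any

# Inputs that every per-case test in option 3 repeats unchanged.
SHARED_INPUTS = [
    {"input": "ref('top_level_email_domains')",
     "rows": [{"tld": "example.com"}, {"tld": "gmail.com"}]},
    {"input": "ref('stg_worlds')", "rows": [{"world_id": 1}]},
]

# (name suffix, description, stg_wizards row, expected validity)
CASES = [
    ("no_at_symbol", "no @ symbol",
     {"wizard_id": 3, "email": "badgmail.com",
      "email_top_level_domain": "gmail.com"}, False),
    ("no_period", "no period",
     {"wizard_id": 4, "email": "missingdot@gmailcom",
      "email_top_level_domain": "gmail.com"}, False),
]


def build_unit_tests(prefix: str, model: str) -> list[dict[str, Any]]:
    """Build one unit-test definition per case (hypothetical helper)."""
    tests = []
    for suffix, description, row, is_valid in CASES:
        tests.append({
            "name": f"{prefix}_{suffix}",
            "description": description,
            "model": model,
            # One case-specific row, followed by the shared inputs.
            "given": [{"input": "ref('stg_wizards')", "rows": [row]},
                      *SHARED_INPUTS],
            "expect": {"rows": [{"wizard_id": row["wizard_id"],
                                 "is_valid_email_address": is_valid}]},
        })
    return tests


unit_tests = build_unit_tests("a", "dim_wizards")
print(json.dumps({"unit_tests": unit_tests}, indent=2))
```

Each case gets its own name and description, which is the benefit of option 3, but the shared inputs are written once instead of once per test.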

alison985 · 2024-02-01T01:32:09Z

FWIW, there would be value in printing the description of the test case in the test output to help with debugging. Individual unit tests aren't DRY, and YAML comments wouldn't show up in the output when the test runs.

Of the three options above, I like the single description best. It is probably the easiest to add to test output, and it leaves room for longer explanations. The downside is that whoever updates the test cases has to remember to update the description too.

This isn't a great idea, because it depends on implied order, which whoever updates the test cases would again have to remember to keep in sync, but you could do:

unit_tests:
  - name: a # this is the unique name of the test
    model: dim_wizards # name of the model I'm unit testing
    given: # the mock data for your inputs
      - input: ref('stg_wizards')
        rows:
          - {wizard_id: 1, email: [email protected],     email_top_level_domain: example.com}
          - {wizard_id: 2, email: [email protected],     email_top_level_domain: unknown.com}
          - {wizard_id: 3, email: badgmail.com,         email_top_level_domain: gmail.com}
          - {wizard_id: 4, email: missingdot@gmailcom,  email_top_level_domain: gmail.com}
        description:
          - "valid email"
          - "incorrect email domain"
          - "no @ symbol"
          - "no period"
      - input: ref('top_level_email_domains')
        rows:
          - {tld: example.com}
          - {tld: gmail.com}
      - input: ref('stg_worlds')
        rows:
          - {world_id: 1}
    expect: # the expected output given the inputs above
      rows:
        - {wizard_id: 1, is_valid_email_address: true}
        - {wizard_id: 2, is_valid_email_address: false}
        - {wizard_id: 3, is_valid_email_address: false}
        - {wizard_id: 4, is_valid_email_address: false}
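
To make the implied-order risk concrete, here is a minimal Python sketch (not dbt code) of how a tool might pair a parallel description list with rows by position. `pair_rows_with_descriptions` is a hypothetical helper, the rows are trimmed to `wizard_id`, and the fifth case ("empty email") is invented purely for illustration.

```python
from typing import Any

Row = dict[str, Any]


def pair_rows_with_descriptions(
    rows: list[Row], descriptions: list[str]
) -> list[tuple[Row, str]]:
    """Pair each row with the description at the same index."""
    if len(rows) != len(descriptions):
        # A count mismatch is the only mistake this check can detect.
        raise ValueError(
            f"{len(rows)} rows but {len(descriptions)} descriptions"
        )
    return list(zip(rows, descriptions))


rows = [{"wizard_id": 1}, {"wizard_id": 2}, {"wizard_id": 3}, {"wizard_id": 4}]
descriptions = ["valid email", "incorrect email domain", "no @ symbol", "no period"]
paired = pair_rows_with_descriptions(rows, descriptions)

# Someone inserts a new case in the middle of the rows
# but appends its label to the end of the descriptions.
rows_edited = rows[:1] + [{"wizard_id": 5}] + rows[1:]
descriptions_edited = descriptions + ["empty email"]

# The lengths still match, so no error is raised,
# yet every label after the first now sits on the wrong row.
shifted = pair_rows_with_descriptions(rows_edited, descriptions_edited)
```

The length check catches a forgotten description but not a misplaced one, which is the weakness of any format where labels and rows live in separate lists.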

The following is probably slightly better for the developer experience and for avoiding bugs caused by implied order. However, it may be worse if it requires more queries, or depending on how the last element of each row would have to flow through. I have no knowledge of unit_tests outside of this thread, so I can't guess.

    expect: # the expected output given the inputs above
      rows:
        - {wizard_id: 1, is_valid_email_address: true, 'valid email'}
        - {wizard_id: 2, is_valid_email_address: false, 'incorrect email domain'}
        - {wizard_id: 3, is_valid_email_address: false, 'no @ symbol'}
        - {wizard_id: 4, is_valid_email_address: false, 'no period'}
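
If the label travels inside each expected row, it has to be separated from the real columns before the row is compared with the model output, and it can then be reused in the failure message. The Python sketch below illustrates that flow under stated assumptions: it is not how dbt works, `LABEL_KEY`, `split_label`, and `describe_mismatches` are hypothetical, and it uses an explicit key for the label because a bare entry in a YAML flow mapping is normally read as a key with a null value rather than as a description.

```python
from typing import Any, Optional

Row = dict[str, Any]
LABEL_KEY = "_description"  # hypothetical reserved key, not part of dbt


def split_label(row: Row) -> tuple[Row, Optional[str]]:
    """Return (data columns, label) without mutating the input row."""
    data = {key: value for key, value in row.items() if key != LABEL_KEY}
    return data, row.get(LABEL_KEY)


def describe_mismatches(expected: list[Row], actual: list[Row]) -> list[str]:
    """Compare rows by position and name the label of each failing case."""
    messages = []
    for expected_row, actual_row in zip(expected, actual):
        data, label = split_label(expected_row)
        if data != actual_row:
            messages.append(
                f"{label or 'unlabeled row'}: expected {data}, got {actual_row}"
            )
    return messages


expected = [
    {"wizard_id": 1, "is_valid_email_address": True, LABEL_KEY: "valid email"},
    {"wizard_id": 3, "is_valid_email_address": False, LABEL_KEY: "no @ symbol"},
]
actual = [
    {"wizard_id": 1, "is_valid_email_address": True},
    {"wizard_id": 3, "is_valid_email_address": True},  # simulated model bug
]

messages = describe_mismatches(expected, actual)
for message in messages:
    print(message)
```

Because the label sits in the same row as the data it describes, reordering or inserting rows cannot detach it from its case.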

graciegoheen added the enhancement (New feature or request) and triage labels and removed the triage label Dec 13, 2023

graciegoheen mentioned this issue Dec 13, 2023

[CT-2911] [Epic] Unit testing dbt models #8283

Closed

dbeatty10 added the unit tests label (Issues related to built-in dbt unit testing functionality) Sep 20, 2024
