[CT-3529] [Unit Testing] Unit Testing Versioned Models #9344

Closed
1 task done
Tracked by #8283
emmyoop opened this issue Jan 5, 2024 · 2 comments · Fixed by #9421
Labels
user docs [docs.getdbt.com] Needs better documentation

Comments

@emmyoop
Member

emmyoop commented Jan 5, 2024

Housekeeping

  • I am a maintainer of dbt-core

Short description

This is the outcome of the spike #8799. The exact details of what the input should look like can be found there.

Outcomes of the spike

  1. We need to patch unit tests so they can determine which versions of a model exist, because versions are defined in the schema files and schema files are parsed last. This is largely done in the spike PR "spike unit test versions" #9302.
  2. There will still be only one unit test node, even though we may be executing multiple unit tests. The versioned models that the unit test can run against will be listed in the depends_on of the UnitTestDefinition.
  3. We may need to do something similar to what @gshank did in the build command here, where there are two 'selected' lists, one with unit tests and one without, but we would need two selection lists, one with models and one without (in order to account for run results)

Acceptance criteria

  1. A unit test definition can define the model versions to include or exclude from a test
  2. If no versions are defined in the unit test definition, but the target model is versioned, a unit test will be run for all versions of the model
  3. When dbt build is run with a select for a versioned model, only the unit test for that specific model version will run, even if no version is defined in the schema file
  4. When a command selects on a unit test that is for a versioned model, unit tests for all versions of that model will be run
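
The first two criteria can be illustrated with a minimal sketch. It is not the dbt-core implementation: the function name `resolve_versions`, the error class, and the rule that unknown or excluded versions on an unversioned model raise an error are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


class UnitTestVersionError(ValueError):
    """Hypothetical error type used only in this sketch."""


def resolve_versions(
    model_versions: Sequence[int],
    include: Optional[Sequence[int]] = None,
    exclude: Optional[Sequence[int]] = None,
) -> List[int]:
    """Return the model versions a unit test should run against.

    An empty result means the model is unversioned and is tested as-is.
    """
    if include and exclude:
        raise UnitTestVersionError("specify either include or exclude, not both")
    if not model_versions:
        # Assumption: exclude on an unversioned model is rejected like include.
        if include or exclude:
            raise UnitTestVersionError("model is not versioned")
        return []
    if include:
        # Assumption: including a version that does not exist is an error.
        unknown = [v for v in include if v not in model_versions]
        if unknown:
            raise UnitTestVersionError(f"unknown versions: {unknown}")
        return [v for v in model_versions if v in include]
    if exclude:
        return [v for v in model_versions if v not in exclude]
    # No versions given in the unit test definition: run against all of them.
    return list(model_versions)
```

For a model with versions 1, 2 and 3, `resolve_versions([1, 2, 3], exclude=[2])` keeps versions 1 and 3, and calling it with neither argument keeps all three.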

Impact to Other Teams

no

Will backports be required?

no

Context

Suggested Tests

  1. test with no version specified, should create a separate unit test for each version
  2. test with an exclude version specified, should create a separate unit test for each version except the excluded version
  3. test with an include version specified, should create a single unit test for only the version specified
  4. test with an include and exclude version specified, should get ValidationError
  5. test with an include for an unversioned model, should error
  6. partial parsing test: test with no version specified, then add an exclude version, then switch to include version and make sure the right unit tests are generated for each
  7. test with no version specified in the schema file and use selection logic on a versioned model for a specific version
  8. test with no version specified in the schema file and use selection logic on a unit test - expect unit tests for all versioned models
  9. test specifying the fixture version with {{ ref(name, version) }}
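
The selection behaviour expected by tests 7 and 8 can be sketched as two filters. This is an illustration only: it models each (unit test, model version) pair as a hypothetical `UnitTestNode` record and does not reflect how dbt's selector is actually built.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class UnitTestNode:
    """Hypothetical record: one unit test applied to one model version."""
    name: str      # unit test name as written in the schema file
    model: str     # target model name
    version: int   # model version the test runs against


def select_by_model(
    nodes: List[UnitTestNode], model: str, version: Optional[int] = None
) -> List[UnitTestNode]:
    """Selecting a model version picks only the unit tests for that version."""
    return [
        n for n in nodes
        if n.model == model and (version is None or n.version == version)
    ]


def select_by_unit_test(nodes: List[UnitTestNode], name: str) -> List[UnitTestNode]:
    """Selecting a unit test by name picks it for every model version."""
    return [n for n in nodes if n.name == name]


nodes = [
    UnitTestNode("test_my_model", "my_model", 1),
    UnitTestNode("test_my_model", "my_model", 2),
]
only_v2 = select_by_model(nodes, "my_model", version=2)
all_versions = select_by_unit_test(nodes, "test_my_model")
```

Here `only_v2` holds the single node for version 2, while `all_versions` holds both nodes.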
@emmyoop emmyoop added the user docs [docs.getdbt.com] Needs better documentation label Jan 5, 2024
@github-actions github-actions bot changed the title [Unit Testing] Unit Testing Versioned Models [CT-3529] [Unit Testing] Unit Testing Versioned Models Jan 5, 2024
@graciegoheen
Contributor

This would create some known :( funky behavior:

  • dbt retry: if only one version of the tests had failed, all versions would be retried
  • if you fail on any version, we'd block on all versions

Are we ok with that?

Alt.

  • unit test only applies to a single version
  • if you don't supply a version, applies to latest
  • otherwise you must supply an explicit version
  • not as DRY
  • not automatic to catch breaking changes to unit testing logic when creating a new version

@graciegoheen
Contributor

After discussing internally, we've decided we are not ok with this funky behavior.

We are going to try again with making one node per unit test run (instead of bundling them together). This is consistent with how we treat data tests that are configured on a model with multiple versions.

Example: a uniqueness test with a versioned model.

models:
  - name: my_model
    columns:
      - name: id
        tests:
          - unique
    versions:
      - v: 1
      - v: 2

Command:

dbt list -s my_model

Output:

20:28:35  Running with dbt=1.7.4
20:28:36  Registered adapter: duckdb=1.7.0
20:28:36  Found 3 models, 1 snapshot, 1 analysis, 1 seed, 2 tests, 1 source, 0 exposures, 1 metric, 391 macros, 0 groups, 1 semantic model
my_project.my_model.v1
my_project.my_model.v2
my_project.unique_my_model_v1_id
my_project.unique_my_model_v2_id

Note where it says “2 tests” and that it shows those 2 tests.
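
A rough sketch of why one node per version avoids the retry problem follows. The name pattern is inferred only from the listing above, and both helper functions are hypothetical; dbt's real naming and retry logic are more involved.

```python
from typing import Dict, List, Sequence


def expand_test_nodes(
    project: str, test_name: str, model: str, column: str, versions: Sequence[int]
) -> List[str]:
    """Create one node name per model version (pattern inferred from the output above)."""
    return [f"{project}.{test_name}_{model}_v{v}_{column}" for v in versions]


def nodes_to_retry(statuses: Dict[str, str]) -> List[str]:
    """With one node per version, only the nodes that did not pass are retried."""
    return [node for node, status in statuses.items() if status != "pass"]


nodes = expand_test_nodes("my_project", "unique", "my_model", "id", [1, 2])
# nodes == ["my_project.unique_my_model_v1_id", "my_project.unique_my_model_v2_id"]

# Hypothetical run results: only the v2 test failed.
retry = nodes_to_retry({nodes[0]: "pass", nodes[1]: "fail"})
# retry == ["my_project.unique_my_model_v2_id"]
```

With a single bundled node there would be only one status to inspect, so a failure on any version would force every version to be retried.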

If we are unable to overcome the partial parsing issues with the above solution, we will have a known restriction: a unit test can apply to only one version of a model. We will then not allow folks to specify multiple model versions for a unit test to apply to. If no version is specified, we will use the latest version.
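
The fallback rule could look roughly like the sketch below. The function name is hypothetical, and it assumes "latest" means the highest version number unless a latest version is set explicitly on the model.

```python
from typing import Optional, Sequence


def resolve_single_version(
    model_versions: Sequence[int],
    requested: Optional[int] = None,
    latest_version: Optional[int] = None,
) -> Optional[int]:
    """Pick the one model version a unit test applies to under the fallback rule."""
    if not model_versions:
        if requested is not None:
            raise ValueError("model is not versioned")
        return None  # unversioned model: nothing to resolve
    if requested is None:
        # Assumption: fall back to the highest version when none is configured.
        return latest_version if latest_version is not None else max(model_versions)
    if requested not in model_versions:
        raise ValueError(f"unknown version: {requested}")
    return requested
```

For example, `resolve_single_version([1, 2])` resolves to version 2, while `resolve_single_version([1, 2], requested=1)` resolves to version 1.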
