[CT-2273] [Feature] Allowing `ref` (or similar) for analysis files #7127

friendofasquid · 2023-03-06T03:00:16Z

Is this your first time submitting a feature request?

I have read the expectations for open source contributors
I have searched the existing issues, and I could not find an existing issue for this feature
I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Sometimes, we want to layer one analysis on top of another. In that case, it makes sense to allow a reference to an analysis file, essentially treating it like a model with an ephemeral materialization.

Describe alternatives you've considered

Moving the analysis to an ephemeral model. This works fine, but the intention is to run an analysis, not create a model.

Who will this benefit?

Anyone who uses the analyses folder within dbt.

Are you interested in contributing this feature?

No

Anything else?

No response

dbeatty10 · 2023-03-06T14:56:57Z

Thanks for raising this idea @friendofasquid !

Follow-up question

Supposing it is possible to reference an analysis file like {{ analysis('my_analysis') }}, what might a simple concrete usage look like?

Another alternative

Added another idea to the bottom of your list of alternatives:

- move the analysis code to an ephemeral model
- move the analysis code to a macro

NumberPiOso · 2023-03-17T19:44:16Z

With reverse ETL tools, we want to upload analyses (specific subsets of data) into different systems.

Making analyses "refable" would allow us to document exposures easily.

model -> analysis -> exposure

dbeatty10 · 2023-03-20T19:31:43Z

As this issue points out, one big difference between ephemeral models and analyses is that the first is refable and the second is not.

Additional differences (that are not relevant for this discussion per se):

- models can have tests whereas analyses cannot
- an ephemeral model is translated to a CTE (a with clause) when it is ref'd within other models

Summary

After playing around a bit and researching the history of analyses and ephemeral models, it feels like they are relatively close to being isomorphic. So I'm attracted to the idea of effectively (or actually) promoting analyses to be an alias of the ephemeral materialization. Doing so would allow them to be referenced (as well as have tests defined).

Whether or not we ultimately go that direction, it appears to me like the desired functionality is already effectively possible today, and it isn't too hard to convert an existing project to get the behavior akin to the proposed feature.

If you haven't already done something similar, wanna give the instructions below a shot and provide feedback on what you do/don't like about it?

An approach that works today

@friendofasquid mentioned the alternative of moving analysis to an ephemeral model so it can be used in a ref. At least in the near-term, this feels like the way to go!

Playing around a bit, it appears relatively simple to upgrade all your analyses so they can be ref'd:

1. Move your analyses folder to be inside your models folder
2. Configure the default materialization of ephemeral for that folder within dbt_project.yml
3. Update the YAML configuration of analyses to be models instead

Differences

The most crucial differences are that the logic in these files can now be used in a ref and it can also have tests.

Two other differences that I'm seeing when converting analyses to refable analyses (aka ephemeral models):

- when doing dbt compile, the compiled SQL lands in a different output subfolder within the target directory
- when doing dbt docs serve, the files appear under the models subfolder rather than the analyses subfolder

Other differences are surely present within the manifest at target/manifest.json, but I didn't examine them.

Example

Thanks for your use-case @NumberPiOso !

I've got an example repo here with step-by-step instructions of converting everything within an analyses folder so that it can be used as a ref within an exposure.

The final result shows the exposure depending on my_even_ids, which was formerly an analysis.

Details

Suppose you have a dbt_project.yml file like this:

name: "my_dbt_project"
version: "1.0.0"
config-version: 2
profile: "sandcastle-duckdb"

Then you add this to the end of dbt_project.yml:

models:
  my_dbt_project:
    refable_analyses:
      materialized: ephemeral

And finally you move & rename your analyses folder to models/refable_analyses.

Note: if you previously had an analyses/_analysis.yml file, you'll want to update analyses: to models: within the YAML once you've moved it into the models subdirectory.
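
As a rough sketch of what that renamed properties file might look like after the move: the model name my_even_ids comes from the example in this thread, while the description, the id column, and the not_null test are assumptions added only to illustrate that tests can now be attached.

```yaml
# models/refable_analyses/_analysis.yml
# Before the move this file lived at analyses/_analysis.yml and the
# top-level key below was `analyses:` rather than `models:`.
version: 2

models:
  - name: my_even_ids
    description: "IDs that are even"  # hypothetical description
    columns:
      - name: id  # hypothetical column name
        tests:  # allowed now that this is a model rather than an analysis
          - not_null
```

Newer dbt versions also accept data_tests: in place of tests:, so adjust the key to whatever your version expects.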

Now each analysis can be referenced within an exposure like this:

models/_exposures.yml

version: 2

exposures:
  - name: my_even_dashboard
    description: My dashboard
    type: dashboard
    owner:
      name: Somebody Somewhere
      email: [email protected]
    
    depends_on:
      - ref('my_even_ids')

Interested to hear your feedback!

NumberPiOso · 2023-03-20T22:04:48Z

Thanks for outlining the pros/cons of the alternative. After reviewing it for some time, I agree with this idea.

Thank you for taking the time to provide an overview of the pros and cons of the alternative. After considering your proposal, I agree that it would be beneficial for queries related to dbt to be explicitly included in a model, rather than an analysis. This will ensure that all transformations are clear and visible to everyone through documentation. Additionally, I concur that any model or query being used in an external tool, such as the reverse ETL tool, should be thoroughly tested.

The ephemeral materialization will prevent people from using it (which is something that I definitely like) but I think that including these models in the documentation will mislead users into thinking that they can use them.

For my specific use case, I will try to document the models in a special way to prevent their usage instead (such as changing the colors in the UI to look like an exposure instead of a model) or marking them with a special tag.
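
One way this could be sketched in dbt_project.yml, assuming the models/refable_analyses folder from the example above and a dbt version that supports the docs node_color config (1.3 or later). The tag name and the color value are placeholders I made up, not anything agreed in this thread.

```yaml
# dbt_project.yml (fragment)
models:
  my_dbt_project:
    refable_analyses:
      +materialized: ephemeral
      +tags: ["reverse_etl"]    # hypothetical tag name
      +docs:
        node_color: "#cd7f32"   # hypothetical color for these nodes in the docs DAG
```

The tag can then be used for selection (for example with --select tag:reverse_etl), and the color only affects how the nodes are drawn in the generated docs.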

Thank you once again for your time and effort in presenting your proposal.

github-actions · 2023-06-19T01:59:57Z

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions · 2023-06-26T02:09:27Z

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

dbeatty10 mentioned this issue Mar 18, 2023

ref an analysis in an exposure dbeatty10/dbt-sandcastles#2

Merged

keurcien mentioned this issue Jun 21, 2023

Propagate dbt analyses to Metabase gouline/dbt-metabase#170

Closed

github-actions bot closed this as not planned on Jun 26, 2023

dbeatty10 mentioned this issue Nov 8, 2023

Update group config for analyses dbt-labs/docs.getdbt.com#4416

Closed
