
[CT-1183] [Feature] metrics should be configurable to show nulls for missing data after spining #5842

Closed
dave-connors-3 opened this issue Sep 14, 2022 · 6 comments
Labels
enhancement New feature or request semantic Issues related to the semantic layer

Comments

@dave-connors-3
Contributor

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

When using metrics in dbt, the metrics package currently fills in any missing value with 0 after a metric has been date spined. This has implications for derived metrics that may divide by a zero value -- in discussing a fix for that particular issue, and hearing feedback from the community, we realized that we need a configurable, consistent way for users to decide how a metric is reported: with null values or with zeros.

A new optional metric attribute fill_missing_values would allow users to declare whether a metric returns 0 or null when there is missing data after date spining a metric.
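To make the tradeoff concrete, here is a minimal Python sketch of the two fill behaviors (this is illustrative only -- the function names and the simplified join are hypothetical, not dbt_metrics code):

```python
from datetime import date, timedelta

def date_spine(start, end):
    """Generate every date in [start, end] -- the spine the metric is joined onto."""
    return [start + timedelta(n) for n in range((end - start).days + 1)]

def spine_metric(values, start, end, fill_missing_values=False):
    """Join raw metric values onto a date spine.

    Dates absent from `values` become 0 when fill_missing_values is True,
    and None (the analogue of SQL null) otherwise.
    """
    fill = 0 if fill_missing_values else None
    return {d: values.get(d, fill) for d in date_spine(start, end)}

raw = {date(2022, 9, 14): 5}  # only one day has data
spine_metric(raw, date(2022, 9, 13), date(2022, 9, 15))
# -> {date(2022, 9, 13): None, date(2022, 9, 14): 5, date(2022, 9, 15): None}

# With zero-fill, a derived metric that divides by this value hits
# division-by-zero on the empty days; with null-fill, the result
# simply stays null for those days.
```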

Describe alternatives you've considered

Letting this be an option passed to the dbt_metrics.calculate() macro. This could be considered a display option rather than an attribute of a particular metric. However, @joellabes said it well:

I lean towards "it should be part of the metric spec". Allowing two different dashboards to use different processes to calculate a trailing average on the same metric feels counter to "same result, no matter what".

Who will this benefit?

Metrics users, new and old.

Are you interested in contributing this feature?

yeah!

Anything else?

No response

@dave-connors-3 dave-connors-3 added enhancement New feature or request triage labels Sep 14, 2022
@github-actions github-actions bot changed the title [Feature] metrics should be configurable to show nulls for missing data after spining [CT-1183] [Feature] metrics should be configurable to show nulls for missing data after spining Sep 14, 2022
@dbeatty10 dbeatty10 removed the triage label Sep 14, 2022
@jtcohen6
Contributor

jtcohen6 commented Sep 15, 2022

Thanks @dave-connors-3!

Agree that this should always be consistent for all calculations of a given metric.

That said - this feels to me more like a standard configuration that could apply to several metrics, more so than an integral part of a metric's unique definition. Could we imagine adding other properties like this one, which don't define the metric per se, but configure how it should be executed (= calculated)?

Here's a thought / alternative approach:

  • We just added support for config on metrics in Add config to disable metrics/exposures #5815
  • config is already a dict that accepts arbitrary user-input values. If we want to add more on/off switches like this one in the future, we actually don't need a dbt-core PR to add a new property — if set, it just shows up in the metric node! We'd only need the PR in dbt-metrics, which can access the config as metric.get("config").get("fill_missing_values").
  • Of course, we'd still have the ability to PR dbt-core if we want to set an explicit default for that config (e.g. to plumb through the Metadata API), or to pass user input through custom validation logic at parse time.
  • If users want to override macros from the metrics package with custom implementations, which take custom configurations (like this one) as inputs — this would be a good pattern for doing it!

So in your dbt project code:

version: 2
metrics:
  - name: new_customers
    ...
    config:
      enabled: true
      fill_missing_values: true

Bonus: if this is a config, you could even set a default for all metrics in your project / a package:

# dbt_project.yml

metrics:
  my_project:
    +fill_missing_values: true

And somewhere in the dbt-metrics package (dbt-labs/dbt_metrics#96):

-- what should the default behavior be?
{% set fill_missing_values = dict_metric.get("config").get("fill_missing_values", False) %}
...
{% if fill_missing_values %}
    coalesce(...)
{% else %}
    ...
{% endif %}
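The same lookup pattern translates directly to Python. One wrinkle worth noting: if a metric has no config block at all, a chained `.get("config").get(...)` fails, so the sketch below falls back through an empty dict first. (The helper name and metric-node shape here are hypothetical, chosen only to illustrate the pattern.)

```python
def get_metric_config(metric_node, key, default):
    """Read an arbitrary key from a metric node's config dict.

    `config` may be absent or None entirely, so fall back through an
    empty dict before applying the default -- a bare
    metric_node.get("config").get(key) would raise AttributeError.
    """
    return (metric_node.get("config") or {}).get(key, default)

metric = {"name": "new_customers", "config": {"enabled": True}}
get_metric_config(metric, "fill_missing_values", False)  # -> False (default)

metric["config"]["fill_missing_values"] = True
get_metric_config(metric, "fill_missing_values", False)  # -> True
```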

@callum-mcdata @dbeatty10 Curious to get your thoughts here too. I see this as a potential pattern that could also enable fewer "migrations" to the metrics spec in the future.

Addendum: What's the difference between meta and config? They're a lot alike; both accept arbitrary key-value pairs. IMO, the distinction is twofold:

  • meta should be only be set by end users, and completely unused by dbt-core
  • meta keys should be metadata, totally optional decorators on a metric that provide additional context. They should not be tied to any functional changes in how a metric is actually calculated, how a model is materialized, etc — that's what config is for!

@joellabes
Contributor

config is already a dict that accepts arbitrary user-input values. If we want to add more on/off switches like this one in the future, we actually don't need a dbt-core PR to add a new property — if set, it just shows up in the metric node!

it WHAT.

Today I learned!

@leahwicz
Contributor

Note: fill_missing_values might not be the most descriptive name to use. What is the missing value and fill with what?

@leahwicz leahwicz added the Refinement Maintainer input needed label Sep 15, 2022
@dave-connors-3
Contributor Author

Totally in alignment with your thinking here @jtcohen6! This absolutely fits as a config to adjust the ergonomics of using a particular metric, more so than "an integral part of a metric's unique definition."

@leahwicz -- totally agree with your feedback here too! Do you have suggestions for better naming? I was trying to balance brevity and clarity.

@dave-connors-3
Contributor Author

also I think we are clear to close this issue and #5843!

@jtcohen6
Contributor

Closing, as this can be handled solely within dbt-metrics for now.

Agree with the naming suggestion! Verbose-r is better for something this subtle, and config: is nice because it can be set once (in dbt_project.yml) for all metrics

@jtcohen6 jtcohen6 removed the Refinement Maintainer input needed label Sep 16, 2022
@jtcohen6 jtcohen6 added the semantic Issues related to the semantic layer label Jun 20, 2023