Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow specifying a timezone #56

Closed
prratek opened this issue Jul 18, 2022 · 11 comments
Closed

Allow specifying a timezone #56

prratek opened this issue Jul 18, 2022 · 11 comments
Labels
enhancement New feature or request good first issue Good for newcomers triage

Comments

@prratek
Copy link

prratek commented Jul 18, 2022

Since calculating a metric typically involves truncating a timestamp to day/week/month, it would be great to be able to configure what timezone to use, maybe as an additional optional argument to metrics.calculate(...). I'm not sure whether this makes more sense as something to support in the dbt metric definition instead since people likely want to enforce use of a certain timezone for a metric calculation so open to your thoughts there.

@callum-mcdata callum-mcdata added the enhancement New feature or request label Jul 18, 2022
@callum-mcdata
Copy link
Contributor

@prratek just getting around to this, sorry about the delay! I'd love to better understand your use case here.

My major concern is that timezone's being a part of the macro could break one of the key tenets of the metrics package:

  • A metric value should be consistent everywhere that it is referenced

If Person A over in accounting is interested in ARR by their local timezone and Person B in finance is interested in ARR by UTC, they'll get different answers and we fall back into the same world that we existed in beforehand.

Here are a few thoughts:

  • Everything as UTC. We take a hardline stance that metrics can only exist in UTC to ensure consistency. Obviously that is a bit ... hamfisted.
  • The timezone of the table. We abstract timezones away from metrics and leave that to the person creating the model.

The latter is how we're doing it right now but I believe that it is slightly unintentional. We could explicitly callout that timezones are the responsibility of the model writer/maintainer.

Caveat: This could be complicated with custom calendars. If your custom calendar is in a timezone and your model is in another timezone, the joins could be a bit funky.

@prratek
Copy link
Author

prratek commented Aug 3, 2022

@callum-mcdata I'm solving for the use case where I have a table where my temporal column is a UTC timestamp but I'd like to enforce that when a metric is calculated by day/week/month that it be done in Eastern time. I hear you that this likely isn't something we'd want to leave up to the user querying for the metric - we're not trying to provide timezone flexibility to the end user. With that concern in mind, maybe it would make sense to be able to configure a timezone in the metric definition itself, and dbt-metrics could then read that to perform any timezone casting necessary when aggregating.

@callum-mcdata
Copy link
Contributor

@prratek following up on this after a few weeks with my head down. Directionally I am aligned on this idea - I think it makes sense to have this be an optional attribute at the metric level. Unfortunately that means we'd need to go into the belly of the beast (dbt-core) to add that attribute. This is something I'm planning on doing for #83 so I could pair that later on if we get buy-in.

More importantly, are you interested in exploring adding this logic through the use of meta flags to test that it works? More than happy to meet and discuss where that would fit into the package architecture.

@callum-mcdata
Copy link
Contributor

@prratek in the interest of carrying this to its logical conclusion, is there any reason why you wouldn't cast the timestamp column in the underlying model to the correct timezone that you want the metric to be aggregated at?

@prratek
Copy link
Author

prratek commented Sep 21, 2022

@callum-mcdata sorry it took me a while to get to this! while you could do that, we've been operating on the principle that it's best practice for all timestamps to be in UTC. It "feels" more consistent with the TIMESTAMP type in BigQuery representing a point in time independent of timezone. We throw timezones into the mix only when truncating to date, week, etc because that's when the timezone actually changes the value of your data point, before that it just feels like added coplexity and room for error.

So to answer your question, we could in theory cast all our timestamp columns, but it wouldn't be worth the tradeoff.

@callum-mcdata
Copy link
Contributor

@dave-connors-3 this makes logical sense to me and feels like something that we'd probably want to include in the config functionality similar to null behavior. Does this pass the smell test to you as well?

@prratek Dave has been working on #96 which takes advantage of new functionality in 1.3 to support config settings at the metric level. Does it make sense to you that this would be able to be passed as a config paramater? IE it could be set at the individual metric level OR set at the project level like model configs.

@dave-connors-3
Copy link
Contributor

I think with the assumption that yes, timezones are something that should be configurable, the config makes intuitive sense -- it's closer to a display option than to an indelible quality of a particular metric!

@prratek
Copy link
Author

prratek commented Sep 22, 2022

Makes sense to me too!

@charlotte-tyrt
Copy link

Hi! I am also facing this issue in the dbt_metrics. All the DBT project uses a specific timezone (Europe/Paris) and we would like to have this option in the metrics definition. Hope this option will be added in the future!

@callum-mcdata callum-mcdata added the good first issue Good for newcomers label Jan 4, 2023
@callum-mcdata
Copy link
Contributor

👋 hey folks! I'm doing some new years cleaning of issues and wanted to flag this as a good first issue for anyone in the community looking to contribute. Unfortunately we don't have the bandwidth to take this on right now but would love to work with anyone interested in taking it on.

I'd recommend reading through this PR that @dave-connors-3 released last year. This functionality would most likely take form in a metric config property, which allows for arbitrary values so we don't need to alter core.

We would then need to most likely make changes to gen_base_query, as that serves as the baseline for all data used. I suspect this would be both the calendar dates+dimensions, as well as the join to the calendar.

@callum-mcdata
Copy link
Contributor

After some more thought, I'm actually going to close out this issue - any timezone information would most likely be implemented through the entity construct that we are adding into core and not within this package.

I've linked this issue to corresponding entity issue if anyone is interested in keeping an eye on it!

@github-actions github-actions bot added triage and removed awaiting_response Waiting on a response labels Jan 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request good first issue Good for newcomers triage
Projects
None yet
Development

No branches or pull requests

4 participants