Can't do per-target variables easily #52

Open
willbryant opened this issue Aug 23, 2022 · 5 comments

@willbryant
Contributor

We have separate GA4 instances for our test and prod sites. They therefore have different property IDs, which results in different BigQuery dataset names (in different GCP projects, in our case).

The docs say to configure "using the following variables which must be set in your dbt_project.yml file."

vars:
    ga4:
        project: "your_gcp_project"
        dataset: "your_ga4_dataset"
        start_date: "YYYYMMDD" # Earliest date to load

But dbt_project.yml can't itself reference variables, so the normal pattern is to override these per environment using --vars arguments.

However, --vars defines variables in the global namespace, not in the ga4 namespace. For example, if we use a command like:

dbt build --target $CI_ENVIRONMENT_NAME --vars "ga4: {project: $DBT_GCP_PROJECT, dataset: $DBT_GA4_DATASET, start_date: $DBT_GA4_START_DATE}"

The variables don't get used: they stay nested under a single global "ga4" variable, rather than being un-nested and presented only to the ga4 package:

03:31:40  Running with dbt=1.1.1
03:31:41  Unable to do partial parsing because config vars, config profile, or config target have changed
03:31:46  Encountered an error:
Compilation Error
  Could not render {{var('project')}}: Required var 'project' not found in config:
  Vars supplied to <Configuration> = {
      "ga4": {
          "dataset": "... value here ...",
          "project": "... value here ...",
          "start_date": ... value here ...
      }
  }

We can pass them without the ga4 nesting:

dbt build --target $CI_ENVIRONMENT_NAME --vars "{project: $DBT_GCP_PROJECT, dataset: $DBT_GA4_DATASET, start_date: $DBT_GA4_START_DATE}"

But then these variables are fully global and can conflict with other variables used in the project (start_date in particular).
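
The behaviour described above can be modelled roughly as follows. This is a toy sketch in Python, not dbt's actual implementation: the function name `lookup_var`, the lookup order shown, and the sample values are assumptions made for illustration, based only on the error output in this comment.

```python
# Toy model of var() resolution for a model inside a package.
# It illustrates the behaviour reported above; it is NOT dbt's code.

def lookup_var(name, package, cli_vars, project_vars):
    """Return the value var(name) would see from inside `package`."""
    # 1. --vars values are global: they are matched by top-level key only,
    #    so a value nested under "ga4" is never un-nested.
    if name in cli_vars:
        return cli_vars[name]
    # 2. vars nested under the package name in dbt_project.yml
    #    are scoped to that package.
    scoped = project_vars.get(package)
    if isinstance(scoped, dict) and name in scoped:
        return scoped[name]
    # 3. Top-level vars in dbt_project.yml apply everywhere.
    if name in project_vars:
        return project_vars[name]
    raise LookupError(f"Required var '{name}' not found in config")


# Hypothetical values standing in for the CI environment variables.
nested_cli = {"ga4": {"project": "test-gcp-project", "dataset": "analytics_test"}}
flat_cli = {"project": "test-gcp-project", "dataset": "analytics_test"}

# Flat --vars resolve, but they are also visible to every other package.
print(lookup_var("project", "ga4", flat_cli, {}))

# Nested --vars do not resolve: "project" is not a top-level key.
try:
    lookup_var("project", "ga4", nested_cli, {})
except LookupError as err:
    print("error:", err)
```

In this model the only way to scope a value to ga4 is through dbt_project.yml, which is exactly the file that cannot vary per target.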

@willbryant
Contributor Author

To resolve this, I propose supporting more specific names, e.g. "ga4_dataset" instead of "dataset". We can fall back to the current variable names, which are fine for people who only have one set of values for all targets.

Sound OK?
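
As a rough sketch of that lookup order (prefixed name first, then the current name), here is a small Python model. The helper `ga4_var` and the sample values are hypothetical. In the package itself this would presumably be written in Jinja using var() with a default, which is not shown here.

```python
# Sketch of the proposed lookup order: prefer "ga4_<name>", then fall back
# to the current un-prefixed name. The helper name is hypothetical.

def ga4_var(name, supplied_vars):
    """Resolve a ga4 setting from a flat dict of supplied variables."""
    prefixed = f"ga4_{name}"
    if prefixed in supplied_vars:
        return supplied_vars[prefixed]
    if name in supplied_vars:
        return supplied_vars[name]
    raise LookupError(f"Neither '{prefixed}' nor '{name}' was supplied")


# A per-target override can now be passed globally without clashing with
# another project's start_date (values are made up for illustration).
cli_vars = {"ga4_start_date": "20220101", "start_date": "2020-06-30"}
print(ga4_var("start_date", cli_vars))

# Existing configurations that only set the current names keep working.
legacy_vars = {"dataset": "analytics_123"}
print(ga4_var("dataset", legacy_vars))
```

The fallback keeps existing projects working unchanged, while projects that need per-target values can switch to the prefixed names.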

@willbryant
Contributor Author

Oh, I'm wrong: we can use env_var in dbt_project.yml, we just can't call project-defined stuff there.

My bad.

@willbryant
Contributor Author

willbryant commented Aug 23, 2022

Ah no, that only works under models:, seeds:, and snapshots:. It doesn't work under vars: (dbt-labs/dbt-core#4314).

willbryant reopened this Aug 23, 2022

@willbryant
Contributor Author

Even better would be to use env_var as the fallback in the var() calls. But that's not particularly standardised.

Thoughts?
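
A minimal sketch of the env_var fallback idea, again modelled in Python rather than in the package's Jinja. The environment variable names are the ones from the CI command earlier in this thread. Treating them as a package-wide convention, and the helper name `ga4_setting`, are assumptions.

```python
import os

# Environment variable names taken from the CI command earlier in the thread;
# adopting them as a package convention is an assumption.
ENV_FALLBACKS = {
    "project": "DBT_GCP_PROJECT",
    "dataset": "DBT_GA4_DATASET",
    "start_date": "DBT_GA4_START_DATE",
}


def ga4_setting(name, supplied_vars, environ=None):
    """Use an explicitly supplied var if present, else the environment."""
    environ = os.environ if environ is None else environ
    if name in supplied_vars:
        return supplied_vars[name]
    env_name = ENV_FALLBACKS.get(name)
    if env_name is not None and env_name in environ:
        return environ[env_name]
    raise LookupError(f"'{name}' not set as a var or environment variable")


# Hypothetical environment for a test target.
fake_env = {"DBT_GCP_PROJECT": "test-gcp-project"}
print(ga4_setting("project", {}, environ=fake_env))

# An explicitly supplied var still takes priority over the environment.
print(ga4_setting("project", {"project": "prod-gcp-project"}, environ=fake_env))
```

The catch is the one already noted: the package would have to pick environment variable names, and there is no standard convention for that.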

@adamribaudo-velir
Collaborator

adamribaudo-velir commented Aug 23, 2022

This is a good call-out. Can you post this question to the dbt package ecosystem Slack channel first to get input? It seems we can't be the only group that has run into this issue.

There is also a Jan 2023 plan to address project namespaces: https://github.com/dbt-labs/dbt-core/blob/main/docs/roadmap/2022-05-dbt-a-core-story.md
