Allow custom variables inside profiles.yml #2514

dalebradman · 2020-06-08T17:28:59Z

Describe the feature

I would like to be able to create and set custom variables inside ~/.dbt/profiles.yml file and then use these variables within models.

Describe alternatives you've considered

Setting environment variables. However this would require further setup instructions to ask people to add variables to their .profile, .bash_rc, .zshrc etc. which requires additional command line knowledge. It would also be harder to phase more variables in and deprecate them.
Creating a macro that can parse a yaml file (i.e. with open('~/.dbt/profiles.yml') ... ) and setting variables from there. However this is not currently possibly.

Additional context

The goal is to make the initial setup process for installing dbt as simple as possible. A lot of the users will not have command line experience, so it's a big step to get them to update their ~/.dbt/profiles.yml file. Therefore asking them to also maintain their own environment variables is not ideal and we would be able to provision these variables within our GitHub repo (and deploy efficiently).

This would also allow greater flexibility for what we can conditionally do inside dbt_project.yml and each of the individual models.

Who will this benefit?

The driver behind this is for everyone to set a DBT_USER_ALIAS variable that be used as a prefix to table names (inside generate_alias_name) and therefore prevent conflicts from occurring.

It will benefit those without that much command line experience, and subsequently administrators who spend time supporting them.

The text was updated successfully, but these errors were encountered:

jtcohen6 · 2020-06-10T14:58:27Z

Hey @dalebradman, thanks for the writeup.

I don't think we're planning to add anything to ~/.dbt/profiles.yml that doesn't immediately relate to how dbt connects to the warehouse. In the general case, if you need more values that differ based on who or where is running dbt:

environment variable
a macro that, depending on a value from target, returns a conditional set of values

The driver behind this is for everyone to set a DBT_USER_ALIAS variable that be used as a prefix to table names (inside generate_alias_name) and therefore prevent conflicts from occurring.

This sounds like a reasonable use case! I want to understand better. Do all of your development users materialize their models in the same database + schema? What value does each person set for database and schema in profiles.yml?

I'm wondering if you could create a generate_alias_name that prefixes each model with the value of target.schema. We always target.schema as a way to give each development user their own sandbox environment, and to avoid namespace collisions.

vembloud · 2020-09-23T15:02:36Z

I'm having a use case for this, too.

This is what I'm currently using to dynamically select database for my sources (in schema.yml):

sources:
  - name: extract
    database: "{{ env_var('SOURCE_EXTRACT_DB') }}"
    schema: extract
    tables: ...

I have to set env variable before running dbt and change them before running for a different profile/target, which can easily lead to errors. I'd appreciate if I could define custom variables for each target in profiles.yml and then use it like this:

sources:
  - name: extract
    database: "{{ target.source_extract_db }}"

jtcohen6 · 2020-09-23T15:11:47Z

@vembloud How many different environments are you running against? Have you considered conditional logic like:

sources:
  - name: extract
    database: "{{ 'extract_prod' if target.name == 'prod' else 'extract_dev' }}"
    schema: extract
    tables: ...

In general, we find a pretty smooth handoff point from "I'm running with a few environments and can write target-based conditional logic" to "I have a complex deployment with several environments, codified in an orchestration tool that can also reliably set environment variables."

vembloud · 2020-09-23T15:25:51Z

@jtcohen6 Thanks for your quick answer. I have considered the conditional logic, but that would require developers to follow the same target naming pattern. We'd probably stick to env variables, then :)

jtcohen6 · 2020-09-23T15:29:21Z

Fair enough!

I'm going to close this issue, since we're not planning to make the code change originally requested (custom target vars in profiles.yml).

dalebradman · 2020-11-12T10:58:22Z

@jtcohen6 apologies for the late reply, this skipped my inbox!

I want to understand better. Do all of your development users materialize their models in the same database + schema? What value does each person set for database and schema in profiles.yml?

Yes all dev users materialise their tables in the same BigQuery project. We've got a like-for-like match between our prod and dev env which is bound to datasets so creating a project per user is out of the question really (for provisioning/maintenance reasons as well).

What we did for this is asked each user to set schema in profiles.yml to firstname_lastname i.e. dale_b and then modified the generate_alias_name macro to:

{% macro generate_alias_name(custom_alias_name=none, node=none) -%}
    
    {%- if custom_alias_name is none and (target.name.split('-')[0] == 'prod' or target.schema == 'landing') -%}
    
        {{ node.name }}
        
    {%- elif target.name.split('-')[0] == 'prod' or target.schema == 'landing' -%}
    
        {{ custom_alias_name | trim }}_{{ node.name }}
        
    {%- elif custom_alias_name is none -%}
    
        {{ target.schema | trim }}_{{ node.name }}
        
    {%- else -%}
    
        {{ target.schema | trim }}_{{ custom_alias_name | trim }}_{{ node.name }}

    {%- endif -%}
{%- endmacro %}

This has worked very well for us since.

jtcohen6 · 2020-11-12T14:42:50Z

What we did for this is asked each user to set schema in profiles.yml to firstname_lastname i.e. dale_b

This is in line with what we do in development, too. I don't totally follow how there can be namespace conflicts if everyone is building into their own dev schema.

All the same, I'm glad yo've been able to use generate_alias_name and find a solution that works well.

ybressler · 2022-03-07T18:59:35Z

@vembloud How many different environments are you running against? Have you considered conditional logic like:
sources:
  - name: extract
    database: "{{ 'extract_prod' if target.name == 'prod' else 'extract_dev' }}"
    schema: extract
    tables: ...
In general, we find a pretty smooth handoff point from "I'm running with a few environments and can write target-based conditional logic" to "I have a complex deployment with several environments, codified in an orchestration tool that can also reliably set environment variables."

I dislike this solution since it is not declarative – Ideally, a model can determine its state without dependencies of env vars, at least in the yaml. I would much rather a way of storing variables with a yaml – and in the case of sensitive values, have the keys point to env variables.

cwelton · 2022-04-27T22:46:19Z

Adding my voice to the others asking for this. Being able to set variables as part of a profile is a much cleaner solution than putting that configuration into conditional expressions or requiring additional setting of environment variables.

The choice to not implement this as a capability feels like the wrong decision.

inakrin · 2022-06-18T15:21:09Z

The choice to not implement this as a capability feels like the wrong decision.
I agree as well. This feature would be extremely handy for my current project

adam-wrobel · 2022-10-14T09:44:32Z

Would be nice to have this, I have extremely long -vars "{...."} in every invocation because I have multiple depending on the environment, but they dont fit into schema of profile.yml

Combined with #4314 it makes the life very hard

franviera92 · 2023-05-17T21:30:00Z

I would also like this change, if I need to specify a vars per environment, how would I apply this change currently?

jtcohen6 · 2023-06-25T16:19:19Z

Heard!

I am sympathetic to the need for better per-env config options. Writing the same conditional expressions (Jinja-in-yaml) doesn't feel very good. Setting environment variables does feel like The Right Solution, but it's a bigger lift for simpler deployments.

I still believe this specific proposal would require overloading the profile/target, which is really just the set of necessary credentials & configuration for making & managing a data platform connection. It shouldn't be a means of setting arbitrary conditional configuration, which is what dbt makes possible with vars and env_vars.

But: There's no great way today to tie the value of vars or env_vars to the same "environments" that are already (loosely) defined as target names within profiles.yml. If anything, this makes me want to revisit the structure of profiles.yml (long overdue - #1958), in a way that might desire/require some breaking changes, and therefore land in the next major version (v2).

In the shorter term: We've been spending the last several months revisiting & revamping our configuration story, and there are several ideas proposed in #6207. Some of the options:

A dbt.env file for setting environment variables—in version control, to be shared by all users, if you so chose
A dbt_config.yml file that could live in the root directory, or the user's "home" directory (similar to profiles.yml). Should it be possible to set vars in there as well (instead of on the CLI)? That wouldn't support defining multiple different environments, though—and I'm not sure how we would support that, short of requiring an explicit flag be passed every time. (Because this file needs to be read before all other flags / env vars are resolved, we can't have a circular dependency of "read this config file to determine project/profile location" → "read the project file to determine the profile" → "read the profile to determine the default target" → "reread this config file now that we know the target.")

github-actions · 2024-01-04T01:46:59Z

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions · 2024-01-11T01:47:38Z

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

dalebradman added enhancement New feature or request triage labels Jun 8, 2020

jtcohen6 removed the triage label Jun 10, 2020

jtcohen6 closed this as completed Sep 23, 2020

github-actions bot added the triage label May 17, 2023

jtcohen6 added paper_cut A small change that impacts lots of users in their day-to-day and removed triage labels Jun 25, 2023

jtcohen6 reopened this Jun 25, 2023

binste mentioned this issue Oct 18, 2023

Add feature to automatically convert letter case between database and dbt model binste/dbt-ibis#25

Merged

github-actions bot added the stale Issues that have gone stale label Jan 4, 2024

github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jan 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow custom variables inside profiles.yml #2514

Allow custom variables inside profiles.yml #2514

dalebradman commented Jun 8, 2020

jtcohen6 commented Jun 10, 2020

vembloud commented Sep 23, 2020

jtcohen6 commented Sep 23, 2020

vembloud commented Sep 23, 2020

jtcohen6 commented Sep 23, 2020

dalebradman commented Nov 12, 2020

jtcohen6 commented Nov 12, 2020

ybressler commented Mar 7, 2022

cwelton commented Apr 27, 2022

inakrin commented Jun 18, 2022

adam-wrobel commented Oct 14, 2022

franviera92 commented May 17, 2023

jtcohen6 commented Jun 25, 2023

github-actions bot commented Jan 4, 2024

github-actions bot commented Jan 11, 2024

Allow custom variables inside profiles.yml #2514

Allow custom variables inside profiles.yml #2514

Comments

dalebradman commented Jun 8, 2020

Describe the feature

Describe alternatives you've considered

Additional context

Who will this benefit?

jtcohen6 commented Jun 10, 2020

vembloud commented Sep 23, 2020

jtcohen6 commented Sep 23, 2020

vembloud commented Sep 23, 2020

jtcohen6 commented Sep 23, 2020

dalebradman commented Nov 12, 2020

jtcohen6 commented Nov 12, 2020

ybressler commented Mar 7, 2022

cwelton commented Apr 27, 2022

inakrin commented Jun 18, 2022

adam-wrobel commented Oct 14, 2022

franviera92 commented May 17, 2023

jtcohen6 commented Jun 25, 2023

github-actions bot commented Jan 4, 2024

github-actions bot commented Jan 11, 2024