Rationalize quoting configs + properties #2986

jtcohen6 · 2020-12-31T09:43:59Z

Describe the feature

Picks up from issues like #2468 and #2975, which are narrower in scope and offer more straightforward near-term fixes

Why do we call it quoting when configuring database/schema/identifier names, but then quote when describing properties of column names?
Why does each adapter have to implement its quoting character sort-of twice? (Adapter quoting should use self.Relation.quote_character #2243)
There's also quote_columns, which is a seed-only config item that lives on its own level but surely belongs inside the quoting config item (or will it be quote?!) as columns
Not to mention quoted, which while not itself a config, returns the quoted version of a column name from a Relation based on the configs above

Instead, we should have a single config/property, and I think it should be quote. This would take over from the current project-level quoting config:

quote:
  database: true|false    # or `project` on dbt-bigquery
  schema: true|false      # or `dataset` on dbt-bigquery
  identifier: true|false
  columns: true|false

The quote: {columns: true} would also replace quote_columns as a bespoke config for seeds. If that config is specified in dbt_project.yml, it can be superseded by:

setting quote: {} inside the config() block for a specific model
quote: true|false set for a specific column in models/*.yml (it's implied that this really means quote: {column: true})
in a post-Set configs in schema.yml files #2401 world, a model can set its quote: {} config within models/*.yml, too

If quote is not set, it falls back to the default behavior of the adapter plugin, which also sets the character used for quoting (almost always " or `).
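To make the proposed precedence concrete, here is a minimal Python sketch. It is not dbt's implementation: the function name `resolve_quote`, the `ADAPTER_DEFAULT` values, and the exact merge order (adapter default, then dbt_project.yml, then the model's config, then a column-level quote) are assumptions made for illustration.

```python
from typing import Optional

# Hypothetical adapter default; the real default depends on the adapter plugin.
ADAPTER_DEFAULT = {
    "database": False,
    "schema": False,
    "identifier": False,
    "columns": False,
}


def resolve_quote(project: Optional[dict] = None,
                  model: Optional[dict] = None,
                  column: Optional[bool] = None) -> dict:
    """Return the effective quote settings for one column of one model.

    Layers are merged from least to most specific, so a later layer
    overrides only the keys it actually sets.
    """
    resolved = dict(ADAPTER_DEFAULT)
    for layer in (project, model):
        if layer:
            resolved.update(layer)
    if column is not None:
        # A bare `quote: true|false` on a column only affects column quoting.
        resolved["columns"] = column
    return resolved


# Project quotes columns, the model also quotes its identifier,
# and one specific column opts out of quoting.
print(resolve_quote(project={"columns": True},
                    model={"identifier": True},
                    column=False))
```

Under these assumptions, a key that no layer sets keeps the adapter default, which matches the fallback described above.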

Questions

Here's what we have in the docs FAQs today for sources:

By default, dbt will not quote the database, schema, or identifier for the source tables that you've specified.

Should sources start respecting project-level quote settings? Or should they continue to act independently, with a way to turn this config/property on or off for all sources in dbt_project.yml:

sources:
  quote:
    schema: true

Describe alternatives you've considered

Retaining all of these configs/properties/adapter methods and documenting them exceptionally well so as to avoid confusion

Additional context

This isn't specific to any one database, though it is likely most helpful on databases that support special characters if quoted (Postgres, Redshift) or are particularly sensitive to quoting (Snowflake).

There's a round-up of all the known documentation related to quoting in dbt-labs/docs.getdbt.com#3518.

leahwicz · 2021-06-10T17:10:58Z

Keep accepting the old way but don't advertise (don't break how users are currently using it)
Tricky: getting the inheritance to work
Related to the configs vs properties battle

nathaniel-may · 2021-06-10T17:12:20Z

What are the exit criteria for this issue?
- A new project-level config called "quote" that matches the behavior of the current quote and quoting configs.
- It can be used in config blocks to override project-level quote settings in old or new style.
What are the high-level items of the work that need to be done (i.e. create x, split out y, etc.)
- Add a new config to yaml parsing.
- Add a new config to the internal config logic
- Make sure overrides work (we probably don't get this for free.)
- Testing
What are the open questions on this issue that still need answers?
- Is there anything this would do beyond what's expressible today, or is it just ergonomic? A: Just ergonomic.
- Would we deprecate the old way of doing these things or just let them both ride? A: Not initially.
- How do we want to handle sources? (question in ticket since they seem to work independently)
Are there blockers/prerequisites to starting this work?
- I think it could be started.
- It's related to the configs vs. properties work, so make sure those don't overlap. Neat inheritance of these configs would need to take into account that some quoting stuff is dbt configs and some is dbt properties.
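
One way to keep accepting the old spellings while adding the new one is to normalize them into a single mapping at parse time. The sketch below is only an illustration in Python: `normalize_quote_config` is a hypothetical helper, and the rule that the new `quote` key wins when both old and new spellings are present is an assumption, not something decided in this thread.

```python
def normalize_quote_config(raw: dict) -> dict:
    """Fold legacy `quoting` / `quote_columns` keys into one `quote` mapping."""
    quote: dict = {}
    # Legacy project-level `quoting` supplies database/schema/identifier.
    quote.update(raw.get("quoting", {}))
    # Legacy seed-only `quote_columns` maps onto the proposed `columns` key.
    if "quote_columns" in raw:
        quote["columns"] = raw["quote_columns"]
    # Assumption: the new-style key wins when both spellings are present.
    quote.update(raw.get("quote", {}))
    return quote


legacy = {"quoting": {"identifier": True}, "quote_columns": False}
print(normalize_quote_config(legacy))
```

Because the legacy keys are only read, never rejected, existing projects would keep working under this approach.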

github-actions · 2022-05-11T02:11:42Z

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please remove the stale label or comment on the issue, or it will be closed in 7 days.

jtcohen6 · 2022-05-11T11:13:35Z

I still care about this one :)

adamcunnington-mlg · 2022-07-13T09:28:59Z

I really care about this one!

All I want to do is ensure that dbt quotes all column names that I reference or create (via sql selects) and my only option right now is to explicitly define every single column in every single model. I've not actually tried doing that but I suspect that will only quote OUTPUT columns in a model, not columns that I select during my sql.

github-actions · 2023-01-10T02:01:45Z

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions · 2023-07-17T02:10:44Z

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions · 2024-01-14T01:50:11Z

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

dbeatty10 · 2024-02-27T15:37:56Z

Here's a couple other issues related to quoting (specifically about applying proper escaping prior to quoting):

[Feature] Escape " within adapter.quote dbt-adapters#105
- quoted identifiers for database objects like catalog/database, schema, table/view relation, and columns
Why is string_literal() not safe? dbt-adapters#106
- string literals used to create column values
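
For context, the usual SQL convention for both cases is to double the delimiter when it appears inside the value. The Python sketch below shows that convention only. The helper names are hypothetical, this is not the code in dbt-adapters, and some databases use other escape rules (for example backslash escapes), so doubling alone should not be assumed safe everywhere.

```python
def quote_identifier(name: str, quote_char: str = '"') -> str:
    """Wrap `name` in `quote_char`, doubling any embedded quote characters."""
    escaped = name.replace(quote_char, quote_char * 2)
    return f"{quote_char}{escaped}{quote_char}"


def string_literal(value: str) -> str:
    """Wrap `value` in single quotes, doubling any embedded single quotes."""
    escaped = value.replace("'", "''")
    return f"'{escaped}'"


print(quote_identifier('my "odd" column'))
print(string_literal("O'Brien"))
```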

dbeatty10 · 2024-03-12T20:59:19Z

And another:

[Bug] Custom test name causes dbt test to fail when persisting failures with --store-failures #9752

gwenwindflower · 2024-06-20T18:27:38Z

this is in adapters now, but also adding that seeds are not consistently quoted and it would be cool if they were also a config option under the proposed quote config.

dbt-labs/dbt-adapters/issues/178

t0momi219 · 2024-10-31

About the problem

When using a case-sensitive database like Snowflake, it becomes challenging to use a mix of ASCII-only model names and non-ASCII model names.

Example:

models
├  only_ascii_char.sql
└  ｎｏｎ＿ａｓｃｉｉ＿ｃｈａｒｓ.sql

To run the model "ｎｏｎ＿ａｓｃｉｉ＿ｃｈａｒｓ," the identifier needs to be enclosed in double quotes. The only current option is the project-level quoting parameter, so I configure it as follows:

# dbt_project.yml
quoting:
  identifier: true

However, this causes even the only_ascii_char model to have its identifier enclosed in quotes after compilation, which means that in Snowflake, the table can only be used in lowercase:

create or replace table database.schema."only_ascii_char" ...
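
The Snowflake behavior behind this can be sketched in a few lines of Python. By default, Snowflake folds unquoted identifiers to uppercase and treats quoted identifiers as case-sensitive. The function name `snowflake_stored_name` is hypothetical, and the sketch ignores account-level settings that change this behavior.

```python
def snowflake_stored_name(identifier: str, quoted: bool) -> str:
    """Unquoted identifiers fold to uppercase; quoted ones keep their exact case."""
    return identifier if quoted else identifier.upper()


# Created unquoted: any casing in a later unquoted query resolves to the same name.
created = snowflake_stored_name("only_ascii_char", quoted=False)
assert created == snowflake_stored_name("Only_Ascii_Char", quoted=False)

# Created quoted in lowercase: an unquoted reference folds to uppercase and misses.
created_quoted = snowflake_stored_name("only_ascii_char", quoted=True)
assert created_quoted != snowflake_stored_name("only_ascii_char", quoted=False)
```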

When attempting to avoid this by using an alias, a different issue arises:

# ｎｏｎ＿ａｓｃｉｉ＿ｃｈａｒｓ.yml
models:
  - name: ｎｏｎ＿ａｓｃｉｉ＿ｃｈａｒｓ
    alias: '"ｎｏｎ＿ａｓｃｉｉ＿ｃｈａｒｓ"'

At runtime, this results in an error due to ambiguous model name detection:

Compilation Error in model ｎｏｎ＿ａｓｃｉｉ＿ｃｈａｒｓ (models/ｎｏｎ＿ａｓｃｉｉ＿ｃｈａｒｓ.sql)
  When searching for a relation, dbt found an approximate match. Instead of guessing
  which relation to use, dbt will move on. Please delete database.schema."ｎｏｎ＿ａｓｃｉｉ＿ｃｈａｒｓ", or rename it to be less ambiguous.
  Searched for :  database.schema."ｎｏｎ＿ａｓｃｉｉ＿ｃｈａｒｓ"
  Found: "database"."schema"."ｎｏｎ＿ａｓｃｉｉ＿ｃｈａｒｓ"

For people like myself who use a non-ASCII native language, it’s important to provide models in their business language (i.e., the ubiquitous language). I would like to contribute to adding this feature! Where would be the best place to start tackling this issue?

As outlined in this linked issue, is it necessary to consolidate or organize the quote options?
dbt-labs/docs.getdbt.com#3518

(I have also heard that non-ASCII characters cannot be used in unit tests. That may also be related to this one, however, I believe this should be addressed in a separate issue, so I will create a new one for that.)

dbeatty10 · 2024-10-31T14:36:45Z

Thanks for reaching out and providing such a nice example @t0momi219 !

It sounds like your goal has two parts:

Be able to write queries in Snowflake that include double quotes for the ubiquitous language (that contains non-ASCII characters):
```
select * from database.schema."注文"
```
Be able to write queries in Snowflake that do not include double quotes for other models (that only contain ASCII characters):
```
select * from database.schema.lowercase_ascii
```

So you're seeking a way to configure quoting identifier: true | false on a per-model basis rather than a project-wide basis.

Does that sound correct?

t0momi219 · 2024-11-01T00:49:37Z

Hi @dbeatty10,

you're seeking a way to configure quoting identifier: true | false on a per-model basis rather than a project-wide basis.

Exactly. Being able to override the quoting setting specified at the project level on a per-model basis would increase flexibility and be ideal.

dbeatty10 · 2024-11-06T23:47:06Z

From @jtcohen6:

IIRC - the reason why this hasn't been supported historically is that it makes operating on the relational cache very tricky, because dbt would need cache entries & lookups to be case-sensitive depending on each object’s quoting config (rather than on/off overall for all cached relation names).

@t0momi219 Thanks for your interest in working on this! This feature is tricky enough that we'd need to invest a significant amount of time and effort regardless of whether a community member works on it or we do it ourselves (see above for some details). In cases like this, we'd want to do it ourselves rather than accept community contributions. Unfortunately, this isn't a priority for us in the near term, so we don't plan on tackling it anytime soon.
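
To illustrate the cache concern, here is a rough Python sketch. It is not dbt's actual relation cache: `CacheKey`, `make_key`, and the choice to lowercase unquoted parts are all assumptions for illustration. The point is that once quoting can differ per object, the same name can produce different keys, so every lookup has to know the quoting policy of the entry it is looking for.

```python
from typing import NamedTuple


class CacheKey(NamedTuple):
    database: str
    schema: str
    identifier: str


def make_key(database: str, schema: str, identifier: str, quote: dict) -> CacheKey:
    """Build a lookup key, preserving case only for the parts that are quoted."""
    def norm(part: str, quoted: bool) -> str:
        return part if quoted else part.lower()

    return CacheKey(
        norm(database, quote.get("database", False)),
        norm(schema, quote.get("schema", False)),
        norm(identifier, quote.get("identifier", False)),
    )


unquoted = {"identifier": False}
quoted = {"identifier": True}

# With one project-wide policy, every entry is normalized the same way.
assert make_key("db", "sch", "Orders", unquoted) == make_key("db", "sch", "orders", unquoted)

# With per-object policies, the same name can yield two different keys.
assert make_key("db", "sch", "Orders", quoted) != make_key("db", "sch", "Orders", unquoted)
```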

jtcohen6 added enhancement New feature or request 1.0.0 Issues related to the 1.0.0 release of dbt labels Dec 31, 2020

jtcohen6 mentioned this issue Dec 31, 2020

“check” snapshot strategy fails if columns need to be quoted #2975

Closed

5 tasks

jtcohen6 mentioned this issue Mar 1, 2021

Quoting in Snowflake Merge Statements #3129

Closed

4 tasks

leahwicz mentioned this issue May 19, 2021

Detail and scope 1.0.0 issues #3370

Closed

18 tasks

jtcohen6 removed the 1.0.0 Issues related to the 1.0.0 release of dbt label Nov 11, 2021

jtcohen6 mentioned this issue Nov 20, 2021

quoting of column names in nested fields dbt-labs/dbt-bigquery#71

Closed

jtcohen6 mentioned this issue May 3, 2022

[Bug] Incremental compilation error not handling column with quotes #4422

Closed

1 task

github-actions bot added the stale Issues that have gone stale label May 11, 2022

jtcohen6 removed the stale Issues that have gone stale label May 11, 2022

github-actions bot added the stale Issues that have gone stale label Jan 10, 2023

jtcohen6 removed the stale Issues that have gone stale label Jan 17, 2023

jtcohen6 mentioned this issue Jan 17, 2023

dbt Constraints / model contracts #6271

Merged

14 tasks

jtcohen6 mentioned this issue May 26, 2023

[CT-2421] [Epic] Multi-project collaboration - Milestone 2 #7372

Closed

dbeatty10 mentioned this issue Jun 13, 2023

Quoting for database catalogs, schemas, tables, and columns dbt-labs/docs.getdbt.com#3518

Open

1 task

github-actions bot added the stale Issues that have gone stale label Jul 17, 2023

dbeatty10 removed the stale Issues that have gone stale label Jul 17, 2023

This was referenced Oct 4, 2023

[CT-3173] [Bug] Postgres + Grants + Username-With-Dashes does not work dbt-labs/dbt-postgres#55

Closed

Add quotes to usernames when revoking and granting table permissions #8809

Closed

github-actions bot added the stale Issues that have gone stale label Jan 14, 2024

dbeatty10 mentioned this issue Feb 26, 2024

[Feature] Escape " within adapter.quote dbt-labs/dbt-adapters#105

Open

2 tasks

github-actions bot removed the stale Issues that have gone stale label Feb 27, 2024

FlorianVc mentioned this issue Apr 25, 2024

dbt quoting does not work as expected dbt-msft/dbt-sqlserver#497

Closed

dbeatty10 mentioned this issue Jul 16, 2024

[CT-2819] default__alter_relation_add_remove_columns macro does not use quoting with case sensitive Snowflake relation dbt-labs/dbt-adapters#256

Draft

4 tasks

dbeatty10 mentioned this issue Jul 24, 2024

[Bug] Data tests do not quote columns names when mixed case #10477

Closed

2 tasks

graciegoheen added the paper_cut A small change that impacts lots of users in their day-to-day label Oct 29, 2024

dbeatty10 added the awaiting_response label Oct 31, 2024

github-actions bot added triage and removed awaiting_response labels Nov 1, 2024

dbeatty10 removed the triage label Nov 6, 2024
