Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support hierarchical config setting for SavedQueryExport configs #9065

Merged

Conversation

QMalcolm
Copy link
Contributor

resolves #8956

Problem

In #8950 we added Exports to SavedQueries. Exports have configs and we want these configs to be setable from a project's dbt_project.yml, i.e. hierarchical. Originally this work was to be part of #8892 in #8950, however due to deadlines and some pesky edge cases, we separated out this work.

Solution

In #8956 we outlined some paths forward, and we ended up taking none of them. We did not make ExportConfig a typical BaseConfig like all other configs, and we didn't use the calculate_node_config nor create any abstraction of it. Instead we decided to continue treating ExportConfig like a non-config object. An ExportConfig is mapped from an Export.config dictionary. We've taken the extra step to merge in any relevant SavedQueryConfig attributes with a preference for attributes defined directly on the ExportConfig. Additionally because the SavedQueryConfig is produced from the calculate_node_config process, we also get relevant configs in the dbt_project.yml.

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
  • This PR includes type annotations for new and modified functions

This brings the Export config object more in line with how other config
objects are specified in the unparsed definition. It allows for specifying
of extra configs, although they won't get propagate to the final config.
This allows for specifying `ExportConfig` options at the `SavedQueryConfig` level.
This also therefore allows these options to be specified in the dbt_project.yml
config. The plan in the follow up commit is to merge the `SavedQueryConfig` options
into all configs of `Exports` belonging to the saved query.

There are a couple caveots to call out:
1. We've used `schema` instead of `schema_name` on the `SavedQueryConfig` despite
it being called `schema_name` on the `ExportConfig`. This is because need `schema_name`
to be the name of the property on the `ExportConfig`, but `schema` is the user facing
specification.
2. We didn't add the `ExportConfig` `alias` property to the `SavedQueryConfig` This
is because `alias` will always be specific to a single export, and thus it doesn't
make sense to allow defining it on the `SavedQueryConfig` to then apply to all
`Exports` belonging to the `SavedQuery`
…om project config

Export configs will now inherit from saved query configs, with a preference
for export config specifications. That is to say an export config will inherity
a config attr from the saved query config only if a value hasn't been supplied
on the export config directly. Additionally because the saved query config has
a similar relationship with the project config, exports configs can inherit
from the project config (again with a preference for export config specifications).
@QMalcolm QMalcolm requested a review from a team as a code owner November 13, 2023 18:47
@QMalcolm QMalcolm requested a review from gshank November 13, 2023 18:47
@cla-bot cla-bot bot added the cla:yes label Nov 13, 2023
Copy link

codecov bot commented Nov 13, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (7fddd6e) 86.51% compared to head (5ec2363) 86.52%.
Report is 13 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #9065   +/-   ##
=======================================
  Coverage   86.51%   86.52%           
=======================================
  Files         179      179           
  Lines       26508    26539   +31     
=======================================
+ Hits        22934    22963   +29     
- Misses       3574     3576    +2     
Flag Coverage Δ
integration 83.36% <100.00%> (-0.05%) ⬇️
unit 64.77% <30.76%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -407,6 +408,8 @@ class SavedQueryConfig(BaseConfig):
default_factory=dict,
metadata=MergeBehavior.Update.meta(),
)
export_as: Optional[ExportDestinationType] = None
schema: Optional[str] = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we dropping support for alias?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I intentionally left out alias because including it felt weird 🙃 If a saved query has multiple exports, I don't think we'd want them to all get aliased to to same thing.

I didn't feel there was a great place to leave a comment about it (feels weird to comment in a class about why it doesn't have an attribute), so I put the informartion in the commit message

We didn't add the ExportConfig alias property to the SavedQueryConfig. This
is because alias will always be specific to a single export, and thus it doesn't
make sense to allow defining it on the SavedQueryConfig to then apply to all
Exports belonging to the SavedQuery

Happy to add a comment to the class if that feels like the correct approach. Alternatively, if we do want to add alias we can, although I do worry that could cause other problems.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Have we documented alias in user documentation yet? If so it might make sense to support some no-op backwards-compatible interface here given we are backporting this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will run through it with @graciegoheen / @jtcohen6. As of right now exports as a whole don't appear to be documented https://docs.getdbt.com/docs/build/saved-queries 🙃

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right! We closed out this docs issue until exports is fully ready to go dbt-labs/docs.getdbt.com#4381

Relevant slack thread here -> https://dbt-labs.slack.com/archives/C05K4R7KZ5Z/p1699478482262759

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@QMalcolm Here's one new PR for saved queries:
dbt-labs/docs.getdbt.com#4469

…a_name

I somehow wrote a really weird, but also valid, conditional statement. Previously
the conditional was
```
if combined.get("schema") is not combined.get("schema_name") is None:
```
which basically checked whether `schema` was a boolean that didn't match
the boolean of whether `schema_name` was None. This would pretty much
always evaluate to True because `schema` should be a string or none, not
a bool, and thus would never match the right hand side. Crazy. It has now
been fixed to do the thing we want to it to do. If `schema` isn't `None`,
and `schema_name` is `None`, then set `schema_name` to have the value of
`schema`.
assert export1.config.schema_name == "my_custom_export_schema"
assert export1.config.schema_name != saved_query.config.schema

# assert Export `my_export` has its configs defined from the saved_query because they should take priority
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

saved_query = result.result.saved_queries["saved_query.test.test_saved_query"]
assert saved_query.config.export_as == ExportDestinationType.VIEW

# change export's `export_as` to `TABLE` via project config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@MichelleArk MichelleArk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@QMalcolm QMalcolm merged commit c2f7d75 into main Nov 14, 2023
51 checks passed
@QMalcolm QMalcolm deleted the qmalcolm--8956-hierarchical-saved-query-export-config-support branch November 14, 2023 19:13
github-actions bot pushed a commit that referenced this pull request Nov 14, 2023
* Add test asserting `SavedQuery` configs can be set from `dbt_project.yml`

* Allow extraneous properties in Export configs

This brings the Export config object more in line with how other config
objects are specified in the unparsed definition. It allows for specifying
of extra configs, although they won't get propagate to the final config.

* Add `ExportConfig` options to `SavedQueryConfig` options

This allows for specifying `ExportConfig` options at the `SavedQueryConfig` level.
This also therefore allows these options to be specified in the dbt_project.yml
config. The plan in the follow up commit is to merge the `SavedQueryConfig` options
into all configs of `Exports` belonging to the saved query.

There are a couple caveots to call out:
1. We've used `schema` instead of `schema_name` on the `SavedQueryConfig` despite
it being called `schema_name` on the `ExportConfig`. This is because need `schema_name`
to be the name of the property on the `ExportConfig`, but `schema` is the user facing
specification.
2. We didn't add the `ExportConfig` `alias` property to the `SavedQueryConfig` This
is because `alias` will always be specific to a single export, and thus it doesn't
make sense to allow defining it on the `SavedQueryConfig` to then apply to all
`Exports` belonging to the `SavedQuery`

* Begin inheriting configs from saved query config, and transitively from project config

Export configs will now inherit from saved query configs, with a preference
for export config specifications. That is to say an export config will inherity
a config attr from the saved query config only if a value hasn't been supplied
on the export config directly. Additionally because the saved query config has
a similar relationship with the project config, exports configs can inherit
from the project config (again with a preference for export config specifications).

* Correct conditional in export config building for map schema to schema_name

I somehow wrote a really weird, but also valid, conditional statement. Previously
the conditional was
```
if combined.get("schema") is not combined.get("schema_name") is None:
```
which basically checked whether `schema` was a boolean that didn't match
the boolean of whether `schema_name` was None. This would pretty much
always evaluate to True because `schema` should be a string or none, not
a bool, and thus would never match the right hand side. Crazy. It has now
been fixed to do the thing we want to it to do. If `schema` isn't `None`,
and `schema_name` is `None`, then set `schema_name` to have the value of
`schema`.

* Update parameter names in `_get_export_config` to be more verbose

(cherry picked from commit c2f7d75)
QMalcolm added a commit that referenced this pull request Nov 15, 2023
* Add test asserting `SavedQuery` configs can be set from `dbt_project.yml`

* Allow extraneous properties in Export configs

This brings the Export config object more in line with how other config
objects are specified in the unparsed definition. It allows for specifying
of extra configs, although they won't get propagate to the final config.

* Add `ExportConfig` options to `SavedQueryConfig` options

This allows for specifying `ExportConfig` options at the `SavedQueryConfig` level.
This also therefore allows these options to be specified in the dbt_project.yml
config. The plan in the follow up commit is to merge the `SavedQueryConfig` options
into all configs of `Exports` belonging to the saved query.

There are a couple caveots to call out:
1. We've used `schema` instead of `schema_name` on the `SavedQueryConfig` despite
it being called `schema_name` on the `ExportConfig`. This is because need `schema_name`
to be the name of the property on the `ExportConfig`, but `schema` is the user facing
specification.
2. We didn't add the `ExportConfig` `alias` property to the `SavedQueryConfig` This
is because `alias` will always be specific to a single export, and thus it doesn't
make sense to allow defining it on the `SavedQueryConfig` to then apply to all
`Exports` belonging to the `SavedQuery`

* Begin inheriting configs from saved query config, and transitively from project config

Export configs will now inherit from saved query configs, with a preference
for export config specifications. That is to say an export config will inherity
a config attr from the saved query config only if a value hasn't been supplied
on the export config directly. Additionally because the saved query config has
a similar relationship with the project config, exports configs can inherit
from the project config (again with a preference for export config specifications).

* Correct conditional in export config building for map schema to schema_name

I somehow wrote a really weird, but also valid, conditional statement. Previously
the conditional was
```
if combined.get("schema") is not combined.get("schema_name") is None:
```
which basically checked whether `schema` was a boolean that didn't match
the boolean of whether `schema_name` was None. This would pretty much
always evaluate to True because `schema` should be a string or none, not
a bool, and thus would never match the right hand side. Crazy. It has now
been fixed to do the thing we want to it to do. If `schema` isn't `None`,
and `schema_name` is `None`, then set `schema_name` to have the value of
`schema`.

* Update parameter names in `_get_export_config` to be more verbose

(cherry picked from commit c2f7d75)
QMalcolm added a commit that referenced this pull request Nov 15, 2023
* Add test asserting `SavedQuery` configs can be set from `dbt_project.yml`

* Allow extraneous properties in Export configs

This brings the Export config object more in line with how other config
objects are specified in the unparsed definition. It allows for specifying
of extra configs, although they won't get propagate to the final config.

* Add `ExportConfig` options to `SavedQueryConfig` options

This allows for specifying `ExportConfig` options at the `SavedQueryConfig` level.
This also therefore allows these options to be specified in the dbt_project.yml
config. The plan in the follow up commit is to merge the `SavedQueryConfig` options
into all configs of `Exports` belonging to the saved query.

There are a couple caveots to call out:
1. We've used `schema` instead of `schema_name` on the `SavedQueryConfig` despite
it being called `schema_name` on the `ExportConfig`. This is because need `schema_name`
to be the name of the property on the `ExportConfig`, but `schema` is the user facing
specification.
2. We didn't add the `ExportConfig` `alias` property to the `SavedQueryConfig` This
is because `alias` will always be specific to a single export, and thus it doesn't
make sense to allow defining it on the `SavedQueryConfig` to then apply to all
`Exports` belonging to the `SavedQuery`

* Begin inheriting configs from saved query config, and transitively from project config

Export configs will now inherit from saved query configs, with a preference
for export config specifications. That is to say an export config will inherity
a config attr from the saved query config only if a value hasn't been supplied
on the export config directly. Additionally because the saved query config has
a similar relationship with the project config, exports configs can inherit
from the project config (again with a preference for export config specifications).

* Correct conditional in export config building for map schema to schema_name

I somehow wrote a really weird, but also valid, conditional statement. Previously
the conditional was
```
if combined.get("schema") is not combined.get("schema_name") is None:
```
which basically checked whether `schema` was a boolean that didn't match
the boolean of whether `schema_name` was None. This would pretty much
always evaluate to True because `schema` should be a string or none, not
a bool, and thus would never match the right hand side. Crazy. It has now
been fixed to do the thing we want to it to do. If `schema` isn't `None`,
and `schema_name` is `None`, then set `schema_name` to have the value of
`schema`.

* Update parameter names in `_get_export_config` to be more verbose

(cherry picked from commit c2f7d75)
QMalcolm added a commit that referenced this pull request Nov 15, 2023
…) (#9074)

* Add test asserting `SavedQuery` configs can be set from `dbt_project.yml`

* Allow extraneous properties in Export configs

This brings the Export config object more in line with how other config
objects are specified in the unparsed definition. It allows for specifying
of extra configs, although they won't get propagate to the final config.

* Add `ExportConfig` options to `SavedQueryConfig` options

This allows for specifying `ExportConfig` options at the `SavedQueryConfig` level.
This also therefore allows these options to be specified in the dbt_project.yml
config. The plan in the follow up commit is to merge the `SavedQueryConfig` options
into all configs of `Exports` belonging to the saved query.

There are a couple caveots to call out:
1. We've used `schema` instead of `schema_name` on the `SavedQueryConfig` despite
it being called `schema_name` on the `ExportConfig`. This is because need `schema_name`
to be the name of the property on the `ExportConfig`, but `schema` is the user facing
specification.
2. We didn't add the `ExportConfig` `alias` property to the `SavedQueryConfig` This
is because `alias` will always be specific to a single export, and thus it doesn't
make sense to allow defining it on the `SavedQueryConfig` to then apply to all
`Exports` belonging to the `SavedQuery`

* Begin inheriting configs from saved query config, and transitively from project config

Export configs will now inherit from saved query configs, with a preference
for export config specifications. That is to say an export config will inherity
a config attr from the saved query config only if a value hasn't been supplied
on the export config directly. Additionally because the saved query config has
a similar relationship with the project config, exports configs can inherit
from the project config (again with a preference for export config specifications).

* Correct conditional in export config building for map schema to schema_name

I somehow wrote a really weird, but also valid, conditional statement. Previously
the conditional was
```
if combined.get("schema") is not combined.get("schema_name") is None:
```
which basically checked whether `schema` was a boolean that didn't match
the boolean of whether `schema_name` was None. This would pretty much
always evaluate to True because `schema` should be a string or none, not
a bool, and thus would never match the right hand side. Crazy. It has now
been fixed to do the thing we want to it to do. If `schema` isn't `None`,
and `schema_name` is `None`, then set `schema_name` to have the value of
`schema`.

* Update parameter names in `_get_export_config` to be more verbose

(cherry picked from commit c2f7d75)

Co-authored-by: Quigley Malcolm <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CT-3299] [SL] Support Hierarchical Configs for Saved Queries and their Exports
4 participants