[Feature] add selector: method to combine YAML selectors with command line selection #10992

Open
3 tasks done
mroy-seedbox opened this issue Nov 13, 2024 · 1 comment
Labels
enhancement New feature or request triage

Comments

@mroy-seedbox

mroy-seedbox commented Nov 13, 2024

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

Introduce a new selector method that explicitly selects a YAML selector on the command line, so that its result can be combined with the full power of CLI selection (union, intersection, exclude, etc.).

This would solve #5009, #10991, and #10596 (and probably others).

A selector method already exists (see also #1628, #4821, and #4827), but only for selector inheritance within YAML selectors. It is not available on the command line, which is the gap this feature would fill.
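For reference, here is a minimal sketch of how the existing selector method is used for inheritance inside selectors.yml. The selector names, tag, and path are hypothetical and only serve as illustration:

```yaml
# selectors.yml (sketch; names, tag, and path are hypothetical)
selectors:
  - name: nightly
    description: Everything tagged "nightly"
    definition:
      method: tag
      value: nightly

  - name: nightly_finance
    description: Reuses "nightly" and narrows it to one folder
    definition:
      intersection:
        # The selector method below is what already exists today,
        # but only inside YAML definitions like this one.
        - method: selector
          value: nightly
        - method: path
          value: models/finance
```

Today this can only be invoked as a whole, for example with dbt ls --selector nightly_finance. The request is to make the same reference available inside -s and --exclude.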

YAML selectors as subsets

The real power here would be turning YAML selectors into predefined selection subsets (rather than the primary selection). For example, instead of dbt ls -s something,tag:one,tag:two,tag:three,etc., we could define a tag_one_two_three_etc selector and then simply run dbt ls -s something,selector:tag_one_two_three_etc. This would make selection logic reusable even in cases that do not deserve a dedicated selector for every variation, and would make it possible to encapsulate repetitive or complex selection logic in predefined, reusable "selection components".
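As a sketch, the tag_one_two_three_etc selector from the example could be defined as below. Since commas on the command line mean intersection, the definition is assumed to be an intersection of the tags; the description text is made up:

```yaml
# selectors.yml (sketch of the selector named in the example above)
selectors:
  - name: tag_one_two_three_etc
    description: Nodes carrying all of the listed tags
    definition:
      intersection:
        - method: tag
          value: one
        - method: tag
          value: two
        - method: tag
          value: three

# Proposed usage (NOT supported by dbt today; this is the feature request):
#   dbt ls -s something,selector:tag_one_two_three_etc
```

The selector would then behave like any other term in the selection, so it could be intersected, unioned, or excluded alongside ordinary criteria.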

Composition over inheritance

Composition over inheritance is a very valuable software design principle, and dbt users would benefit from it being applied to selectors. Right now, the command line offers composition but no inheritance, whereas YAML selectors offer inheritance but effectively no composition. Composition in YAML is possible, but only at the cost of preemptively creating a selector for every possible combination, which very quickly becomes unmanageable (so in effect it is multiple inheritance rather than composition, and that is exactly where composition over inheritance comes to the rescue). You could imagine selector_1_with_abc, selector_1_with_def, selector_1_with_abc_and_def, selector_2_with_abc, and on and on... and before you know it, YAML selectors are no longer DRY.

I think this is a serious question/consideration: is selection in dbt meant to be composed, or inherited? Inheritance usually works better for really simple software classes... but in the end, composition almost always offers a superior alternative. An obvious example is that everyone would favor List(<Animal>) over ListOfAnimals (which would inherit from List). Selector inheritance really doesn't seem like the right direction.

Grouping & Negating

Another powerful use case is to use YAML selectors to create negative selections, which could then be used in --exclude selector:not_something. There is an example described here.
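A rough sketch of what such a negative selector might look like, using the exclude key that YAML selectors already support. The linked example is not reproduced here, so the contents of not_something are an assumption: everything except nodes tagged "something" (a hypothetical tag):

```yaml
# selectors.yml (sketch; the definition of not_something is assumed)
selectors:
  - name: not_something
    description: Everything except nodes tagged "something"
    definition:
      union:
        - method: fqn
          value: "*"
        - exclude:
            - method: tag
              value: something

# Proposed usage (NOT supported by dbt today; this is the feature request):
#   dbt ls -s some_selection --exclude selector:not_something
```

Under that assumption, excluding not_something from a selection would leave only the selected nodes that are tagged "something".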

Dynamic selection

This would also be really useful to better handle cases where the CLI selection is generated dynamically (possibly outside of dbt entirely) in order to produce a much DRYer selection syntax. For example, existing YAML selectors could be offered as parameters to choose from in the selection (with the possibility to compose a selection from them).

To me it seems to be a no-brainer at this point that we must be able to dynamically compose selectors together with other selection criteria (including other selectors). That's how model selection started, and it was the right direction. YAML selectors helped with the ability to define selectors as version-controlled code, but at the cost of losing the ability to dynamically compose selections.

Describe alternatives you've considered

There are none. Right now, we are forced to choose between YAML selectors and CLI selection; we cannot combine the two.

Who will this benefit?

dbt users with advanced selection needs, especially those who want to tweak selectors for custom or ad-hoc use cases.

It would also benefit every dbt user by enabling YAML selectors to be used as subsets on the CLI.

Are you interested in contributing this feature?

Maybe. Not quite sure where to start.

Anything else?

No response

@mroy-seedbox
Author

mroy-seedbox commented Nov 13, 2024

Another useful example: dbt run -s something --exclude selector:large_models
