-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow --select
and --exclude
multiple times
#7169
Conversation
@jtcohen6 we need to make a functional decision between one of these three options:
We have same options for I'm most attracted to option 1 as I think it would the least surprising to the end user. |
Besides
So we'd need to make the make a similar decision amongst the three options for each of them as well. They could either be covered in this PR or deferred to a separate issue(s). |
@dbeatty10 Thanks for laying out the options so clearly! I agree that option 0 (status quo) is the most confusing & least desirable. My initial instinct was for option 2 (raise an error), but I think you've convinced me that option 1 is preferable. It's actually the default expected behavior for many CLIs (including |
@@ -178,7 +183,7 @@ | |||
"Space-delimited listing of node properties to include as custom keys for JSON output " | |||
"(e.g. `--output json --output-keys name resource_type description`)" | |||
), | |||
type=list, | |||
type=tuple, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Context
When I left this as type=list
, a couple tests failed:
But when I updated it to type=tuple
, they passed again:
Although it is not enforced, there is a comment here that says:
MultiOption options must be specified with type=tuple or type=ChoiceTuple
Suggestion
To avoid unforeseen issues in the future, maybe we should enforce that the configured type
is one of the following whenever cls=MultiOption
?
- subclass of
tuple
- instance of
ChoiceTuple
One option for enforcing this within the __init__
method of MultiOption
:
option_type = kwargs.pop("type", None)
if inspect.isclass(option_type):
assert issubclass(option_type, tuple), "type must be tuple or ChoiceTuple not {}".format(option_type)
else:
assert isinstance(option_type, ChoiceTuple), "type must be tuple or ChoiceTuple not {}".format(option_type)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented this suggestion in 33c6cc4
Shall we apply option 1 to each of the following as well? Or just
I tested both I think those four are all the ones using our custom Even when specifying |
@dbeatty10 I agree with applying this change consistently across the board, to all existing Going forward, it sounds like any new param with |
…=tuple` (or `ChoiceTuple`) and `multiple=True`
# validate that multiple=True | ||
multiple = kwargs.pop("multiple", None) | ||
msg = f"MultiOption named `{self.name}` must have multiple=True (rather than {multiple})" | ||
assert multiple, msg | ||
|
||
# validate that type=tuple or type=ChoiceTuple | ||
option_type = kwargs.pop("type", None) | ||
msg = f"MultiOption named `{self.name}` must be tuple or ChoiceTuple (rather than {option_type})" | ||
if inspect.isclass(option_type): | ||
assert issubclass(option_type, tuple), msg | ||
else: | ||
assert isinstance(option_type, ChoiceTuple), msg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very open to feedback here -- I just stuffed quick-n-dirty validation in the easiest place possible.
if value: | ||
value = tuple(flatten(value)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why, but the value can be None
. We only want to flatten it when it has a non-None value.
Maybe it's just me, but should the behavior be set intersection instead of set union? Using a Pandas data frame as an example, I know that: df
.filter(f1)
.filter(f2) will return me rows that are in the intersection of filters It's definitely a less ergonomic option (imagining a use case like Slim CI), but feels more mathematically correct. As I write this, I'm not very convinced of this line of thinking, but I'll leave it up to you to ponder 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes looks good to me implementation wise!! I will leave the choice of union vs intersect to you @dbeatty.
Good eyes @aranke ! 🤩 Suppose a user expresses
Key question
OptionsIn this case, we have two options of how to combine each of the arguments:
Explanations
Pros of set union
Cons of set union
Pros of set intersection
Cons of set intersection
ProposalMy belief is that a majority of dbt users will expect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
resolves #3003 ## What are you changing in this pull request and why? See dbt-labs/dbt-core#7169 in addition to the issues this resolves. ## Pages updated - [Upgrading to v1.5](https://deploy-preview-3093--docs-getdbt-com.netlify.app/guides/migration/versions/upgrading-to-v1.5#breaking-changes) - [YAML Selectors](https://deploy-preview-3093--docs-getdbt-com.netlify.app/reference/node-selection/yaml-selectors#exclude) - [How does selection work?](https://deploy-preview-3093--docs-getdbt-com.netlify.app/reference/node-selection/syntax#how-does-selection-work) ## 🎩 [Netlify preview](https://deploy-preview-3093--docs-getdbt-com.netlify.app/reference/node-selection/yaml-selectors#exclude) ### 1.4 <img width="500" alt="image" src="https://user-images.githubusercontent.com/44704949/228322779-8b2f4d69-17a5-4326-8ced-7add389ebea0.png"> ### 1.5 <img width="500" alt="image" src="https://user-images.githubusercontent.com/44704949/228322883-08443f70-ea22-43db-a190-728b6e2e3b3c.png"> ## Checklist Uncomment if you're publishing docs for a prerelease version of dbt (delete if not applicable): - [x] Add versioning components, as described in [Versioning Docs](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/versioningdocs.md) - [x] Add a note to the prerelease version [Migration Guide](https://github.com/dbt-labs/docs.getdbt.com/tree/current/website/docs/guides/migration/versions) - [x] Review the [Content style guide](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/content-style-guide.md) and [About versioning](https://github.com/dbt-labs/docs.getdbt.com/blob/current/contributing/single-sourcing-content.md#adding-a-new-version) so my content adheres to these guidelines.
resolves #7158
resolves #6800
Problem
Currently, if a user does the following, only the last selector (
users_rollup
) will be take effect and earlier ones (users
) will not:Selection syntax is confusing enough as-is, and the fact that it accepts multiples but only uses the last one makes it even more confusing.
Options
#7158 describes two options for how to move forward:
--select
as if it were a union. Similar for--exclude
.--select
is supplied. Same thing for--exclude
.This PR fleshes out option 1.
Note: most of it would also be applicable if we choose to implement option 2 instead.
Implicitly, there's also an option 0 that keeps the status quo. If we choose option 0, I'd still advocate that we cherry-pick the two test cases in 2824f4a so we can actively affirm that this is the expected and desired behavior.
Breaking change
This is described as a breaking change in the changelog because the behavior changes for certain edge cases (see above).
🎩
Our custom
MultiOption
class itself doesn't currently have any unit tests. Its main form of testing is the functional tests that depend upon it.We expect that any parameter with MultiOption should also have (
type=tuple
ortype=ChoiceTuple
) andmultiple=True
.Here's the exceptions that are raised when either of those expectations are violated:
Not
type=tuple
and nottype=ChoiceTuple
:Not
multiple=True
:Checklist
changie new
to create a changelog entry