Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move Variable aggregations to NamedArray #8304

Merged
merged 17 commits into from
Oct 17, 2023

Conversation

dcherian
Copy link
Contributor

@dcherian dcherian commented Oct 13, 2023

TODO:
- [  ] the handling of global keep_attrs is not tested well. This should
be breaking DataArray/Dataset tests!
- [ ] Look at `numeric_only` in NAMED_ARRAY_OBJECT
- [ ] Do we want to support `axis`?
@dcherian dcherian mentioned this pull request Oct 13, 2023
20 tasks
@dcherian dcherian requested a review from andersy005 October 14, 2023 02:23
# to True by NamedArray.mean.
# Instead we need to make VariableAggregations mixin with .mean,
# and delegate to NamedArray.reduce setting keep_attrs explicitly
result = self._to_named_array().reduce(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re the comment: This is quite annoying.

@dcherian dcherian force-pushed the named-array-reductions branch from a613abe to b94d78d Compare October 14, 2023 03:08
@dcherian dcherian marked this pull request as ready for review October 14, 2023 03:09
@andersy005 andersy005 added the run-benchmark Run the ASV benchmark workflow label Oct 14, 2023
@github-actions github-actions bot added the topic-NamedArray Lightweight version of Variable label Oct 14, 2023
@andersy005 andersy005 removed the run-benchmark Run the ASV benchmark workflow label Oct 14, 2023
Copy link
Member

@andersy005 andersy005 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks solid to me 👍🏽! thank you for working on this, @dcherian

@dcherian
Copy link
Contributor Author

Don't know how to fix these.

xarray/namedarray/core.py: note: In member "_get_axis_num" of class "NamedArray":
Generated Cobertura report: /home/runner/work/xarray/xarray/mypy_report/cobertura.xml
xarray/namedarray/core.py:353: error: Returning Any from function declared to return "int"  [no-any-return]
xarray/namedarray/core.py: note: In member "reduce" of class "NamedArray":
xarray/namedarray/core.py:476: error: Function is missing a type annotation for one or more arguments  [no-untyped-def]
xarray/core/variable.py: note: In member "reduce" of class "Variable":
xarray/core/variable.py:1707: error: Signature of "reduce" incompatible with supertype "NamedArray"  [override]
xarray/core/variable.py:1707: note:      Superclass:
xarray/core/variable.py:1707: note:          def reduce(self, func: Callable[..., Any], dim: str | Iterable[Hashable] | ellipsis | None = ..., axis: int | Sequence[int] | None = ..., keepdims: bool = ..., **kwargs: Any) -> Any
xarray/core/variable.py:1707: note:      Subclass:
xarray/core/variable.py:1707: note:          def reduce(self, func: Callable[..., Any], dim: str | Iterable[Hashable] | ellipsis | None = ..., axis: int | Sequence[int] | None = ..., keep_attrs: bool | None = ..., keepdims: bool = ..., **kwargs: Any) -> Variable

@andersy005
Copy link
Member

@dcherian, are we okay w/ merging this as is, and fixing the typing issue in a separate pull request? i personally don't know how to address the typing issues.

@andersy005
Copy link
Member

Cc @Illviljan for help if you happen to have time :)

@max-sixty
Copy link
Collaborator

are we okay w/ merging this as is, and fixing the typing issue in a separate pull request?

No strong view on fixing vs. ignoring, but I would vote quite strongly to explicitly ignore errors, rather than break the tests, since it's difficult to oversee tests whose good state is a break...

@dcherian
Copy link
Contributor Author

I think the issue is real and tied to not having keep_attrs in NamedArray.reduce but having it in Variable.reduce.

        # Noe that the call order for Variable.mean is
        #    Variable.mean -> NamedArray.mean -> Variable.reduce
        #    -> NamedArray.reduce

One option would be to use a to_named_array and from_named_array pattern.

@dcherian
Copy link
Contributor Author

One option would be to use a to_named_array and from_named_array pattern.

didn't work.

@dcherian dcherian changed the title Move aggregations to NamedArray Move Variable aggregations to NamedArray Oct 17, 2023
@dcherian dcherian added the plan to merge Final call for comments label Oct 17, 2023
@andersy005 andersy005 added the run-benchmark Run the ASV benchmark workflow label Oct 17, 2023
@andersy005 andersy005 enabled auto-merge (squash) October 17, 2023 17:58
@andersy005 andersy005 removed the run-benchmark Run the ASV benchmark workflow label Oct 17, 2023
@andersy005 andersy005 merged commit 88285f9 into pydata:main Oct 17, 2023
30 checks passed
self,
func: Callable[..., Any],
dim: Dims = None,
axis: int | Sequence[int] | None = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is NamedArray big fans of axis inputs? I thought this was a mistake in xarray.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I forgot to raise this as an issue. It's definitely a mistake for Dataset. DataArray and Variable do have ordered dimensions so in principle it's OK but discouraged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
plan to merge Final call for comments topic-html-repr topic-NamedArray Lightweight version of Variable
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants