Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better default behavior of the Coordinates constructor #8107

Merged
merged 5 commits into from
Aug 31, 2023

Conversation

benbovy
Copy link
Member

@benbovy benbovy commented Aug 23, 2023

  • Tests added
  • User visible changes (including notable bug fixes) are documented in whats-new.rst

After working more on Coordinates I realize that the default behavior of its constructor could be more consistent with other Xarray objects. This PR changes this default behavior such that:

  • Pandas indexes are created for dimension coordinates if indexes=None (default). To create dimension coordinates with no index, just pass indexes={}.
  • If another Coordinates object is passed as input, its indexes are also added to the new created object. Since we don't support alignment / merge here, the following call raises an error: xr.Coordinates(coords=xr.Coordinates(...), indexes={...}).

This PR introduces a breaking change since Coordinates are now exposed in v2023.8.0, which has just been released. It is a bit unfortunate but I think it may be OK for a fresh feature, especially if the next release will be soon after this one.

... for any input dimension coordinate, if ``indexes=None``.

Also, if another ``Coordinates`` object is passed, extract its indexes
and raise if ``indexes`` is not None (no align/merge supported here).
Copy link
Member

@shoyer shoyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look great to me, thank you!

I think it is OK getting them in even if they break a few edges cases.

@dcherian
Copy link
Contributor

This PR introduces a breaking change since Coordinates are now exposed in v2023.8.0, which has just been released. It is a bit unfortunate but I think it may be OK for a fresh feature, especially if the next release will be soon after this one.

I also think this is totally fine.

variables = {k: as_variable(v) for k, v in coords.items()}
variables = {}
for name, data in coords.items():
var = as_variable(data, name=name)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still converts any coordinate input in the form {"x": array_like} to an IndexVariable, which triggers data loading by coercing array_like to a pd.Index.

It will be fixed in another PR (#8124).

As a workaround, {"x": ("x", array_like)} does not coerce array_like.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benbovy I'm pretty sure this workaround currently does not work. I tried to create a Coordinates object from scratch with no indexes and the only way I was able to do it required reverting xarray to v2023.08.0, i.e. just before this PR was merged.

See the end of https://gist.github.com/TomNicholas/d9eb8ac81d3fd214a23b5e921dbd72b7 for full context.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raised with MCVE in #8704

@benbovy
Copy link
Member Author

benbovy commented Aug 31, 2023

Great, thanks @shoyer and @dcherian, merging!

@benbovy benbovy merged commit 0f9f790 into pydata:main Aug 31, 2023
@benbovy benbovy deleted the coordinates-init-default-behavior branch August 31, 2023 07:37
benbovy added a commit to TomNicholas/xarray that referenced this pull request Feb 6, 2024
* ``Coordinates.__init__`` create default indexes

... for any input dimension coordinate, if ``indexes=None``.

Also, if another ``Coordinates`` object is passed, extract its indexes
and raise if ``indexes`` is not None (no align/merge supported here).

* add docstring examples

* fix doctests

* fix tests

* update what's new
benbovy added a commit that referenced this pull request Mar 26, 2024
* as_variable: deprecate converting to IndexVariable

* fix multi-index edge case

* Better default behavior of the Coordinates constructor (#8107)

* ``Coordinates.__init__`` create default indexes

... for any input dimension coordinate, if ``indexes=None``.

Also, if another ``Coordinates`` object is passed, extract its indexes
and raise if ``indexes`` is not None (no align/merge supported here).

* add docstring examples

* fix doctests

* fix tests

* update what's new

* fix deprecation warning

after unintentionally reverted a valid previous change.

* avoid unnecessary auto-creation of index to avoid userwarning

* catch expected FutureWarnings in test_as_variable

* check for coercion to IndexVariable

* whatsnew

---------

Co-authored-by: Benoit Bovy <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

4 participants