renaming axis coords and dims seem to keep some memory of older names #9542

Closed
5 tasks
jfpiolle opened this issue Sep 24, 2024 · 4 comments
Labels
bug needs triage Issue that has not been reviewed by xarray team member

Comments

@jfpiolle

What happened?

I think there is a bug when renaming axis dimensions and variables.

Loading a file with the following coordinates:

import xarray as xr

ds = xr.open_dataset('model_test.nc')
print(ds.coords)

Coordinates:
  * longitude  (longitude) float32 -180.0 -179.5 -179.0 ... 178.5 179.0 179.5
  * latitude   (latitude) float32 -78.0 -77.5 -77.0 -76.5 ... 79.0 79.5 80.0
    time       datetime64[ns] ...

I want to rename these axes as lat and lon (for the sake of consistency between different datasets):

ds = ds.rename_vars({'longitude': 'lon', 'latitude': 'lat'}).swap_dims({'longitude': 'lon', 'latitude': 'lat'})
print(ds.coords)

which returns:

Coordinates:
  * lon      (lon) float32 -180.0 -179.5 -179.0 -178.5 ... 178.5 179.0 179.5
  * lat      (lat) float32 -78.0 -77.5 -77.0 -76.5 -76.0 ... 78.5 79.0 79.5 80.0
    time     datetime64[ns] ...

So far, everything is fine. Now I want to make a shallow copy of this dataset:

ds2 = ds.copy(deep=False)
print(ds2.coords)

which prints:

Coordinates:
  * lon      (longitude) float32 -180.0 -179.5 -179.0 ... 178.5 179.0 179.5
  * lat      (latitude) float32 -78.0 -77.5 -77.0 -76.5 ... 78.5 79.0 79.5 80.0
    time     datetime64[ns] ...

The dimensions of lat / lon have reverted to their original names (latitude / longitude)!

I attached below the input file I used.

test_model.nc.gz
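
The same sequence of calls can be tried without the attached file. The following is a minimal sketch, assuming a synthetic 0.5 degree grid is an adequate stand-in for model_test.nc; the data variable name `sst` and the helper `coord_dims` are made up for illustration and do not come from the report. Whether the mismatch shows up with in-memory data may depend on the installed xarray version, so the script only reports what it finds.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the attached file (assumption: a regular 0.5 degree
# grid; the data variable name "sst" is made up for illustration).
lon = np.arange(-180.0, 180.0, 0.5, dtype="float32")
lat = np.arange(-78.0, 80.5, 0.5, dtype="float32")
ds0 = xr.Dataset(
    {"sst": (("latitude", "longitude"),
             np.zeros((lat.size, lon.size), dtype="float32"))},
    coords={"longitude": lon, "latitude": lat},
)

names = {"longitude": "lon", "latitude": "lat"}

# Same sequence as in the report: rename the variables, then swap the dims.
ds = ds0.rename_vars(names).swap_dims(names)
ds2 = ds.copy(deep=False)


def coord_dims(dataset):
    """Map each renamed coordinate to the dims its variable reports."""
    return {name: dataset.variables[name].dims for name in ("lon", "lat")}


print("original:", coord_dims(ds))
print("copy:    ", coord_dims(ds2))
print("copy keeps the renamed dims:", coord_dims(ds) == coord_dims(ds2))
```

If the last line prints `False`, the shallow copy has fallen back to the old dimension names, which is the behavior described above.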

What did you expect to happen?

I expected the copy to have the same dim and coord names as the dataset I copied.

Minimal Complete Verifiable Example

No response

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-150-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: fr_FR.UTF-8
LOCALE: ('fr_FR', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2

xarray: 2024.9.0
pandas: 2.2.2
numpy: 2.1.0
scipy: 1.14.1
netCDF4: 1.7.1
pydap: None
h5netcdf: 1.3.0
h5py: 3.11.0
zarr: None
cftime: 1.6.4
nc_time_axis: None
iris: None
bottleneck: None
dask: 2024.9.0
distributed: 2024.8.1
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2024.6.1
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 72.2.0
pip: 24.2
conda: None
pytest: 8.3.2
mypy: None
IPython: 8.26.0
sphinx: 7.4.7

@jfpiolle jfpiolle added bug needs triage Issue that has not been reviewed by xarray team member labels Sep 24, 2024
@dcherian
Contributor

Does a simple rename work for you instead of rename_vars + swap_dims? swap_dims is a bit broken at the moment.
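
For reference, a minimal sketch of the suggested alternative, assuming a small synthetic dataset in place of the reporter's file. `Dataset.rename` renames the coordinate variables and their dimensions in one step, so `swap_dims` is not involved:

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the reporter's file (assumption: only the two
# dimension coordinates matter for this check).
ds0 = xr.Dataset(
    coords={
        "longitude": np.arange(-180.0, 180.0, 0.5),
        "latitude": np.arange(-78.0, 80.5, 0.5),
    }
)

# rename() updates both the variable names and the dimension names.
ds = ds0.rename({"longitude": "lon", "latitude": "lat"})

# The shallow copy from the report, applied to the renamed dataset.
ds2 = ds.copy(deep=False)
print(ds2.coords)
```

Here the coordinates of the copy should be listed as `lon (lon)` and `lat (lat)`.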

@max-sixty
Collaborator

Possibly a dupe of #8646

@max-sixty
Collaborator

Closing because it's a dupe, but I really feel your pain @jfpiolle, I wish our objects were consistent and robust.

I would strongly vote to fix this above any other feature enhancements to indexes (CC @pydata/xarray...)

@max-sixty max-sixty closed this as not planned Won't fix, can't repro, duplicate, stale Oct 3, 2024
@benbovy
Member

benbovy commented Oct 3, 2024

I totally agree about prioritizing consistency and robustness!

But at the same time, one of the reasons why the goals and potential of flexible indexes have not been realized yet is the amount of time it took to accommodate "legacy" behavior and API with the new Xarray data model.

Implicit handling of pandas.MultiIndex objects and swap_dims are good examples of such behavior / API: it was much easier to make it work using the concept of dimension coordinates than it is now using the broader concept of (multi-)indexed (non-)dimension coordinates. See #8646 for details.

At some point we need to make some sharp decisions if we want to move forward with this (e.g., leave swap_dims in a semi-broken state with a clear deprecation warning + suggest best alternatives). I'm happy to implement them.

Or if someone wants to take a stab at fixing those issues, please go for it! I did spend too much time on things like this (instead of working on features, documentation, examples, etc.) and it's been frustrating.


4 participants