Selection of MultiIndex makes following unstack wrong #2619

Closed
fujiisoup opened this issue Dec 18, 2018 · 2 comments

@fujiisoup
Member

fujiisoup commented Dec 18, 2018

Code Sample, a copy-pastable example if possible

import numpy as np 
import xarray as xr 

ds = xr.DataArray(np.arange(40).reshape(8, 5), dims=['x', 'y'],  
                  coords={'x': np.arange(8), 'y': np.arange(5)}).stack(xy=['x', 'y']) 
ds.isel(xy=ds['x'] < 4).unstack() 

Out[1]: 
<xarray.DataArray (x: 8, y: 5)>
array([[ 0.,  1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.,  9.],
       [10., 11., 12., 13., 14.],
       [15., 16., 17., 18., 19.],
       [nan, nan, nan, nan, nan],
       [nan, nan, nan, nan, nan],
       [nan, nan, nan, nan, nan],
       [nan, nan, nan, nan, nan]])
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7
  * y        (y) int64 0 1 2 3 4

Problem description

After unstack(), the result still contains the x labels (4 to 7) that the preceding isel excluded, filled with NaN.
Is this possibly an upstream bug?

Expected Output

Out[1]: 
<xarray.DataArray (x: 4, y: 5)>
array([[ 0.,  1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.,  9.],
       [10., 11., 12., 13., 14.],
       [15., 16., 17., 18., 19.]])
Coordinates:
  * x        (x) int64 0 1 2 3
  * y        (y) int64 0 1 2 3 4

Output of xr.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.7.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-42-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

xarray: 0.10.9
pandas: 0.23.4
numpy: 1.15.4
scipy: 1.1.0
netCDF4: 1.4.2
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.0.2.1
PseudonetCDF: None
rasterio: None
iris: None
bottleneck: None
cyordereddict: None
dask: 1.0.0
distributed: 1.25.0
matplotlib: 3.0.1
cartopy: None
seaborn: None
setuptools: 40.5.0
pip: 18.1
conda: None
pytest: 4.0.1
IPython: 7.1.1
sphinx: None

@shoyer
Member

shoyer commented Dec 19, 2018

I agree, this is an xarray bug.

It seems that we need to drop unused levels in the MultiIndex as part of unstack(), by invoking pd.MultiIndex.remove_unused_levels().
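
For readers following along, here is a minimal pandas-only sketch of the behaviour described above. It assumes that the stacked `xy` coordinate is backed by a `pd.MultiIndex` equivalent to the product index built below; the variable names are illustrative and are not taken from xarray's internals.

```python
import numpy as np
import pandas as pd

# Assumption: this product index mirrors what stack(xy=['x', 'y']) produces.
index = pd.MultiIndex.from_product(
    [np.arange(8), np.arange(5)], names=["x", "y"]
)

# Boolean selection keeps only the entries with x < 4 ...
selected = index[index.get_level_values("x") < 4]

# ... but the stored level for x still lists all eight labels.
stale_x = list(selected.levels[0])

# remove_unused_levels() returns a new index whose levels contain
# only the labels that are actually referenced.
cleaned = selected.remove_unused_levels()
clean_x = list(cleaned.levels[0])

print(stale_x)  # all eight x labels, 0 through 7
print(clean_x)  # only 0 through 3
```

Any code that rebuilds the unstacked coordinates from `levels` rather than from the entries in use would therefore see all eight x labels, which is consistent with the NaN rows in the report.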

shoyer added the bug label Dec 19, 2018
@fujiisoup
Member Author

Thanks @shoyer

I also noticed that .sel behaves incorrectly after the selection:

In [9]: ds.isel(xy=ds['x'] < 4).sel(x=5)  # should be KeyError
Out[9]: 
<xarray.DataArray (y: 0)>
array([], dtype=int64)
Coordinates:
  * y        (y) int64 

Probably better to call pd.MultiIndex.remove_unused_levels() after each selection.
I'll send a fix.
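
A rough sketch of what "prune after each selection" could look like at the pandas level. `select_and_prune` is a hypothetical helper written for illustration only; it is not xarray's API and not necessarily how the eventual fix is implemented.

```python
import pandas as pd


def select_and_prune(index: pd.MultiIndex, mask) -> pd.MultiIndex:
    """Hypothetical helper: boolean-select entries, then drop unused level labels."""
    return index[mask].remove_unused_levels()


# Same 8 x 5 product index as in the report (assumed to mirror the stacked coordinate).
index = pd.MultiIndex.from_product([range(8), range(5)], names=["x", "y"])
mask = index.get_level_values("x") < 4

pruned = select_and_prune(index, mask)

# Pruning changes only the stored levels, not the entries themselves.
same_entries = pruned.equals(index[mask])
x_labels = list(pruned.levels[0])
```

Since `remove_unused_levels()` leaves the entries and their order unchanged, applying it after every selection should not alter the selected data; the trade-off is the extra work of recomputing the levels each time.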
