-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix multiindex selection #2621
Fix multiindex selection #2621
Conversation
Hello @fujiisoup! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on December 24, 2018 at 12:51 Hours UTC |
Should we update levels after indexing or when stacking? Pandas does the later:
The advantage of waiting until unstacking is that removing unused levels could be expensive compared to indexing: it runs in time O(L+N), where L is the size of the original level and N is the length of the new multi-index. |
This issue does not matter only in But what is the good place to do it actually? Probably the best for the efficiency is to keep a flag for it in |
Can you explain why matters for My inclination was just to copy what pandas does (which is only removing unused levels in |
Thanks. I will take a look the source later.
Selection of non-exsiting level variable should be KeyError, but it gives a 0-size index. In [8]: ds = xr.DataArray(np.arange(40).reshape(8, 5), dims=['x', 'y'],
coords={'x': np.arange(8), 'y': np.arange(5)}).stack(xy=['x', 'y'])
In [9]: ds.isel(xy=ds['x'] < 4).sel(x=5) # should be KeyError
Out[9]:
<xarray.DataArray (y: 0)>
array([], dtype=int64)
Coordinates:
* y (y) int64 |
But this problem of |
I moved |
xarray/core/pdcompat.py
Outdated
@@ -0,0 +1,79 @@ | |||
import numpy as np | |||
import pandas as pd | |||
import pandas.core.algorithms as algos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we do a local import here instead the function?
I'm a little nervous that some future version of pandas may drop this (private) module, which would then break imports even though this isn't actually used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Fixed.
import pandas as pd | ||
import pandas.core.algorithms as algos | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you copy the pandas license directly into this file, too? See coding/cftimesindex.py
for an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
* master: DEP: drop python 2 support and associated ci mods (pydata#2637) TST: silence warnings from bottleneck (pydata#2638) revert to dev version DOC: fix docstrings and doc build for 0.11.1 Source encoding always set when opening datasets (pydata#2626) Add flake check to travis (pydata#2632) Fix dayofweek and dayofyear attributes from dates generated by cftime_range (pydata#2633) silence import warning (pydata#2635) fill_value in shift (pydata#2470) Flake fixed (pydata#2629) Allow passing of positional arguments in `apply` for Groupby objects (pydata#2413) Fix failure in time encoding for pandas < 0.21.1 (pydata#2630) Fix multiindex selection (pydata#2621) Close files when CachingFileManager is garbage collected (pydata#2595) added some logic to deal with rasterio objects in addition to filepaths (pydata#2589) Get 0d slices of ndarrays directly from indexing (pydata#2625) FIX Don't raise a deprecation warning for xarray.ufuncs.{angle,iscomplex} (pydata#2615) CF: also decode time bounds when available (pydata#2571)
unstack
wrong #2619whats-new.rst
for all changes andapi.rst
for new APIFix using
MultiIndex.remove_unused_levels()