ENH: points coord from isel/sel_points should be a MultiIndex #1493

Closed
jhamman opened this issue Jul 27, 2017 · 1 comment

Comments

jhamman (Member) commented Jul 27, 2017

We implemented the pointwise indexing methods (isel_points and sel_points) before we had MultiIndex support. Would it make sense to update these methods to return objects with coordinates defined as a MultiIndex?

Current behavior:

print('original --> \n', ds)

lons = [-88, -85.9]
lats = [34.2, 31.9]

subset = ds.sel_points(lon=lons, lat=lats, method='nearest')
print('subset --> \n', subset)

yields:

original --> 
 <xarray.Dataset>
Dimensions:  (lat: 224, lon: 464, time: 19709)
Coordinates:
  * lat      (lat) float64 25.06 25.19 25.31 25.44 25.56 25.69 25.81 25.94 ...
  * lon      (lon) float64 -124.9 -124.8 -124.7 -124.6 -124.4 -124.3 -124.2 ...
  * time     (time) float64 5.548e+04 5.548e+04 5.548e+04 5.548e+04 ...
Data variables:
    pcp      (time, lat, lon) float64 nan nan nan nan nan nan nan nan nan ...
subset --> 
 <xarray.Dataset>
Dimensions:  (points: 2, time: 19709)
Coordinates:
    lat      (points) float64 34.19 31.94
    lon      (points) float64 -87.94 -85.94
  * time     (time) float64 5.548e+04 5.548e+04 5.548e+04 5.548e+04 ...
Dimensions without coordinates: points
Data variables:
    pcp      (points, time) float64 0.0 5.698 0.0 0.0 14.66 0.0 0.0 0.0 0.0 ...

Maybe it makes sense to return an object with a MultiIndex like:

new = pd.MultiIndex.from_arrays([subset.lon.to_index(),
                                subset.lat.to_index()],
                                names=['lon', 'lat'])
print(new)
MultiIndex(levels=[[-87.9375, -85.9375], [31.9375, 34.1875]],
           labels=[[0, 1], [1, 0]],
           names=['lon', 'lat'])
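
As a rough sketch of what such a return value could look like, the snippet below builds a small stand-in for `subset` (synthetic values, a three-step time axis, not the real dataset) and uses `Dataset.set_index` to combine the per-point `lon` and `lat` coordinates into a MultiIndex on `points`. This only illustrates the idea; it is not what `sel_points` does or would necessarily do.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Stand-in for the `subset` shown above (assumed shape: 2 points, 3 time steps).
subset = xr.Dataset(
    {"pcp": (("points", "time"), np.zeros((2, 3)))},
    coords={
        "lat": ("points", [34.1875, 31.9375]),
        "lon": ("points", [-87.9375, -85.9375]),
        "time": [0.0, 1.0, 2.0],
    },
)

# Combine the two per-point coordinates into a MultiIndex on `points`.
indexed = subset.set_index(points=["lon", "lat"])

# The `points` dimension is now backed by a pandas MultiIndex.
points_index = indexed.indexes["points"]
print(points_index)
```

With this layout `points` is no longer a dimension without coordinates, and `lon` and `lat` become levels of its index.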

xref: #214, #475, #507

shoyer (Member) commented Jul 27, 2017

If we can finish up @fujiisoup's work (#1473) bringing indexing with broadcasting to xarray, then we could possibly do away with sel_points and isel_points altogether.

This doesn't resolve the issue of whether the indexed coordinates should be in a MultiIndex or not. This does make a certain amount of sense, but to be honest, I'm a little divided here -- I'm still not entirely happy with how MultiIndex works in xarray (see #1426 for more discussion).
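
For illustration, pointwise selection through broadcasting-style indexing can be sketched as follows, assuming an xarray release in which `sel` accepts `DataArray` indexers that share a dimension. The small `ds` here is synthetic, and the dimension name `points` is chosen only to mirror the `sel_points` output above.

```python
import numpy as np
import xarray as xr

# Small synthetic grid standing in for the real dataset.
ds = xr.Dataset(
    {"pcp": (("time", "lat", "lon"), np.arange(24.0).reshape(2, 3, 4))},
    coords={
        "time": [0.0, 1.0],
        "lat": [30.0, 32.0, 34.0],
        "lon": [-90.0, -88.0, -86.0, -84.0],
    },
)

# Indexers that share the dimension "points" are matched up element by
# element instead of forming an outer product.
lons = xr.DataArray([-88, -85.9], dims="points")
lats = xr.DataArray([34.2, 31.9], dims="points")

subset = ds.sel(lon=lons, lat=lats, method="nearest")
print(subset)
```

As in the `sel_points` output, `lat` and `lon` come back as plain coordinates along `points` rather than as a MultiIndex, so the question raised in this issue is unchanged.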

jhamman closed this as completed Sep 7, 2017