Improve typehints of xr.Dataset.__getitem__ #4144
Sadly this is not working with my version of mypy. See python/mypy#7328
The mypy check throws an error:
@overload
def __getitem__(self, key: Hashable) -> DataArray:
    ...
@overload
def __getitem__(self, key: Iterable[Hashable]) -> "Dataset":
    ...
? Only guessing, though - it was
@mathause On further consideration, I think it might not be possible to get this to work. This method has three behaviors:
With my limited understanding of Would a good middle ground be something like this?
I think this would work since both the input/outputs of the first one are subtypes of the second one. It's not a complete solution, but it would solve the most common problem of |
Given mypy's use of overloads, I think this is all we can do. If the argument is not Hashable, then return the Union type as before.
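The middle ground described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `DataArray` and `Dataset` here are hypothetical stand-in classes rather than the real xarray types, and whether mypy accepts the pair of overloads without complaint may depend on the mypy version.

```python
from typing import Any, Dict, Hashable, Union, overload


class DataArray:
    """Hypothetical stand-in for xarray.DataArray (illustration only)."""

    def __init__(self, name: Hashable) -> None:
        self.name = name


class Dataset:
    """Hypothetical stand-in for xarray.Dataset (illustration only)."""

    def __init__(self, variables: Dict[Hashable, DataArray]) -> None:
        self._variables = variables

    # Most common case: a hashable key selects a single variable.
    @overload
    def __getitem__(self, key: Hashable) -> DataArray: ...

    # Fallback: anything else keeps the broad Union return type as before.
    @overload
    def __getitem__(self, key: Any) -> "Union[DataArray, Dataset]": ...

    def __getitem__(self, key: Any) -> "Union[DataArray, Dataset]":
        if isinstance(key, Hashable):
            return self._variables[key]
        # Assumption: a non-hashable key is an iterable of variable names.
        return Dataset({k: self._variables[k] for k in key})
```

With overloads like these, a type checker can treat `ds["foo"]` as a `DataArray`, while `ds[["foo", "bar"]]` still falls back to the Union.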
Okay. Assuming the tests pass, I think this is ready for review. I tried adding a test, but mypy didn't seem to find problems even with code that I know doesn't work (e.g. In any case, this code does work:
|
Seems this was already discussed in GH3210 (comment) and see also the TODO: Lines 1244 to 1250 in 8f688ea
(although it is not entirely clear to me whether this is actually fixed or not) |
No problem! I think I am done with this one unless you think it's important that I document or test this somehow. Can someone review it?
I took the liberty to rework it, please have a look
from typing import Hashable, Mapping
import xarray

ds: xarray.Dataset

class D(Hashable, Mapping):
    def __hash__(self): ...
    def __getitem__(self, item): ...
    def __iter__(self): ...
    def __len__(self): ...

reveal_type(ds["foo"])
reveal_type(ds[["foo", "bar"]])
reveal_type(ds[{}])
reveal_type(ds[D()])
mypy output:
@@ -1241,13 +1242,25 @@ def loc(self) -> _LocIndexer:
        """
        return _LocIndexer(self)

    def __getitem__(self, key: Any) -> "Union[DataArray, Dataset]":
        # FIXME https://github.com/python/mypy/issues/7328
It looks like this is fixed now and can be removed? Or perhaps we move it below, above the third @overload, and add a comment that Any means list?
I don't think it's fixed. Specifically, mypy can't handle a signature that is overloaded on both Mapping and Hashable. Curiously, however, once you add # type: ignore to the overloaded signature, the actual type inspection works just fine (see my test script above).
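To make the overlap concrete, here is a rough sketch of a three-way overload with the ignores placed on the signatures. It is an assumption-laden illustration: `DataArray` and `Dataset` are hypothetical stand-ins, the return types are inferred from this discussion, and the runtime branches are placeholders rather than real indexing logic.

```python
from typing import Any, Hashable, Mapping, overload


class DataArray:
    """Hypothetical stand-in for xarray.DataArray (illustration only)."""


class Dataset:
    """Hypothetical stand-in for xarray.Dataset (illustration only)."""

    # An object can be both a Mapping and Hashable (see class D below), so
    # mypy may flag these two signatures as overlapping; hence the ignores.
    @overload
    def __getitem__(self, key: Mapping) -> "Dataset": ...  # type: ignore

    @overload
    def __getitem__(self, key: Hashable) -> DataArray: ...  # type: ignore

    # Assumption from the review comment: Any effectively means a list here.
    @overload
    def __getitem__(self, key: Any) -> "Dataset": ...

    def __getitem__(self, key):
        # Placeholder dispatch that checks Mapping first, like the overloads.
        if isinstance(key, Mapping):
            return Dataset()
        if isinstance(key, Hashable):
            return DataArray()
        return Dataset()


class D(Hashable, Mapping):
    """A key type that satisfies both Hashable and Mapping."""

    def __hash__(self):
        return 0

    def __getitem__(self, item):
        raise KeyError(item)

    def __iter__(self):
        return iter(())

    def __len__(self):
        return 0
```

In this sketch the ambiguous key `D()` takes the Mapping branch because that branch is checked first.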
@crusaderky Thanks for the re-work. For my own benefit, could you explain why that code worked? I remember writing something very similar, and running into mypy errors. My understanding of how mypy interprets overload seems incomplete.
@nbren12 it seems to me that mypy is being overly aggressive when parsing the hinted code (hence why I had to put |
To resolve some common type-related errors, this PR adds some overload type hints to Dataset.__getitem__. Now mypy can correctly infer that hashable inputs return DataArrays.
xr.Dataset.__getitem__ #4125
isort -rc . && black . && mypy . && flake8