-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TYPING: Added types for some tests #29205
Conversation
Working around a strange typing issue. See pandas-dev#28394 (comment) for more, but the types on these were being inferred incorrectly by mypy with just the addition of the `allows_duplicate_labels` kwarg.
@@ -32,6 +33,7 @@ | |||
FilePathOrBuffer = Union[str, Path, IO[AnyStr]] | |||
|
|||
FrameOrSeries = TypeVar("FrameOrSeries", bound="NDFrame") | |||
IndexOrSeries = TypeVar("IndexOrSeries", bound="IndexOpsMixin") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this meant to include array likes, eg EA? as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, just Index and Series.
pd.Index, | ||
] # type: List[Union[Type[pd.Index], Type[pd.RangeIndex], Type[pd.Series]]] | ||
left = [pd.RangeIndex(10, 40, 10)] # type: List[Union[Index, Series]] | ||
for cls in index_or_series_params: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn’t this any_numeric_dtype ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to include float16. Aside from that, they look the same at a glance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we barely support float16 would just remove and use the fixture
Any chance you can post the failures this was solving? Would be helpful as background |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good. Some minor things I think can help the typing and legibility.
Nice to parametrize here as well
index_or_series_params = [ | ||
pd.Series, | ||
pd.Index, | ||
] # type: List[Union[Type[pd.Index], Type[pd.RangeIndex], Type[pd.Series]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
] # type: List[Union[Type[pd.Index], Type[pd.RangeIndex], Type[pd.Series]]] | |
] # type: Sequence[Type[Union[pd.Index, pd.Series]]] |
Can shorten this quite a bit if you put the Union
inside of the Type
. Also, if you decide to use Sequence
instead of list it covariant, so can implicitly handle RangeIndex
being a subclass of Index
pandas/tests/test_base.py
Outdated
from pandas.core.indexes.datetimelike import DatetimeIndexOpsMixin | ||
import pandas.util.testing as tm | ||
|
||
index_or_series = [Index, Series] # type: List[Type[IndexOpsMixin]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you not reusing the definition in pandas._typing
here for a particular reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missed this one.
@@ -790,6 +791,17 @@ def tick_classes(request): | |||
return request.param | |||
|
|||
|
|||
index_or_series_params = [pd.Index, pd.Series] # type: IndexOrSeries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# type: List[IndexOrSeries]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be List[Type[IndexOrSeries]]
? Surprised this passed checks as is; seems like something wonky going on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch.
agree that strange it doesn't fail
index_or_series_params = [ | ||
pd.Series, | ||
pd.Index, | ||
] # type: Sequence[Type[Union[pd.Index, pd.Series]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
List[Type[IndexOrSeries]]
import pandas.core.strings as strings | ||
import pandas.util.testing as tm | ||
from pandas.util.testing import assert_index_equal, assert_series_equal | ||
|
||
index_or_series_params = [Index, Series] # type: List[Type[PandasObject]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
List[Type[IndexOrSeries]]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this, I get
pandas/tests/test_strings.py:18: error: Type variable "pandas._typing.IndexOrSeries" is unbound
Any suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably don't need to create IndexOrSeries in _typing. so the union or the shared class is fine, but should probably be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@simonjayhawkins do you know generally how mypy is inferencing these? I guess from the original list of errors I'm not sure where Type[PandasObject]
even comes into play. Wondering if there isn't a core issue we are missing here, but I may also just be overlooking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my first reaction is that it is a mypy error. Revealed type is 'builtins.list[def (data: Any =, Any =, Any =, name: Any =, Any =, Any =, **Any) -> pandas.core.base.PandasObject]'
seems iffy.
but agree, that it may be better to try and determine the cause rather than making these changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if types are getting mangled from pytest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think pytest is relevant...
1 from typing import List, Type, cast
2
3 import pandas as pd
4
5 reveal_type([pd.Index, pd.Series])
6
7 [pd.Index, pd.Series]
8
9 cast(List[Type[pd.core.base.PandasObject]], [pd.Index, pd.Series])
$ mypy test.py
test.py:5: note: Revealed type is 'builtins.list[def (data: Any =, Any =, Any =, name: Any =, Any =, Any =, Any =) -> pandas.core.base.PandasObject]'
test.py:5: error: List item 0 has incompatible type "Type[Index]"; expected "Type[PandasObject]"
test.py:7: error: List item 0 has incompatible type "Type[Index]"; expected "Type[PandasObject]"
(pandas-dev)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think the reveal_type([pd.Index, pd.Series])
should be giving an error which is why i suspect a mypy issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm.....so when I run the above this is all I get:
$ mypy -V
mypy 0.720
$ mypy test.py
test.py:5: note: Revealed type is 'builtins.list[builtins.type*]'
Wonder why we get different output.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to clone #28394
Here's a somewhat reduced example (edit, simplified some more) # ok.py
from typing import List, Type, cast
class Base:
pass
class Series(Base):
def __init__(self, a=None):
pass
class Index(Base):
def __new__(cls, a=None, **kwargs) -> "Index":
result = object.__new__(cls)
return result
reveal_type([Index, Series])
[Index, Series]
cast(List[Type[Base]], [Index, Series]) # bug.py
from typing import List, Type, cast
class Base:
pass
class Series(Base):
def __init__(self, a=None, z=None):
pass
class Index(Base):
def __new__(cls, a=None, **kwargs) -> "Index":
result = object.__new__(cls)
return result
reveal_type([Index, Series])
[Index, Series]
cast(List[Type[Base]], [Index, Series]) so the only difference is the new keyword. ( The outputs are bug.py:20: note: Revealed type is 'builtins.list[def (a: Any =, Any =) -> bug.Base]'
bug.py:20: error: List item 0 has incompatible type "Type[Index]"; expected "Type[Base]"
bug.py:21: error: List item 0 has incompatible type "Type[Index]"; expected "Type[Base]"
ok.py:20: note: Revealed type is 'builtins.list[builtins.type*]' |
A couple observations:
|
May be onto something... This passes
# bug.py
from typing import List, Type, cast, Any, Mapping, Hashable
class Base:
pass
class Series(Base):
def __init__(self, a: Any = None, z: bool=False) -> None:
pass
class Index(Base):
def __new__(cls, a: Any = None, **kwargs: Mapping[Hashable, Any]) -> "Index":
result = object.__new__(cls) # type: Index
return result
reveal_type([Index, Series])
[Index, Series]
cast(List[Type[Base]], [Index, Series]) |
I asked about this on the typing Gitter and one of the devs shared this as where the "best" super type determination gets made: Obviously here we have PandasObject and IndexOpsMixin as shared super types, and in this case it appears mypy infers PandasObject Haven't stepped through the code but may provide some insight. In an case from comments seems like this behavior might be tenuous at best, so something to consider |
might be clearer to use |
Unfortunately, I can't get the suggestion in #29205 (comment) to type |
Given that this seems like a mypy bug, any objection to just adding |
my preference would be a
|
I don't think this is a bug. It's just ambiguous as to what our intent is here. Does making the TypeVar point to PandasObject work? I guess having a TypeVar use a Mixin might not make a ton of sense anyway |
another alternative to |
The assignment to |
That's a great example. I think the expression will always be ambiguous as to the intent whereas in the assignment we can clearly state what is expected. Can't we do that here in this PR but change A Literal |
to be clear. mypy is infering a type for a list of two items, and then is immediately producing an error with the inferred type. i.e. the message in the error includes so pretty sure this is a bug and what would need to happen to un-ignore is the bug to be fixed. |
I think I'll close this. Will ignore as needed on my other PR. |
Working around a strange typing issue. See
#28394 (comment)
for more, but the types on these were being inferred incorrectly by
mypy with just the addition of the
allows_duplicate_labels
kwarg.cc @simonjayhawkins & @WillAyd. Hopefully I got things correct. I copied the use of TypeVar for FrameOrSeries with IndexOrSeries.