Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP for MyPy CI Integration #25622

Closed
wants to merge 13 commits into from
Closed

Conversation

WillAyd
Copy link
Member

@WillAyd WillAyd commented Mar 10, 2019

progress towards #25601

Not complete just pushing for now. Main goal here is to get a CI run that works for any modules existing with comments.

@pep8speaks
Copy link

pep8speaks commented Mar 10, 2019

Hello @WillAyd! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 1835:5: E124 closing bracket does not match visual indentation

Comment last updated at 2019-03-11 19:40:33 UTC

mypy.ini Outdated
@@ -0,0 +1,8 @@
[mypy]
follow_imports=silent
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is less than ideal but without this the errors are out of control. My motivation for setting this comes from the Mypy documentation:

https://mypy.readthedocs.io/en/latest/running_mypy.html#following-imports

To wit:

If you are planning on adding type hints to a large, existing code base, we recommend you start by trying to make your entire codebase (including files that do not use type hints) pass under --follow-imports=normal. This is usually not too difficult to do: mypy is designed to report as few error messages as possible when it is looking at unannotated code.

If doing this is intractable, we recommend passing mypy just the files you want to type check and use --follow-imports=silent. Even if mypy is unable to perfectly type check a file, it can still glean some useful information by parsing it (for example, understanding what methods a given object has). See Using mypy with an existing codebase for more recommendations.

mypy.ini Outdated
[mypy-numpy.*]
ignore_missing_imports=True

[mypy-pandas._libs.*]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally doesn't look like Mypy likes Cython imports. Have this in the config but also noted specifically on imports within the code.

There's probably a better approach to this just haven't found yet

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the PEP, you should be able to use .pyi/stub files to annotate compiled extensions. Not ideal though since they'd have to stay in-sync with the extension

@@ -0,0 +1,17 @@
pandas/core/dtypes/base.py
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are all of the files that currently have some sort of hints in them. The motivation for this comes form this part in the documentation:

https://mypy.readthedocs.io/en/latest/running_mypy.html#reading-a-list-of-files-from-a-file

So thinking for initial CI runs we can whitelist the modules we want to run against, though this is ideally just a temporary file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear when running this from the project root you would do mypy @mypy_whitelist.txt

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Naddiseo any chance you have experience with this? Reading from docs I think suggested approach will be to whitelist particular modules at the outset and slowly open up as more are added.

I'd like to avoid having two files to control configuration here but I don't see an easy way in the .ini file to control which modules actually get checked

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillAyd, not with the whitelist approach. I went with the blacklist approach instead, and had a bunch of modules and packages with ignore_errors set.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. Was that done per package / module in the config file?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently I'm just ignoring packages. But, I think at one point I was doing both, and I seem to remember doing a mixture where I'd ignore everything in a package except a specific file.
I think it looked like:

[mypy-package.subpackage]
ignore_errors=True
[mypy-package.subpackage.module]
ignore_errors=False

However, I don't remember if it worked or not.


import numpy as np

from pandas._libs import Timestamp, groupby as libgroupby
from pandas._libs import Timestamp, groupby as libgroupby # type: ignore
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See note on Cython imports

@WillAyd
Copy link
Member Author

WillAyd commented Mar 10, 2019

One issue in existings that I wasn't sure how to resolve stems from the ExtensionBlock in internals, specifically this line:

def _can_hold_na(self):

It's superclass and all other subclasses derived therefrom treat this as an attribute, whereas here it's a property. I think that's the root of the issue

@Naddiseo
Copy link
Contributor

One issue in existings that I wasn't sure how to resolve stems from the ExtensionBlock in internals, specifically this line:

pandas/pandas/core/internals/blocks.py

Line 1693 in e28ae70
def _can_hold_na(self):

It's superclass and all other subclasses derived therefrom treat this as an attribute, whereas here it's a property. I think that's the root of the issue

Interesting issue, it might be a bug in mypy. Although, it might also be a legitimate typing error: my guess would be that mypy is saying "you told me it's a property, so I expect that I can do @Class.property.setter, however, you also said it's None in which case None.setter is an error"

@jreback jreback added the Code Style Code style, linting, code_checks label Mar 10, 2019
.gitignore Outdated
@@ -111,4 +112,4 @@ doc/build/html/index.html
# Windows specific leftover:
doc/tmp.sv
env/
doc/source/savefig/
doc/source/savefig/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need a newline here

pandas/core/groupby/groupby.py
pandas/core/internals/blocks.py
pandas/core/internals/managers.py
pandas/core/common.py
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this not possible in the setup file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I was hoping that Mypy_path in the ini file would work but didn't have any luck. I didn't also see a very good way to make this readable therein since the ini would expect this to be a comma separated list.

With that said this may just be temporary anyway - would ideally like to run on the entire code. Modules without existing types would be ignored by default anyway

@codecov
Copy link

codecov bot commented Mar 10, 2019

Codecov Report

Merging #25622 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25622      +/-   ##
==========================================
+ Coverage   91.26%   91.26%   +<.01%     
==========================================
  Files         173      173              
  Lines       52968    52974       +6     
==========================================
+ Hits        48339    48345       +6     
  Misses       4629     4629
Flag Coverage Δ
#multiple 89.84% <100%> (ø) ⬆️
#single 41.72% <100%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/dtypes/base.py 100% <100%> (ø) ⬆️
pandas/core/internals/blocks.py 94.08% <100%> (ø) ⬆️
pandas/core/groupby/groupby.py 97.21% <100%> (ø) ⬆️
pandas/core/internals/managers.py 93.93% <100%> (+0.01%) ⬆️
pandas/core/common.py 98.39% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e28ae70...2abc6f5. Read the comment docs.

@codecov
Copy link

codecov bot commented Mar 10, 2019

Codecov Report

Merging #25622 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25622      +/-   ##
==========================================
+ Coverage   91.26%   91.26%   +<.01%     
==========================================
  Files         173      173              
  Lines       52968    52992      +24     
==========================================
+ Hits        48339    48364      +25     
+ Misses       4629     4628       -1
Flag Coverage Δ
#multiple 89.84% <100%> (ø) ⬆️
#single 41.73% <90.24%> (+0.02%) ⬆️
Impacted Files Coverage Δ
pandas/core/arrays/integer.py 96.32% <ø> (ø) ⬆️
pandas/core/dtypes/base.py 100% <100%> (ø) ⬆️
pandas/core/arrays/base.py 98.27% <100%> (+0.02%) ⬆️
pandas/core/arrays/datetimelike.py 97.68% <100%> (ø) ⬆️
pandas/core/internals/blocks.py 94.08% <100%> (ø) ⬆️
pandas/core/indexes/datetimelike.py 98.52% <100%> (ø) ⬆️
pandas/core/arrays/period.py 98.54% <100%> (+0.01%) ⬆️
pandas/core/base.py 97.76% <100%> (ø) ⬆️
pandas/core/arrays/timedeltas.py 88.22% <100%> (+0.02%) ⬆️
pandas/core/frame.py 96.79% <100%> (ø) ⬆️
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e28ae70...ac7e768. Read the comment docs.

@WillAyd
Copy link
Member Author

WillAyd commented Mar 10, 2019

Changes so far were low effort fixes. Here's what's left when running mypy @mypy_whitelist.txt:

pandas/core/base.py:853: error: "IndexOpsMixin" has no attribute "_values"; maybe "_map_values"?
pandas/core/base.py:978: error: "IndexOpsMixin" has no attribute "values"
pandas/core/arrays/array_.py:235: error: Item "str" of "Union[str, Any, ExtensionDtype, None]" has no attribute "construct_array_type"
pandas/core/arrays/array_.py:235: error: Item "None" of "Union[str, Any, ExtensionDtype, None]" has no attribute "construct_array_type"
pandas/core/arrays/datetimelike.py:365: error: "DatetimeLikeArrayMixin" has no attribute "_data"
pandas/core/arrays/datetimelike.py:466: error: Name 'Scalar' is not defined
pandas/core/arrays/datetimelike.py:482: error: Argument 1 to "len" has incompatible type "Union[int, Sequence[int], Sequence[bool], slice]"; expected "Sized"
pandas/core/arrays/datetimelike.py:486: error: Argument 1 to "len" has incompatible type "Union[int, Sequence[int], Sequence[bool], slice]"; expected "Sized"
pandas/core/arrays/datetimelike.py:487: error: Argument 1 to "len" has incompatible type "Union[int, Sequence[int], Sequence[bool], slice]"; expected "Sized"
pandas/core/arrays/datetimelike.py:503: error: Item "Tuple[Any, ...]" of "Union[type, Tuple[type]]" has no attribute "__name__"
pandas/core/arrays/datetimelike.py:505: error: "DatetimeLikeArrayMixin" has no attribute "_data"
pandas/core/arrays/sparse.py:80: error: Incompatible types in assignment (expression has type "Tuple[str, str, str]", base class "_DtypeOpsMixin" defined the type as "Tuple[]")
pandas/core/arrays/period.py:189: error: "PeriodDtype" has no attribute "freq"
pandas/core/arrays/period.py:278: error: Read-only property cannot override read-write property
pandas/core/arrays/period.py:544: error: The first argument to Callable must be a list of types or "..."
pandas/core/arrays/period.py:550: error: Unsupported operand type for unary - ("Union[Index, ExtensionArray, Any]")
pandas/core/arrays/period.py:784: error: Incompatible types in assignment (expression has type "None", variable has type "PeriodDtype")
pandas/core/arrays/integer.py:34: error: Incompatible types in assignment (expression has type "None", base class "ExtensionDtype" defined the type as "str")
pandas/core/arrays/integer.py:36: error: Incompatible types in assignment (expression has type "None", base class "ExtensionDtype" defined the type as "Type[Any]")
pandas/core/indexes/datetimelike.py:65: error: "Callable[[Any], Any]" has no attribute "fget"
pandas/core/indexes/datetimelike.py:66: error: "Callable[[Any], Any]" has no attribute "fget"
pandas/core/indexes/datetimelike.py:67: error: "Callable[[Any], Any]" has no attribute "fget"
pandas/core/indexes/datetimelike.py:69: error: "Callable[[Any], Any]" has no attribute "fget"
pandas/core/indexes/datetimelike.py:70: error: "Callable[[Any], Any]" has no attribute "fget"
pandas/core/indexes/datetimelike.py:136: error: "DatetimeLikeArrayMixin" has no attribute "_data"
pandas/core/indexes/datetimelike.py:138: error: Decorated property not supported
pandas/core/indexes/datetimelike.py:208: error: Name '_box_values' already defined on line 72
pandas/core/indexes/period.py:68: error: Definition of "_data" in base class "DatetimeIndexOpsMixin" is incompatible with definition in base class "Index"
pandas/core/indexes/period.py:68: error: Definition of "_data" in base class "DatetimeIndexOpsMixin" is incompatible with definition in base class "DatetimelikeDelegateMixin"
pandas/core/indexes/period.py:68: error: Definition of "map" in base class "DatetimeIndexOpsMixin" is incompatible with definition in base class "Index"
pandas/core/indexes/period.py:68: error: Definition of "_format_with_header" in base class "DatetimeIndexOpsMixin" is incompatible with definition in base class "Index"
pandas/core/indexes/period.py:68: error: Definition of "isin" in base class "DatetimeIndexOpsMixin" is incompatible with definition in base class "Index"
pandas/core/internals/blocks.py:2218: error: Definition of "_can_hold_na" in base class "ExtensionBlock" is incompatible with definition in base class "DatetimeBlock"

Some of these will actually require a refactor of code. Probably want to split off into pre-cursor or at least separate PRs to tackle

@WillAyd
Copy link
Member Author

WillAyd commented Mar 20, 2019

Somewhat superseded by #25789 - there are some things here I'd like to extract out but easier to start fresh than merge master at this point

@WillAyd WillAyd closed this Mar 20, 2019
@WillAyd WillAyd deleted the mypy-setup branch January 16, 2020 00:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Style Code style, linting, code_checks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants