Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial pandas.typing Module #25884

Merged
merged 19 commits into from
Mar 30, 2019
Merged

Initial pandas.typing Module #25884

merged 19 commits into from
Mar 30, 2019

Conversation

WillAyd
Copy link
Member

@WillAyd WillAyd commented Mar 26, 2019

ref #25791 maybe more of a conversation starter than complete PR but wanted to stub out some types that appear commonly in pandas code-base.

Haven't used these yet as wanted to agree on type first. I wouldn't be surprised if we start seeing some issues pop up that would require real development changes once these get utilized

@jreback @jbrockmendel

@WillAyd WillAyd added the Typing type annotations, mypy/pyright type checking label Mar 26, 2019
@jreback
Copy link
Contributor

jreback commented Mar 26, 2019

cool, can you use these in a couple of places? and some examples / docs-strings?

@codecov
Copy link

codecov bot commented Mar 26, 2019

Codecov Report

Merging #25884 into master will decrease coverage by 0.01%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25884      +/-   ##
==========================================
- Coverage   91.48%   91.46%   -0.02%     
==========================================
  Files         175      176       +1     
  Lines       52885    52892       +7     
==========================================
  Hits        48380    48380              
- Misses       4505     4512       +7
Flag Coverage Δ
#multiple 90.03% <0%> (-0.02%) ⬇️
#single 41.82% <0%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/typing.py 0% <0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d404460...e230500. Read the comment docs.

1 similar comment
@codecov
Copy link

codecov bot commented Mar 26, 2019

Codecov Report

Merging #25884 into master will decrease coverage by 0.01%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25884      +/-   ##
==========================================
- Coverage   91.48%   91.46%   -0.02%     
==========================================
  Files         175      176       +1     
  Lines       52885    52892       +7     
==========================================
  Hits        48380    48380              
- Misses       4505     4512       +7
Flag Coverage Δ
#multiple 90.03% <0%> (-0.02%) ⬇️
#single 41.82% <0%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/typing.py 0% <0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d404460...e230500. Read the comment docs.

@codecov
Copy link

codecov bot commented Mar 26, 2019

Codecov Report

Merging #25884 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #25884      +/-   ##
==========================================
- Coverage    91.8%   91.79%   -0.01%     
==========================================
  Files         174      175       +1     
  Lines       52536    52540       +4     
==========================================
  Hits        48231    48231              
- Misses       4305     4309       +4
Flag Coverage Δ
#multiple 90.35% <100%> (ø) ⬆️
#single 41.88% <83.33%> (-0.07%) ⬇️
Impacted Files Coverage Δ
pandas/io/gcs.py 80% <100%> (ø) ⬆️
pandas/io/parsers.py 95.69% <100%> (ø) ⬆️
pandas/io/s3.py 89.47% <100%> (ø) ⬆️
pandas/_typing.py 100% <100%> (ø)
pandas/io/gbq.py 75% <0%> (-12.5%) ⬇️
pandas/core/frame.py 96.79% <0%> (-0.12%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1d4c89f...7e31345. Read the comment docs.

pandas/typing.py Outdated
from pandas._libs.tslibs import Period, NaT, Timedelta, Timestamp


DateTimeLike = Union[datetime, timedelta, Period, Timedelta, Timestamp]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do/should the numpy versions go in here?

@WillAyd
Copy link
Member Author

WillAyd commented Mar 27, 2019 via email

@jbrockmendel
Copy link
Member

I don’t think so but not sure I follow - what’s the link between NumPy version and types?

Bad phrasing on my part. I was referring to np.datetime64 and np.timedelta64 objects. Many of our functions that accept datetime inputs will also accept np.datetime64 inputs. Ditto for timedelta.

@WillAyd
Copy link
Member Author

WillAyd commented Mar 27, 2019

Sure just added the FilePathOrBuffer type to io.parsers. CI will fail with the following:

pandas/io/parsers.py:445: error: Item "str" of "Union[str, Path, IO[Any]]" has no attribute "close"
pandas/io/parsers.py:445: error: Item "Path" of "Union[str, Path, IO[Any]]" has no attribute "close"

There's an implicit assumption in the codebase that a local variable should_close dictates whether or not a close method gets called, which Mypy doesn't like. I suppose that is rather fragile so refactor could be in line

@jbrockmendel any thoughts on where best to try out the DateTimeLike types?

@jbrockmendel
Copy link
Member

any thoughts on where best to try out the DateTimeLike types?

Not really. I guess I'm not clear on the use case. Aside from running mypy, are we actually doing anything with these?

@WillAyd
Copy link
Member Author

WillAyd commented Mar 27, 2019

Use case would be to improve readability and maintainability of code base and in theory reducing bugs. There's some refactor that will go hand in hand with it (as you might have seen with some of the IO stuff above).

This is just starting things off and there's a long way to go. I think I might split off the datetimelike stuff for this for now and just focus on IO. Can see how that goes and learn from there

# Union[FilePathOrBuffer, s3fs.S3File, gcsfs.GCSFile]
# though mypy handling of conditional imports is difficult.
# See https://github.com/python/mypy/issues/1297
fp_or_buf, _, compression, should_close = get_filepath_or_buffer(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's mentioned in the comments but I changed the variable name here from filepath_or_buffer to fp_or_buf to intentionally NOT shadow the parameter from the signature.

As mentioned in the comment this local variable could potentially introduce new types for S3 and GCP and I don't think there is a great way with typing to statically analyze conditional imports like those just yet, so it's a clearer delimitation IMO to assign the return of this function to a separate variable

@jreback jreback added this to the 0.25.0 milestone Mar 28, 2019
@jreback
Copy link
Contributor

jreback commented Mar 28, 2019

see some suggestions: #25829 (can be a followup)

@jreback
Copy link
Contributor

jreback commented Mar 28, 2019

looks fine. better to start small. merge if you like (or add more up 2 you)

Type Hints and ``pandas.typing``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In accordance with :pep:`484` pandas has introduced Type Hints and a new ``pandas.typing`` module containing aliases for idiomatic pandas types into the code base. We will be continually adding annotations to the code base to improve readability, reduce code maintenance and proactively identify bugs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we make this public?
(which mean: maintaining compatibility) Maybe at least not initially?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I don't see a downside to making private. Can do that on next iteration

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK just made this private and updated documentation - lmk what you think


.. _whatsnew_0250.enhancements.typing:

Type Hints and ``pandas.typing``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i would remove this entirely. if its private it is not to be relied upon.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs a tests as indicated, other lgtm. ping on green.

@@ -0,0 +1,4 @@
from pathlib import Path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a test that asserts this in pandas/tests/types/test_api.py

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea no problem. So the existing test(s) exclude any privately named modules:

result = sorted(f for f in dir(namespace) if not f.startswith('_'))

Are you asking to revisit that logic or simply add a test with this as an exception to make sure it lives there?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh i c. nvm then. though I think we should actually check and lock down all modules (so add back the private ones). in a new PR / issue though.

@jreback jreback merged commit e1aa6f3 into pandas-dev:master Mar 30, 2019
@jreback
Copy link
Contributor

jreback commented Mar 30, 2019

merging this so can start using it. less concerned about the test as its private.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants