-
-
Notifications
You must be signed in to change notification settings - Fork 18.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial pandas.typing Module #25884
Initial pandas.typing Module #25884
Conversation
cool, can you use these in a couple of places? and some examples / docs-strings? |
Codecov Report
@@ Coverage Diff @@
## master #25884 +/- ##
==========================================
- Coverage 91.48% 91.46% -0.02%
==========================================
Files 175 176 +1
Lines 52885 52892 +7
==========================================
Hits 48380 48380
- Misses 4505 4512 +7
Continue to review full report at Codecov.
|
1 similar comment
Codecov Report
@@ Coverage Diff @@
## master #25884 +/- ##
==========================================
- Coverage 91.48% 91.46% -0.02%
==========================================
Files 175 176 +1
Lines 52885 52892 +7
==========================================
Hits 48380 48380
- Misses 4505 4512 +7
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #25884 +/- ##
==========================================
- Coverage 91.8% 91.79% -0.01%
==========================================
Files 174 175 +1
Lines 52536 52540 +4
==========================================
Hits 48231 48231
- Misses 4305 4309 +4
Continue to review full report at Codecov.
|
pandas/typing.py
Outdated
from pandas._libs.tslibs import Period, NaT, Timedelta, Timestamp | ||
|
||
|
||
DateTimeLike = Union[datetime, timedelta, Period, Timedelta, Timestamp] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do/should the numpy versions go in here?
I don’t think so but not sure I follow - what’s the link between NumPy version and types?
…Sent from my iPhone
On Mar 26, 2019, at 5:05 PM, jbrockmendel ***@***.***> wrote:
@jbrockmendel commented on this pull request.
In pandas/typing.py:
> @@ -0,0 +1,14 @@
+from datetime import datetime, timedelta
+from pathlib import Path
+from typing import AnyStr, IO, Union
+
+from pandas._libs.tslibs import Period, NaT, Timedelta, Timestamp
+
+
+DateTimeLike = Union[datetime, timedelta, Period, Timedelta, Timestamp]
Do/should the numpy versions go in here?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
Bad phrasing on my part. I was referring to np.datetime64 and np.timedelta64 objects. Many of our functions that accept datetime inputs will also accept np.datetime64 inputs. Ditto for timedelta. |
Sure just added the FilePathOrBuffer type to io.parsers. CI will fail with the following: pandas/io/parsers.py:445: error: Item "str" of "Union[str, Path, IO[Any]]" has no attribute "close"
pandas/io/parsers.py:445: error: Item "Path" of "Union[str, Path, IO[Any]]" has no attribute "close" There's an implicit assumption in the codebase that a local variable @jbrockmendel any thoughts on where best to try out the DateTimeLike types? |
Not really. I guess I'm not clear on the use case. Aside from running mypy, are we actually doing anything with these? |
Use case would be to improve readability and maintainability of code base and in theory reducing bugs. There's some refactor that will go hand in hand with it (as you might have seen with some of the IO stuff above). This is just starting things off and there's a long way to go. I think I might split off the datetimelike stuff for this for now and just focus on IO. Can see how that goes and learn from there |
# Union[FilePathOrBuffer, s3fs.S3File, gcsfs.GCSFile] | ||
# though mypy handling of conditional imports is difficult. | ||
# See https://github.com/python/mypy/issues/1297 | ||
fp_or_buf, _, compression, should_close = get_filepath_or_buffer( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's mentioned in the comments but I changed the variable name here from filepath_or_buffer
to fp_or_buf
to intentionally NOT shadow the parameter from the signature.
As mentioned in the comment this local variable could potentially introduce new types for S3 and GCP and I don't think there is a great way with typing to statically analyze conditional imports like those just yet, so it's a clearer delimitation IMO to assign the return of this function to a separate variable
see some suggestions: #25829 (can be a followup) |
looks fine. better to start small. merge if you like (or add more up 2 you) |
doc/source/whatsnew/v0.25.0.rst
Outdated
Type Hints and ``pandas.typing`` | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
In accordance with :pep:`484` pandas has introduced Type Hints and a new ``pandas.typing`` module containing aliases for idiomatic pandas types into the code base. We will be continually adding annotations to the code base to improve readability, reduce code maintenance and proactively identify bugs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we make this public?
(which mean: maintaining compatibility) Maybe at least not initially?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea I don't see a downside to making private. Can do that on next iteration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK just made this private and updated documentation - lmk what you think
doc/source/whatsnew/v0.25.0.rst
Outdated
|
||
.. _whatsnew_0250.enhancements.typing: | ||
|
||
Type Hints and ``pandas.typing`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would remove this entirely. if its private it is not to be relied upon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs a tests as indicated, other lgtm. ping on green.
@@ -0,0 +1,4 @@ | |||
from pathlib import Path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a test that asserts this in pandas/tests/types/test_api.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea no problem. So the existing test(s) exclude any privately named modules:
pandas/pandas/tests/api/test_api.py
Line 14 in 2792705
result = sorted(f for f in dir(namespace) if not f.startswith('_')) |
Are you asking to revisit that logic or simply add a test with this as an exception to make sure it lives there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh i c. nvm then. though I think we should actually check and lock down all modules (so add back the private ones). in a new PR / issue though.
merging this so can start using it. less concerned about the test as its private. |
ref #25791 maybe more of a conversation starter than complete PR but wanted to stub out some types that appear commonly in pandas code-base.
Haven't used these yet as wanted to agree on type first. I wouldn't be surprised if we start seeing some issues pop up that would require real development changes once these get utilized
@jreback @jbrockmendel