-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
openpyxl: Type usages of PIL
and zipfile
#10706
Conversation
@@ -44,7 +45,7 @@ class ExcelReader: | |||
def read(self) -> None: ... | |||
|
|||
def load_workbook( | |||
filename: SupportsRead[bytes] | StrPath, | |||
filename: StrPath | IO[bytes], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/python/typeshed/pull/9511/files#r1113278179 Avasam
This, and
ExcelReader.__init__
is ultimately passed toZipFile
in_validate_archive
.
ZipFile.__init__
is typed asZipFile(file: StrPath | IO[bytes], ...)
Even in read mode it still needs a
name
variable,seek
andtell
methods.
AlexWaygood
I'd be okay with switching both of the conflicting annotations to
IO[bytes]
. We could probably come up with a more precise protocol, but for now consistency with the stdlibzipfile
stub is probably the most important thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is responsible for the mypy_primer
diff.
Affected code in Pandas: https://github.com/pandas-dev/pandas/blob/705d4312cf1d94ef2497bcd8091e0eabd1085f4a/pandas/io/excel/_openpyxl.py#L566
If this change is incorrect, then zipfile
should probably be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the primer hit I think we should bite the bullet, introduce a protocol in zipfile, and use an equivalent protocol here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind holding off on this PR until we have a proper Protocol defined for zipfile.
I'll link #5835 as this is another use-case for it. Otherwise we'll have to duplicate the protocol in openpyxl for a while.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think #5835 is happening anytime soon so for now we'll just have to duplicate it.
@overload | ||
def get_rel( | ||
archive: ZipFile, deps: RelationshipList, id: str, cls: type[_SerialisableT] | ||
) -> _SerialisableT: ... # incomplete: this could be restricted further from "Serialisable" | ||
@overload | ||
def get_rel( | ||
archive: ZipFile, deps: RelationshipList, id: str | None = None, *, cls: type[_SerialisableT] | ||
) -> _SerialisableT: ... # incomplete: this could be restricted further from "Serialisable" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cls
is always used at runtime: https://foss.heptapod.net/openpyxl/openpyxl/-/blob/branch/3.1/openpyxl/packaging/relationship.py#L143-150
About the # incomplete
: Only CacheDefinition
, RecordList
and TableDefinition
seem to define rel_type
. (they also define is as ClassVars, which matches expected behaviour). But I can't find any class defining deps
. The attribute seems to only be assigned in get_rel
and used in WorkbookParser.pivot_caches
. So I'm not exactly sure about which Serialisable
child I should restrict to, if any.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more I look into it, the more I realize obj.deps
is for internal used only and short-lived. Updated commit should be accurate.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
archive: ZipFile, deps: RelationshipList, id: None, cls: type[_SerialisableRelTypeT] | ||
) -> _SerialisableRelTypeT | None: ... | ||
@overload | ||
def get_rel(archive: ZipFile, deps: RelationshipList, id: str, *, cls: type[_SerialisableT]) -> _SerialisableT: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this redundant with the next overload?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanna say you're right? Probably a leftover from having id
being optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how doable it is, but I suggested catching this at mypy: python/mypy#16164
This comment has been minimized.
This comment has been minimized.
I abstracted the protocol issue away into an alias that clearly references #10880 and reduces false-positives. This should allow this PR to be merged without waiting for protocol and mypy updates. |
This comment has been minimized.
This comment has been minimized.
…xl-external-types
According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! LGTM for now.
Extracted from, and improved over #9511
Typed all usages of non-type, non excel, libraries external to openpyxl:
PIL
andzipfile
.Completed annotations of affected methods and classes. (unless they got complicated or uncertain). Mainly with the goal of reducing changes in #9511 w/o going too out of scope for this PR.