openpyxl: Type usages of PIL and zipfile #10706

Merged
7 commits merged on Nov 2, 2023

Conversation

Collaborator

@Avasam Avasam commented Sep 12, 2023

Extracted from, and improved over #9511

Typed all usages of the non-typing, non-Excel libraries external to openpyxl: PIL and zipfile.
Completed the annotations of affected methods and classes (unless they got complicated or uncertain), mainly with the goal of reducing changes in #9511 without going too far out of scope for this PR.

@@ -44,7 +45,7 @@ class ExcelReader:
     def read(self) -> None: ...

 def load_workbook(
-    filename: SupportsRead[bytes] | StrPath,
+    filename: StrPath | IO[bytes],
Collaborator Author

https://github.com/python/typeshed/pull/9511/files#r1113278179 Avasam

This, and the argument to ExcelReader.__init__, are ultimately passed to ZipFile in _validate_archive.
ZipFile.__init__ is typed as ZipFile(file: StrPath | IO[bytes], ...)

Even in read mode, it still needs a name attribute as well as seek and tell methods.
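
A minimal sketch of the seek/tell point, assuming CPython's zipfile module. ReadOnly and can_open are hypothetical names used only for illustration: ReadOnly exposes just read(), which would satisfy SupportsRead[bytes] but lacks the methods ZipFile calls in read mode.

```python
import io
import zipfile


def make_zip_bytes() -> bytes:
    """Build a tiny in-memory archive to read back."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("hello.txt", "hi")
    return buffer.getvalue()


class ReadOnly:
    """Hypothetical object exposing only read(), like SupportsRead[bytes]."""

    def __init__(self, data: bytes) -> None:
        self._stream = io.BytesIO(data)

    def read(self, size: int = -1) -> bytes:
        return self._stream.read(size)


def can_open(file_obj: object) -> bool:
    """Return True if ZipFile accepts file_obj in read mode."""
    try:
        with zipfile.ZipFile(file_obj) as archive:  # type: ignore[arg-type]
            archive.namelist()
    except (AttributeError, OSError, zipfile.BadZipFile):
        return False
    return True


data = make_zip_bytes()
seekable_ok = can_open(io.BytesIO(data))  # BytesIO provides seek() and tell()
read_only_ok = can_open(ReadOnly(data))   # ReadOnly provides neither
```

With CPython's zipfile, the BytesIO call is expected to succeed and the ReadOnly call to fail, since locating the end-of-central-directory record requires seeking.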

AlexWaygood

I'd be okay with switching both of the conflicting annotations to IO[bytes]. We could probably come up with a more precise protocol, but for now consistency with the stdlib zipfile stub is probably the most important thing

Collaborator Author

@Avasam Avasam Sep 12, 2023

This change is responsible for the mypy_primer diff.
Affected code in Pandas: https://github.com/pandas-dev/pandas/blob/705d4312cf1d94ef2497bcd8091e0eabd1085f4a/pandas/io/excel/_openpyxl.py#L566
If this change is incorrect, then zipfile should probably be updated.

Member

Given the primer hit I think we should bite the bullet, introduce a protocol in zipfile, and use an equivalent protocol here.

Collaborator Author

I don't mind holding off on this PR until we have a proper Protocol defined for zipfile.

I'll link #5835 as this is another use-case for it. Otherwise we'll have to duplicate the protocol in openpyxl for a while.

Member

I don't think #5835 is happening anytime soon so for now we'll just have to duplicate it.
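
For illustration, a duplicated protocol could look roughly like the sketch below. SupportsReadSeekTell, OnlyRead and describe are hypothetical names, and the method list is an assumption based on the read/seek/tell requirement mentioned above, not the protocol that typeshed actually ships.

```python
import io
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsReadSeekTell(Protocol):
    """Hypothetical protocol: methods assumed to be needed by ZipFile in read mode."""

    def read(self, size: int = -1, /) -> bytes: ...
    def seek(self, offset: int, whence: int = 0, /) -> int: ...
    def tell(self) -> int: ...


class OnlyRead:
    """Satisfies SupportsRead[bytes] but not the protocol above."""

    def read(self, size: int = -1, /) -> bytes:
        return b""


def describe(file_obj: object) -> str:
    # runtime_checkable only checks that the methods exist, not their signatures.
    if isinstance(file_obj, SupportsReadSeekTell):
        return "ok"
    return "missing seek/tell"


bytes_io_result = describe(io.BytesIO())
only_read_result = describe(OnlyRead())
```

A structural type like this accepts any object with the three methods, which is looser than IO[bytes] and so less likely to reject callers that pass custom read buffers.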

Comment on lines 44 to 51
@overload
def get_rel(
    archive: ZipFile, deps: RelationshipList, id: str, cls: type[_SerialisableT]
) -> _SerialisableT: ...  # incomplete: this could be restricted further from "Serialisable"
@overload
def get_rel(
    archive: ZipFile, deps: RelationshipList, id: str | None = None, *, cls: type[_SerialisableT]
) -> _SerialisableT: ...  # incomplete: this could be restricted further from "Serialisable"
Collaborator Author

cls is always used at runtime: https://foss.heptapod.net/openpyxl/openpyxl/-/blob/branch/3.1/openpyxl/packaging/relationship.py#L143-150

About the # incomplete: only CacheDefinition, RecordList and TableDefinition seem to define rel_type (they also define it as ClassVars, which matches expected behaviour). But I can't find any class defining deps. The attribute seems to only be assigned in get_rel and used in WorkbookParser.pivot_caches. So I'm not exactly sure which Serialisable child I should restrict to, if any.

Collaborator Author

@Avasam Avasam Sep 12, 2023

The more I look into it, the more I realize obj.deps is for internal use only and short-lived. The updated commit should be accurate.
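
One way to restrict cls to classes that define rel_type, sketched under the assumption that a structural bound is acceptable. HasRelType, _HasRelTypeT, FakeTableDefinition and build are hypothetical names for illustration, not openpyxl or typeshed definitions.

```python
from typing import ClassVar, Protocol, TypeVar


class HasRelType(Protocol):
    """Hypothetical protocol for classes that define rel_type as a ClassVar."""

    rel_type: ClassVar[str]


_HasRelTypeT = TypeVar("_HasRelTypeT", bound=HasRelType)


class FakeTableDefinition:
    """Stand-in class, not openpyxl's TableDefinition."""

    rel_type: ClassVar[str] = "example/table"


def build(cls: type[_HasRelTypeT]) -> _HasRelTypeT:
    """Instantiate cls; the return type follows the class that was passed in."""
    return cls()


table = build(FakeTableDefinition)
```

Under this bound, a type checker would be expected to reject a class without rel_type while still inferring the concrete return type from the cls argument.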

Avasam added a commit to Avasam/typeshed that referenced this pull request Sep 19, 2023

archive: ZipFile, deps: RelationshipList, id: None, cls: type[_SerialisableRelTypeT]
) -> _SerialisableRelTypeT | None: ...
@overload
def get_rel(archive: ZipFile, deps: RelationshipList, id: str, *, cls: type[_SerialisableT]) -> _SerialisableT: ...
Member

Isn't this redundant with the next overload?

Collaborator Author

@Avasam Avasam Sep 23, 2023

I wanna say you're right? Probably a leftover from having id being optional.
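
To illustrate the kind of redundancy being discussed: the following overload is not shown in this thread, so the shapes below are assumed. lookup is a hypothetical function, not get_rel. Every call accepted by the first overload is also accepted by the second, with the same return type, so the first adds nothing.

```python
from typing import Optional, overload


@overload
def lookup(key: str, *, default: int) -> int: ...  # subsumed by the next overload
@overload
def lookup(key: Optional[str] = None, *, default: int) -> int: ...
def lookup(key: Optional[str] = None, *, default: int) -> int:
    """Hypothetical function: return len(key), or default when key is None."""
    return default if key is None else len(key)


with_key = lookup("abc", default=0)
without_key = lookup(default=7)
```

Removing the first overload would leave both calls type-checking the same way, which is what makes this pattern a candidate for a checker warning.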

Collaborator Author

Not sure how doable it is, but I suggested catching this at mypy: python/mypy#16164

@Avasam
Collaborator Author

Avasam commented Oct 13, 2023

I abstracted the protocol issue away into an alias that clearly references #10880 and reduces false-positives. This should allow this PR to be merged without waiting for protocol and mypy updates.

Contributor

github-actions bot commented Nov 1, 2023

According to mypy_primer, this change has no effect on the checked open source code. 🤖🎉

Collaborator

@srittau srittau left a comment

Thanks! LGTM for now.

@srittau srittau merged commit 105bb0a into python:main Nov 2, 2023
47 checks passed
@Avasam Avasam deleted the openpyxl-external-types branch November 2, 2023 14:36