itertuples #842

Merged
merged 8 commits into from
Dec 26, 2023

Conversation

twoertwein
Member

@@ -250,7 +250,7 @@ class DataFrame(NDFrame, OpsMixin):
     def iterrows(self) -> Iterable[tuple[Hashable, Series]]: ...
     def itertuples(
         self, index: _bool = ..., name: _str | None = ...
-    ) -> Iterable[tuple[Any, ...]]: ...
+    ) -> Iterable[Any]: ...
Collaborator

I'm wondering if we could do something like this:

class _PandasNamedTuple(tuple[Any,...]):
    def __getattr__(self, field: str) -> Any: ...

Then have itertuples() return Iterable[_PandasNamedTuple]

Then at least you know you're getting a tuple.
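
A minimal runtime sketch of this suggestion follows. It makes two assumptions: fake_itertuples() is a hypothetical stand-in for DataFrame.itertuples() built on collections.namedtuple, and __getattr__ gets a raising body, because the stub's `...` body is only meaningful to a type checker.

```python
from collections import namedtuple
from typing import Any, Iterable, cast


class _PandasNamedTuple(tuple[Any, ...]):
    # In the stub the body is `...`; here it raises so the sketch behaves
    # sensibly at runtime. Python only calls __getattr__ after normal
    # attribute lookup has failed.
    def __getattr__(self, field: str) -> Any:
        raise AttributeError(field)


def fake_itertuples() -> Iterable[_PandasNamedTuple]:
    # Hypothetical stand-in for DataFrame.itertuples(): the rows are ordinary
    # namedtuples, and the cast only changes what the type checker sees.
    Row = namedtuple("Pandas", ["Index", "a", "b"])
    rows = [Row(0, 1, "x"), Row(1, 2, "y")]
    return cast(Iterable[_PandasNamedTuple], rows)


rows = list(fake_itertuples())
first_a = rows[0].a        # attribute access, typed as Any
first_by_pos = rows[0][1]  # positional access still works: it is a tuple
total_a = sum(row.a for row in rows)
```

With this return type a checker accepts both row.a and row[1], whereas Iterable[Any] would accept anything at all and Iterable[tuple[Any, ...]] would reject the attribute access.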

Member Author

Nice solution!

@@ -20,6 +20,7 @@
     Any,
     Callable,
     Generic,
+    TypeAlias,
Collaborator

TypeAlias needs to come from typing_extensions. The import from typing failed under Python 3.9, where typing does not provide it yet.
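
For application code that has to run on both old and new interpreters, one common pattern is a version-guarded import, sketched below. The review itself only asks for the typing_extensions import; the RowValues alias and first_value() helper are hypothetical names used for illustration.

```python
import sys
from typing import Any

# typing.TypeAlias exists from Python 3.10 on; older interpreters need the
# typing_extensions backport. Type checkers understand this version check.
if sys.version_info >= (3, 10):
    from typing import TypeAlias
else:
    from typing_extensions import TypeAlias

# Hypothetical alias, only here to show TypeAlias in use.
RowValues: TypeAlias = tuple[Any, ...]


def first_value(row: RowValues) -> Any:
    # Return the first element of a row tuple.
    return row[0]


value = first_value((1, "x"))
```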

Collaborator

@Dr-Irv Dr-Irv left a comment

Can you look at test_types_itertuples() in test_frame.py and see if it should be modified? Maybe introduce check(assert_type(...), ...) in there just to be sure.
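
The check(assert_type(...)) pattern pairs a static assertion with a runtime one. The sketch below shows the idea without pandas: check() here is a simplified, hypothetical stand-in for the test suite's helper, and first_row() is a hypothetical stand-in for taking the first row of df.itertuples().

```python
import sys
from collections import namedtuple
from typing import Any, TypeVar, cast

# assert_type is in typing from Python 3.11; older versions need the backport.
if sys.version_info >= (3, 11):
    from typing import assert_type
else:
    from typing_extensions import assert_type

T = TypeVar("T")


class _PandasNamedTuple(tuple[Any, ...]):
    def __getattr__(self, field: str) -> Any:
        raise AttributeError(field)


def check(actual: T, klass: type) -> T:
    # Hypothetical, simplified helper: verify the runtime type and hand the
    # value back unchanged.
    if not isinstance(actual, klass):
        raise RuntimeError(f"Expected {klass}, got {type(actual)}")
    return actual


def first_row() -> _PandasNamedTuple:
    # Hypothetical stand-in for next(iter(df.itertuples())).
    Row = namedtuple("Pandas", ["Index", "a"])
    return cast(_PandasNamedTuple, Row(0, 1))


# assert_type is verified by the type checker and returns its argument at
# runtime; check() then confirms the runtime object really is a tuple.
row = check(assert_type(first_row(), _PandasNamedTuple), tuple)
```

The runtime check is made against tuple rather than _PandasNamedTuple because a class declared only in a stub has no runtime counterpart.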

@@ -2279,3 +2279,6 @@ class DataFrame(NDFrame, OpsMixin):
     ) -> Self: ...
     def __truediv__(self, other: float | DataFrame | Series | Sequence) -> Self: ...
     def __rtruediv__(self, other: float | DataFrame | Series | Sequence) -> Self: ...
+
+class _PandasNamedTuple(tuple[Any, ...]):
+    def __getattr__(self, field: str) -> Any: ...
Collaborator

I'm thinking we should return Scalar here, because we also return Scalar when someone does df.loc[3, "a"].

While it's true that some non-scalar value could be an individual element of a DataFrame, I've taken the philosophy of limiting the types to what is "normal" usage, and if you put a funky type in a DataFrame or Series, then you can do a cast to fix it. I've done that in some of our application code when we have lists or other objects inside a Series or DataFrame.
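
The cast approach described above might look like the following sketch. The Scalar alias here is a simplified, hypothetical stand-in for the one in the stubs, and the row is an ordinary namedtuple rather than something produced by pandas.

```python
from collections import namedtuple
from typing import Any, Union, cast

# Hypothetical, simplified stand-in for the stubs' Scalar alias.
Scalar = Union[str, bytes, int, float, complex, bool]


class _PandasNamedTuple(tuple[Any, ...]):
    # Attribute access is typed as Scalar, matching "normal" usage.
    def __getattr__(self, field: str) -> Scalar:
        raise AttributeError(field)


# A row whose "tags" cell holds a list, i.e. a non-scalar value.
Row = namedtuple("Pandas", ["Index", "tags"])
row = cast(_PandasNamedTuple, Row(0, ["x", "y"]))

# A checker sees row.tags as Scalar, so list-specific use needs a cast that
# states what the cell really contains.
tags = cast(list[str], row.tags)
n_tags = len(tags)
```

The trade-off is that the common case gets a useful type, and the rare non-scalar cell costs one explicit cast at the point of use.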

Collaborator

@Dr-Irv Dr-Irv left a comment

thanks @twoertwein

@Dr-Irv Dr-Irv merged commit a370cab into pandas-dev:main Dec 26, 2023
13 checks passed
@twoertwein twoertwein deleted the itertuples branch February 10, 2024 20:30

Successfully merging this pull request may close these issues.

With the most recent pandas-stubs, mypy now complains about every call to itertuples()
2 participants