BUG: pandas 2.1.2 changes how `copy` works #55763
Comments
We’re pinning Pandas to !=2.1.2. Can we rely on Pandas 2.2.0 to fix this?
I believe this was specifically caused by the changes to:

    def _constructor_from_mgr(self, mgr, axes):
        df = self._from_mgr(mgr, axes=axes)
        if type(self) is DataFrame:
            # fastpath avoiding constructor call
            return df
        else:
            assert axes is mgr.axes
            return self._constructor(df, copy=False)

But now it looks like:

    def _constructor_from_mgr(self, mgr, axes):
        if self._constructor is DataFrame:
            # we are pandas.DataFrame (or a subclass that doesn't override _constructor)
            return self._from_mgr(mgr, axes=axes)
        else:
            assert axes is mgr.axes
            return self._constructor(mgr)

Behaviour changed for other methods, like […]

I believe we were relying on the constructor being called for our subclass, since we want to coerce to a […]

I've opened a draft PR which may fix this: […]

This PR reverts the branching logic to check the class being passed in instead of the class of […]

Would appreciate a look from @jorisvandenbossche.
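To make the difference between the two branch conditions concrete, here is a toy model in plain Python. It is not pandas code: `Frame`, `SubFrame`, `_from_data`, `copy_old`, and `copy_new` are hypothetical stand-ins. The sketch also assumes that the fast-path helper builds an instance of the caller's own class, which this thread does not confirm.

```python
class Frame:
    """Toy stand-in for pandas.DataFrame (hypothetical, not pandas code)."""

    def __init__(self, data):
        # Coercing constructor: accepts raw data or another Frame.
        self.data = data.data if isinstance(data, Frame) else data

    @property
    def _constructor(self):
        # Default: operations should produce a plain Frame.
        return Frame

    @classmethod
    def _from_data(cls, data):
        # Fast path: build an instance of `cls` without calling __init__.
        # ASSUMPTION: mirrors a helper that instantiates the caller's class.
        obj = cls.__new__(cls)
        obj.data = data
        return obj

    def copy_old(self):
        # Earlier branching: test the class of the instance itself.
        new = self._from_data(list(self.data))
        if type(self) is Frame:
            return new
        return self._constructor(new)

    def copy_new(self):
        # Later branching: test the class returned by _constructor.
        if self._constructor is Frame:
            return self._from_data(list(self.data))
        return self._constructor(list(self.data))


class SubFrame(Frame):
    """Subclass that does not override _constructor."""


if __name__ == "__main__":
    sub = SubFrame([1, 2])
    print(type(sub.copy_old()).__name__)  # Frame
    print(type(sub.copy_new()).__name__)  # SubFrame
```

In this model, a subclass that leaves `_constructor` alone gets a plain `Frame` back under the first condition and keeps its own type under the second, which is the kind of change in return type the report describes.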
Thanks for the detailed report and analysis, @flying-sheep and @ivirshup! But yes, this is something we definitely should fix. I didn't consider the case of subclasses that don't get preserved but always return a plain pandas object for each operation.
### What changes were proposed in this pull request?
Upgrade pandas from 2.1.2 to 2.1.3

### Why are the changes needed?
Fixed infinite recursion from operations that return a new object on some DataFrame subclasses ([GH 55763](pandas-dev/pandas#55763)) and Fix [read_parquet()](https://pandas.pydata.org/docs/reference/api/pandas.read_parquet.html#pandas.read_parquet) and [read_feather()](https://pandas.pydata.org/docs/reference/api/pandas.read_feather.html#pandas.read_feather) for [CVE-2023-47248](https://www.cve.org/CVERecord?id=CVE-2023-47248) ([GH 55894](pandas-dev/pandas#55894))

[Release notes for 2.1.3](https://pandas.pydata.org/docs/whatsnew/v2.1.3.html)

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Pass GA

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #43822 from bjornjorgensen/pandas-2_1_3.

Authored-by: Bjørn Jørgensen <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Pandas version checks
Reproducible Example
Issue Description
For a very long time, calling `.copy()` on a DataFrame subclass has returned a DataFrame object. This should be changed in a major release, not a feature release, and definitely not a patch release.
Expected Behavior
The above assert statement should succeed.
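A minimal sketch of the kind of check described here, not the reporter's original example. `MyFrame` and `copy_result_type` are hypothetical names, and the subclass is assumed not to override `_constructor`. The function only reports the concrete type that `.copy()` returns on the installed pandas, so it can be compared across versions.

```python
import pandas as pd


class MyFrame(pd.DataFrame):
    """Hypothetical subclass that does not override `_constructor`."""


def copy_result_type() -> type:
    # Build a subclass instance and report the concrete type .copy() returns.
    frame = MyFrame({"a": [1, 2, 3]})
    return type(frame.copy())


if __name__ == "__main__":
    # Print the installed version next to the observed return type.
    print(pd.__version__, copy_result_type().__name__)
```

Under the long-standing behaviour described above, the reported type would be `DataFrame`. A check such as `assert copy_result_type() is pd.DataFrame` is one way to phrase the expectation.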
See also `__getattr__` in v2.1.0 #55120
Installed Versions
pandas 2.1.2 or 2.2.0.dev0+447