map_groups no longer works when passing a list of column names to group_by #14401
Comments
It seems to be hitting this: polars/py-polars/polars/dataframe/group_by.py, lines 317 to 319 in 41b33c5.
df.group_by(["color", "shape"]).by
# (['color', 'shape'],)
If you drop the list, it works:
df.group_by("color", "shape").by
# ('color', 'shape')
So that check probably needs some adjusting.
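To make the difference between the two `.by` values concrete, here is a minimal pure-Python sketch of one way the keys could be normalized so that a single list and separate positional arguments end up equivalent. The helper name `normalize_by` is hypothetical and is not taken from the polars source; this only illustrates the kind of adjustment the check might need, not the actual fix.

```python
from typing import Iterable, Tuple, Union

# Hypothetical type alias: a group key is either one column name
# or an iterable of column names (e.g. a list).
ByArg = Union[str, Iterable[str]]


def normalize_by(*by: ByArg) -> Tuple[str, ...]:
    """Flatten group keys so a list argument and varargs give the same tuple."""
    flat = []
    for item in by:
        if isinstance(item, str):
            # A bare column name is kept as a single key.
            flat.append(item)
        else:
            # A list/tuple of names is unpacked into individual keys.
            flat.extend(item)
    return tuple(flat)


# Both call styles from the comment above yield the same keys.
list_style = normalize_by(["color", "shape"])
varargs_style = normalize_by("color", "shape")
```

With keys flattened this way, a downstream check that expects a tuple of plain column names would see the same value regardless of which call style was used.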
Yup, that did it. I'll close this to not pollute the issue tracker. The current documentation does show it should be a variable number of arguments rather than a list. That deprecation may have slipped through (or I missed it). Thanks again!
As far as I can tell, there was no deprecation and it should still work. The group_by docs still contain:
So it does appear to be an accidental behaviour change. Just pinging @deanm0000 / @stinodego as I think they were working on that area. Perhaps they can confirm if this is something that needs to be fixed or not.
Well, #14099 is still waiting, which would more than fix this. The problem is that when adding
It was - the parsing logic is no longer correct.
Checks
Reproducible example
Log output
Issue description
Between polars 0.20.6 and 0.20.7, some of my complex UDF maps broke. I was able to recreate a simple example above where the code works in the earlier version but not in the latest.
Is this breaking change intentional? I didn't see it highlighted in the release notes, but I could be wrong. I believe it's possible for me to refactor my code to use the .agg(pl.map_groups()) setup suggested by the more recent documentation, but it will take a decent amount of work, given that the first argument/dtype comes in as a list of series instead of a dataframe.
Aside: I recognize that these demonstrated aggregations do not actually require a UDF; rest assured that my actual use case uses third-party Python libraries (scipy, etc.) that do optimization and return result(s), and therefore requires a UDF.
Expected behavior
Same exact code above.
Log output from polars 0.20.6:
Installed versions