Add support for `group_keys` in `groupby` #11659
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## branch-22.10 #11659 +/- ##
===============================================
Coverage ? 86.42%
===============================================
Files ? 145
Lines ? 23000
Branches ? 0
===============================================
Hits ? 19877
Misses ? 3123
Partials ? 0
This all looks fine to me! Multi-indexing is always complicated but I'm happy with this approach. Thank you for the helpful notes in the docs, as well.
@gpucibot merge
This PR introduces `pandas-1.5` support in `cudf`. The changes include:

- [x] Requires `group_keys` support in `groupby` for `dask_cudf` to work: #11659
- [x] Requires `zfill` updates to match `pandas-1.5` behavior: #11634
- [x] `where` API: Ability to inspect a scalar value if it can be fit into the existing dtype, similar to: pandas-dev/pandas#48373
- [x] Switches `ValueError` to `TypeError` when an unknown category is being set to a `CategoricalColumn`
- [x] Handles breaking change of an `ArrowIntervalType` related import that has resulted in `cudf` to error on import itself.
- [x] Fix an issue with `IntervalColumn.to_pandas`.
- [x] Raises error when an object of `boolean` dtype is being set to a `NumericalColumn`.
- [x] Raises error when `pat` is None in `Series.str.startswith` & `Series.str.endswith`.
- [x] Add `IntervalDtype.to_pandas` with appropriate versioning.
- [x] Handle `get_window_bounds` signature changes.
- [x] Fix and version a bunch of pytests.

```python
branch-22.10:
== 4275 failed, 79837 passed, 2049 skipped, 1193 xfailed, 1923 xpassed, 6597 warnings, 4 errors in 1103.52s (0:18:23) ==
== 803 failed, 106 passed, 14 skipped, 14 xfailed, 324 warnings, 17 errors in 148.46s (0:02:28) ==

This PR:
== 84041 passed, 2049 skipped, 1199 xfailed, 1710 xpassed, 6599 warnings in 359.27s (0:05:59) ==
== 954 passed, 14 skipped, 7 xfailed, 3 xpassed, 580 warnings in 54.75s ==
```

Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
- Ashwin Srinath (https://github.com/shwina)
- Matthew Roeschke (https://github.com/mroeschke)
- Mark Sadang (https://github.com/msadang)

URL: #11617
Description

- Adds support for `group_keys` in `groupby`. Starting with pandas 1.5.0, the issues around `group_keys` have been resolved:
  - pandas-dev/pandas#34998
  - pandas-dev/pandas#47185
- Defaults `group_keys` to `False`, which is the same as what pandas is going to default to in a future version.
- Required for the `pandas-1.5.0` upgrade in cudf: [REVIEW] Upgrade `pandas` to `1.5` #11617
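
For readers unfamiliar with the parameter, here is a minimal sketch of what `group_keys` controls when `apply` returns a result indexed like its input. It uses pandas (assuming version 1.5 or newer) rather than cudf so that it runs without a GPU. The frame, its column names, and the `demean` helper are made up for illustration and are not taken from this PR.

```python
import pandas as pd

# Hypothetical example data; not from the PR.
df = pd.DataFrame({"key": ["a", "b", "a", "b"], "val": [1, 2, 3, 4]})


def demean(s):
    # Returns a Series indexed like its input (a "like-indexed" result).
    return s - s.mean()


# group_keys=True: the group labels are added as an outer index level.
with_keys = df.groupby("key", group_keys=True)["val"].apply(demean)

# group_keys=False: the result keeps only the original row index.
without_keys = df.groupby("key", group_keys=False)["val"].apply(demean)

print(with_keys.index.nlevels)     # 2 on pandas >= 1.5
print(without_keys.index.nlevels)  # 1
```

Passing `group_keys` explicitly, as above, sidesteps any dependence on the default, which differs between libraries and versions.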