-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Save groupby codes after factorizing, pass to flox #7206
Conversation
@@ -388,63 +483,20 @@ def __init__( | |||
"dimension" | |||
) | |||
|
|||
full_index = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just refactored with no functionality changes except to return codes
1891fdb
to
ac521d4
Compare
ac521d4
to
b64df5b
Compare
Fix cftime resampling
* upstream/main: (39 commits) Support the new compression argument in netCDF4 > 1.6.0 (pydata#6981) Remove setuptools-scm-git-archive, require setuptools-scm>=7 (pydata#7253) Fix mypy failures (pydata#7343) Docs: add example of writing and reading groups to netcdf (pydata#7338) Reset file pointer to 0 when reading file stream (pydata#7304) Enable mypy warn unused ignores (pydata#7335) Optimize some copying (pydata#7209) Add parse_dims func (pydata#7051) Fix coordinate attr handling in `xr.where(..., keep_attrs=True)` (pydata#7229) Remove code used to support h5py<2.10.0 (pydata#7334) [pre-commit.ci] pre-commit autoupdate (pydata#7330) Fix PR number in what’s new (pydata#7331) Enable `origin` and `offset` arguments in `resample` (pydata#7284) fix doctests: supress urllib3 warning (pydata#7326) fix flake8 config (pydata#7321) implement Zarr v3 spec support (pydata#6475) Fix polyval overloads (pydata#7315) deprecate pynio backend (pydata#7301) mypy - Remove some ignored packages and modules (pydata#7319) Switch to T_DataArray in .coords (pydata#7285) ...
* main: absolufy-imports - No relative imports - PEP8 (pydata#7204) [skip-ci] whats-new for dev (pydata#7351) Whats-new: 2022.12.0 (pydata#7345) Fix assign_coords resetting all dimension coords to default index (pydata#7347)
* main: Preserve `base` and `loffset` arguments in `resample` (pydata#7444) ignore the `pkg_resources` deprecation warning (pydata#7594) Update contains_cftime_datetimes to avoid loading entire variable array (pydata#7494) Support first, last with dask arrays (pydata#7562) update the docs environment (pydata#7442) Add xCDAT to list of Xarray related projects (pydata#7579) [pre-commit.ci] pre-commit autoupdate (pydata#7565) fix nczarr when libnetcdf>4.8.1 (pydata#7575) use numpys SupportsDtype (pydata#7521)
2d3f984
to
3f4dde2
Compare
I'd like to merge this soon. It's mostly a refactor moving code around, and a major improvement to the flox code path. Let me know if it'd be easier to split it in to two PRs |
Hmm, did I mess something up? I believe I only changed typing related things. _bins suffix seems to have disappeared:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @dcherian. This LGTM. :)
* upstream/main: Save groupby codes after factorizing, pass to flox (pydata#7206) [skip-ci] Add compute to groupby benchmarks (pydata#7690) Delete built-in cfgrib backend (pydata#7670) Added a pronunciation guide to the word Xarray in the README.MD fil… (pydata#7677) boundarynorm fix (pydata#7553) Fix lazy negative slice rewriting. (pydata#7586) [pre-commit.ci] pre-commit autoupdate (pydata#7687) Adjust sidebar font colors (pydata#7674) Bump pypa/gh-action-pypi-publish from 1.8.1 to 1.8.3 (pydata#7682) Raise PermissionError when insufficient permissions (pydata#7629)
This is an alternative to #6689.
by
variable directly toflox
. Most GroupBy methods however depend on various steps in__init__
, so it became messy.flox
. This simplifies things a lot. Since we'll want to preserve the "for loop over groups" approach forGroupBy.map
, we'll need something like this anyway.The large amount of deleted code in
_flox_reduce
here suggests to me that this is the better approach.I think we could also use this to generalize to multiple groupers:
ravel_muti_index
to generate a singe variable to groupby