Allow ellipsis to be used in stack #3826

Merged
merged 12 commits into from
Mar 19, 2020
8 changes: 8 additions & 0 deletions doc/reshaping.rst
@@ -109,6 +109,13 @@ implemented :py:meth:`~xarray.DataArray.stack` and
    stacked
    stacked.unstack('z')

As elsewhere in xarray, an ellipsis (`...`) can be used to represent all unlisted dimensions:

.. ipython:: python

    stacked = array.stack(z=[..., "x"])
    stacked

These methods are modeled on the :py:class:`pandas.DataFrame` methods of the
same name, although in xarray they always create new dimensions rather than
adding to the existing index or columns.
@@ -164,6 +171,7 @@ like this:
                   'b': ('x', [6, 7])},
        coords={'y': ['u', 'v', 'w']}
    )
    data
    stacked = data.to_stacked_array("z", sample_dims=['x'])
    stacked
    unstacked = stacked.to_unstacked_dataset("z")
10 changes: 8 additions & 2 deletions doc/whats-new.rst
@@ -39,13 +39,19 @@ New Features
  By `Justus Magin <https://github.com/keewis>`_.
- :py:meth:`Dataset.groupby` and :py:meth:`DataArray.groupby` now raise a
  `TypeError` on multiple string arguments. Receiving multiple string arguments
  often means a user is attempting to pass multiple dimensions to group over
  and should instead pass a list.
  often means a user is attempting to pass multiple dimensions as separate
  arguments and should instead pass a single list of dimensions.
  (:pull:`3802`)
  By `Maximilian Roos <https://github.com/max-sixty>`_.
- The new ``Dataset._repr_html_`` and ``DataArray._repr_html_`` (introduced
  in 0.14.1) is now on by default. To disable, use
  ``xarray.set_options(display_style="text")``.
  By `Julia Signell <https://github.com/jsignell>`_.
- An ellipsis (``...``) is now supported in the ``dims`` argument of
  :py:meth:`Dataset.stack` and :py:meth:`DataArray.stack`, meaning all
  unlisted dimensions, similar to its meaning in :py:meth:`DataArray.transpose`.
  (:pull:`3826`)
  By `Maximilian Roos <https://github.com/max-sixty>`_.
- :py:meth:`Dataset.where` and :py:meth:`DataArray.where` accept a lambda as a
  first argument, which is then called on the input; replicating pandas' behavior.
  By `Maximilian Roos <https://github.com/max-sixty>`_.
4 changes: 3 additions & 1 deletion xarray/core/dataarray.py
@@ -1709,7 +1709,9 @@ def stack(
         ----------
         dimensions : Mapping of the form new_name=(dim1, dim2, ...)
             Names of new dimensions, and the existing dimensions that they
-            replace.
+            replace. An ellipsis (`...`) will be replaced by all unlisted dimensions.
+            Passing a list containing an ellipsis (`stacked_dim=[...]`) will stack over
+            all dimensions.
         **dimensions_kwargs:
             The keyword arguments form of ``dimensions``.
             One of dimensions or dimensions_kwargs must be provided.
7 changes: 6 additions & 1 deletion xarray/core/dataset.py
@@ -87,6 +87,7 @@
    decode_numpy_dict_values,
    either_dict_or_kwargs,
    hashable,
    infix_dims,
    is_dict_like,
    is_scalar,
    maybe_wrap_array,
@@ -3262,6 +3263,8 @@ def reorder_levels(
        return self._replace(variables, indexes=indexes)

    def _stack_once(self, dims, new_dim):
        if ... in dims:
            dims = list(infix_dims(dims, self.dims))
        variables = {}
        for name, var in self.variables.items():
            if name not in dims:
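The two lines added to `_stack_once` delegate ellipsis handling to `infix_dims`, whose body is not part of this diff. As a rough illustration of the behaviour the docstrings describe (an ellipsis stands for every dimension not listed explicitly), here is a standalone sketch. `expand_ellipsis` is a hypothetical helper written for this example; it is not xarray's implementation, and its error handling is an assumption.

```python
from typing import Hashable, Iterable, List


def expand_ellipsis(
    dims_supplied: Iterable[Hashable], dims_all: Iterable[Hashable]
) -> List[Hashable]:
    """Replace a single ``...`` in dims_supplied with all unlisted dims.

    Hypothetical helper for illustration only; not xarray's ``infix_dims``.
    """
    supplied = list(dims_supplied)
    n_ellipsis = sum(1 for d in supplied if d is ...)
    if n_ellipsis == 0:
        return supplied
    if n_ellipsis > 1:
        # Assumption: more than one ellipsis is treated as an error.
        raise ValueError("at most one ellipsis may be supplied")
    # Unlisted dimensions keep the order in which they appear in dims_all.
    unlisted = [d for d in dims_all if d not in supplied]
    expanded: List[Hashable] = []
    for d in supplied:
        if d is ...:
            expanded.extend(unlisted)
        else:
            expanded.append(d)
    return expanded


print(expand_ellipsis([..., "x"], ["x", "y", "w"]))  # ['y', 'w', 'x']
print(expand_ellipsis((...,), ["x", "y"]))           # ['x', 'y']
```

Under this reading, `z=[...]` and `z=(...,)` both expand to every dimension in its existing order, which is consistent with the tests in this PR expecting the same result as `z=["x", "y"]`.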
@@ -3304,7 +3307,9 @@ def stack(
         ----------
         dimensions : Mapping of the form new_name=(dim1, dim2, ...)
             Names of new dimensions, and the existing dimensions that they
-            replace.
+            replace. An ellipsis (`...`) will be replaced by all unlisted dimensions.
+            Passing a list containing an ellipsis (`stacked_dim=[...]`) will stack over
+            all dimensions.
         **dimensions_kwargs:
             The keyword arguments form of ``dimensions``.
             One of dimensions or dimensions_kwargs must be provided.
3 changes: 3 additions & 0 deletions xarray/tests/test_dataarray.py
@@ -2040,6 +2040,9 @@ def test_stack_unstack(self):
        actual = orig.stack(z=["x", "y"]).unstack("z").drop_vars(["x", "y"])
        assert_identical(orig, actual)

        actual = orig.stack(z=[...]).unstack("z").drop_vars(["x", "y"])
        assert_identical(orig, actual)

        dims = ["a", "b", "c", "d", "e"]
        orig = xr.DataArray(np.random.rand(1, 2, 3, 2, 1), dims=dims)
        stacked = orig.stack(ab=["a", "b"], cd=["c", "d"])
11 changes: 11 additions & 0 deletions xarray/tests/test_dataset.py
@@ -2879,6 +2879,17 @@ def test_stack(self):
        actual = ds.stack(z=["x", "y"])
        assert_identical(expected, actual)

        actual = ds.stack(z=[...])
        assert_identical(expected, actual)

        # non list dims with ellipsis
        actual = ds.stack(z=(...,))
        assert_identical(expected, actual)

        # ellipsis with given dim
        actual = ds.stack(z=[..., "y"])
        assert_identical(expected, actual)

        exp_index = pd.MultiIndex.from_product([["a", "b"], [0, 1]], names=["y", "x"])
        expected = Dataset(
            {"a": ("z", [0, 1, 0, 1]), "b": ("z", [0, 2, 1, 3]), "z": exp_index}
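The expected values at the end of this hunk depend on the order in which the stacked dimensions are listed: the first listed dimension varies slowest. The pure-Python sketch below reproduces the `["y", "x"]` ordering without xarray or pandas. The inputs for `a` and `b` are assumptions chosen to be consistent with the expected output shown in the hunk; the test's actual fixture lies outside this diff.

```python
from itertools import product

# Assumed inputs (not shown in the diff): "a" varies along x only,
# and "b" is indexed as b[x][y].
x_coords = [0, 1]
y_coords = ["a", "b"]
a = [0, 1]
b = [[0, 1], [2, 3]]

# Stacking over ("y", "x"): y varies slowest and x fastest, the same
# ordering that pd.MultiIndex.from_product([y_coords, x_coords]) produces.
index = list(product(y_coords, x_coords))

# "a" has no y dimension, so its values repeat for every y.
stacked_a = [a[x_coords.index(x)] for _, x in index]
stacked_b = [b[x_coords.index(x)][y_coords.index(y)] for y, x in index]

print(index)      # [('a', 0), ('a', 1), ('b', 0), ('b', 1)]
print(stacked_a)  # [0, 1, 0, 1]
print(stacked_b)  # [0, 2, 1, 3]
```

Because an ellipsis expands to the unlisted dimensions in their existing order, its position relative to the named dimensions determines this ordering as well.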