[pull] main from pydata:main #543

Merged: 2 commits, Nov 10, 2023
3 changes: 2 additions & 1 deletion doc/api.rst
@@ -557,6 +557,7 @@ Datetimelike properties
DataArray.dt.seconds
DataArray.dt.microseconds
DataArray.dt.nanoseconds
DataArray.dt.total_seconds

**Timedelta methods**:

@@ -602,7 +603,7 @@ Dataset methods
Dataset.as_numpy
Dataset.from_dataframe
Dataset.from_dict
Dataset.to_array
Dataset.to_dataarray
Dataset.to_dataframe
Dataset.to_dask_dataframe
Dataset.to_dict
2 changes: 1 addition & 1 deletion doc/howdoi.rst
@@ -36,7 +36,7 @@ How do I ...
* - rename a variable, dimension or coordinate
- :py:meth:`Dataset.rename`, :py:meth:`DataArray.rename`, :py:meth:`Dataset.rename_vars`, :py:meth:`Dataset.rename_dims`,
* - convert a DataArray to Dataset or vice versa
- :py:meth:`DataArray.to_dataset`, :py:meth:`Dataset.to_array`, :py:meth:`Dataset.to_stacked_array`, :py:meth:`DataArray.to_unstacked_dataset`
- :py:meth:`DataArray.to_dataset`, :py:meth:`Dataset.to_dataarray`, :py:meth:`Dataset.to_stacked_array`, :py:meth:`DataArray.to_unstacked_dataset`
* - extract variables that have certain attributes
- :py:meth:`Dataset.filter_by_attrs`
* - extract the underlying array (e.g. NumPy or Dask arrays)
12 changes: 6 additions & 6 deletions doc/user-guide/reshaping.rst
@@ -59,11 +59,11 @@ use :py:meth:`~xarray.DataArray.squeeze`
Converting between datasets and arrays
--------------------------------------

To convert from a Dataset to a DataArray, use :py:meth:`~xarray.Dataset.to_array`:
To convert from a Dataset to a DataArray, use :py:meth:`~xarray.Dataset.to_dataarray`:

.. ipython:: python

arr = ds.to_array()
arr = ds.to_dataarray()
arr

This method broadcasts all data variables in the dataset against each other,
@@ -77,7 +77,7 @@ To convert back from a DataArray to a Dataset, use

arr.to_dataset(dim="variable")

The broadcasting behavior of ``to_array`` means that the resulting array
The broadcasting behavior of ``to_dataarray`` means that the resulting array
includes the union of data variable dimensions:

.. ipython:: python
@@ -88,7 +88,7 @@ includes the union of data variable dimensions:
ds2

# the resulting array has 6 elements
ds2.to_array()
ds2.to_dataarray()

Otherwise, the result could not be represented as an orthogonal array.
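The broadcasting behavior described in this hunk can be sketched with a minimal, self-contained example (assuming an xarray release that includes this PR's rename; on older releases the method is spelled ``to_array``):

```python
import xarray as xr

# Variables with different dimensions: "a" is 1-D along x, "b" is a scalar.
ds = xr.Dataset({"a": ("x", [1, 2, 3]), "b": 4})

# to_dataarray() broadcasts the data variables against each other before
# stacking them along the new "variable" dimension, so the scalar "b" is
# tiled along x and the result has shape (2, 3).
arr = ds.to_dataarray()
print(arr.shape)                     # (2, 3)
print(arr.sel(variable="b").values)  # [4 4 4] -- the scalar, broadcast
```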

@@ -161,8 +161,8 @@ arrays as inputs. For datasets with only one variable, we only need ``stack``
and ``unstack``, but combining multiple variables in a
:py:class:`xarray.Dataset` is more complicated. If the variables in the dataset
have matching numbers of dimensions, we can call
:py:meth:`~xarray.Dataset.to_array` and then stack along the new coordinate.
But :py:meth:`~xarray.Dataset.to_array` will broadcast the dataarrays together,
:py:meth:`~xarray.Dataset.to_dataarray` and then stack along the new coordinate.
But :py:meth:`~xarray.Dataset.to_dataarray` will broadcast the dataarrays together,
which will effectively tile the lower dimensional variable along the missing
dimensions. The method :py:meth:`xarray.Dataset.to_stacked_array` allows
combining variables of differing dimensions without this wasteful copying while
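To make the contrast with broadcasting concrete, here is a small sketch of ``to_stacked_array`` (hypothetical data; this method's signature is unchanged by this PR):

```python
import xarray as xr

# "a" is 2-D over (x, y) while "b" is 1-D over x only.
ds = xr.Dataset({"a": (("x", "y"), [[0, 1], [2, 3]]), "b": ("x", [4, 5])})

# to_stacked_array keeps x as the sample dimension and concatenates the
# remaining feature dimensions into the new "z" dimension: "a" contributes
# 2 columns and "b" 1, so the result is (2, 3) rather than the tiled
# (2, 2, 2) that broadcasting would produce.
stacked = ds.to_stacked_array("z", sample_dims=["x"])
print(stacked.shape)  # (2, 3)
```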
14 changes: 11 additions & 3 deletions doc/whats-new.rst
@@ -24,6 +24,8 @@ New Features

- Use `opt_einsum <https://optimized-einsum.readthedocs.io/en/stable/>`_ for :py:func:`xarray.dot` by default if installed.
By `Deepak Cherian <https://github.com/dcherian>`_. (:issue:`7764`, :pull:`8373`).
- Add ``DataArray.dt.total_seconds()`` method to match the Pandas API. (:pull:`8435`).
By `Ben Mares <https://github.com/maresb>`_.

Breaking changes
~~~~~~~~~~~~~~~~
@@ -39,6 +41,12 @@ Deprecations
this was one place in the API where dimension positions were used.
(:pull:`8341`)
By `Maximilian Roos <https://github.com/max-sixty>`_.
- Rename :py:meth:`Dataset.to_array` to :py:meth:`Dataset.to_dataarray` for
consistency with :py:meth:`DataArray.to_dataset` &
:py:func:`open_dataarray` functions. This is a "soft" deprecation — the
existing methods work and don't raise any warnings, given the relatively small
benefits of the change.
By `Maximilian Roos <https://github.com/max-sixty>`_.

Bug fixes
~~~~~~~~~
@@ -6707,7 +6715,7 @@ Backwards incompatible changes
Enhancements
~~~~~~~~~~~~

- New ``xray.Dataset.to_array`` and enhanced
- New ``xray.Dataset.to_dataarray`` and enhanced
``xray.DataArray.to_dataset`` methods make it easy to switch back
and forth between arrays and datasets:

@@ -6718,8 +6726,8 @@ Enhancements
coords={"c": 42},
attrs={"Conventions": "None"},
)
ds.to_array()
ds.to_array().to_dataset(dim="variable")
ds.to_dataarray()
ds.to_dataarray().to_dataset(dim="variable")

- New ``xray.Dataset.fillna`` method to fill missing values, modeled
off the pandas method of the same name:
14 changes: 14 additions & 0 deletions xarray/core/accessor_dt.py
@@ -74,6 +74,8 @@ def _access_through_series(values, name):
if name == "season":
months = values_as_series.dt.month.values
field_values = _season_from_months(months)
elif name == "total_seconds":
field_values = values_as_series.dt.total_seconds().values
elif name == "isocalendar":
# special NaT-handling can be removed when
# https://github.com/pandas-dev/pandas/issues/54657 is resolved
@@ -574,6 +576,13 @@ class TimedeltaAccessor(TimeAccessor[T_DataArray]):
43200, 64800])
Coordinates:
* time (time) timedelta64[ns] 1 days 00:00:00 ... 5 days 18:00:00
>>> ts.dt.total_seconds()
<xarray.DataArray 'total_seconds' (time: 20)>
array([ 86400., 108000., 129600., 151200., 172800., 194400., 216000.,
237600., 259200., 280800., 302400., 324000., 345600., 367200.,
388800., 410400., 432000., 453600., 475200., 496800.])
Coordinates:
* time (time) timedelta64[ns] 1 days 00:00:00 ... 5 days 18:00:00
"""

@property
@@ -596,6 +605,11 @@ def nanoseconds(self) -> T_DataArray:
"""Number of nanoseconds (>= 0 and less than 1 microsecond) for each element"""
return self._date_field("nanoseconds", np.int64)

# Not defined as a property in order to match the Pandas API
def total_seconds(self) -> T_DataArray:
"""Total duration of each element expressed in seconds."""
return self._date_field("total_seconds", np.float64)


class CombinedDatetimelikeAccessor(
DatetimeAccessor[T_DataArray], TimedeltaAccessor[T_DataArray]
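The new accessor can be exercised much like the docstring example above (a sketch assuming an xarray release containing this PR; the timedelta values are illustrative):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Timedeltas at 6-hour spacing, starting at 1 day.
time = pd.to_timedelta(24 + 6 * np.arange(4), unit="h")
ts = xr.DataArray(time, dims="time")

# Unlike .seconds (which is only the sub-day component, >= 0 and < 86400),
# total_seconds() returns the full duration of each element as float64,
# matching the pandas API.
print(ts.dt.total_seconds().values)
# [ 86400. 108000. 129600. 151200.]
```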
2 changes: 1 addition & 1 deletion xarray/core/common.py
@@ -1173,7 +1173,7 @@ def _dataset_indexer(dim: Hashable) -> DataArray:
var for var in cond if dim not in cond[var].dims
)
keepany = cond_wdim.any(dim=(d for d in cond.dims.keys() if d != dim))
return keepany.to_array().any("variable")
return keepany.to_dataarray().any("variable")

_get_indexer = (
_dataarray_indexer if isinstance(cond, DataArray) else _dataset_indexer
4 changes: 3 additions & 1 deletion xarray/core/computation.py
@@ -1603,7 +1603,9 @@ def cross(
>>> ds_a = xr.Dataset(dict(x=("dim_0", [1]), y=("dim_0", [2]), z=("dim_0", [3])))
>>> ds_b = xr.Dataset(dict(x=("dim_0", [4]), y=("dim_0", [5]), z=("dim_0", [6])))
>>> c = xr.cross(
... ds_a.to_array("cartesian"), ds_b.to_array("cartesian"), dim="cartesian"
... ds_a.to_dataarray("cartesian"),
... ds_b.to_dataarray("cartesian"),
... dim="cartesian",
... )
>>> c.to_dataset(dim="cartesian")
<xarray.Dataset>
14 changes: 10 additions & 4 deletions xarray/core/dataset.py
@@ -1502,7 +1502,7 @@ def __array__(self, dtype=None):
"cannot directly convert an xarray.Dataset into a "
"numpy array. Instead, create an xarray.DataArray "
"first, either with indexing on the Dataset or by "
"invoking the `to_array()` method."
"invoking the `to_dataarray()` method."
)

@property
@@ -5260,7 +5260,7 @@ def to_stacked_array(
"""Combine variables of differing dimensionality into a DataArray
without broadcasting.

This method is similar to Dataset.to_array but does not broadcast the
This method is similar to Dataset.to_dataarray but does not broadcast the
variables.

Parameters
@@ -5289,7 +5289,7 @@

See Also
--------
Dataset.to_array
Dataset.to_dataarray
Dataset.stack
DataArray.to_unstacked_dataset

@@ -7019,7 +7019,7 @@ def assign(

return data

def to_array(
def to_dataarray(
self, dim: Hashable = "variable", name: Hashable | None = None
) -> DataArray:
"""Convert this dataset into an xarray.DataArray
@@ -7056,6 +7056,12 @@ def to_array(

return DataArray._construct_direct(variable, coords, name, indexes)

def to_array(
self, dim: Hashable = "variable", name: Hashable | None = None
) -> DataArray:
"""Deprecated version of to_dataarray"""
return self.to_dataarray(dim=dim, name=name)

def _normalize_dim_order(
self, dim_order: Sequence[Hashable] | None = None
) -> dict[Hashable, int]:
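As the shim above shows, the old spelling simply delegates to the new one. A quick sketch of the equivalence (assuming a release containing this PR; per the "soft" deprecation, no warning is raised):

```python
import xarray as xr

ds = xr.Dataset({"a": 1, "b": ("x", [1, 2, 3])}, coords={"c": 42})

# to_array() now forwards to to_dataarray(), so the two calls produce
# identical results.
old = ds.to_array()
new = ds.to_dataarray()
print(old.identical(new))  # True
```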
4 changes: 4 additions & 0 deletions xarray/core/groupby.py
@@ -251,6 +251,10 @@ def to_dataarray(self) -> DataArray:
data=self.data, dims=(self.name,), coords=self.coords, name=self.name
)

def to_array(self) -> DataArray:
"""Deprecated version of to_dataarray."""
return self.to_dataarray()


T_Group = Union["T_DataArray", "IndexVariable", _DummyGroup]

14 changes: 14 additions & 0 deletions xarray/tests/test_accessor_dt.py
@@ -6,6 +6,7 @@

import xarray as xr
from xarray.tests import (
assert_allclose,
assert_array_equal,
assert_chunks_equal,
assert_equal,
@@ -100,6 +101,19 @@ def test_field_access(self, field) -> None:
assert expected.dtype == actual.dtype
assert_identical(expected, actual)

def test_total_seconds(self) -> None:
# Subtract a value in the middle of the range to ensure that some values
# are negative
delta = self.data.time - np.datetime64("2000-01-03")
actual = delta.dt.total_seconds()
expected = xr.DataArray(
np.arange(-48, 52, dtype=np.float64) * 3600,
name="total_seconds",
coords=[self.data.time],
)
# This works with assert_identical when pandas is >=1.5.0.
assert_allclose(expected, actual)

@pytest.mark.parametrize(
"field, pandas_field",
[
8 changes: 4 additions & 4 deletions xarray/tests/test_concat.py
@@ -1070,10 +1070,10 @@ def test_concat_fill_value(self, fill_value) -> None:
def test_concat_join_kwarg(self) -> None:
ds1 = Dataset(
{"a": (("x", "y"), [[0]])}, coords={"x": [0], "y": [0]}
).to_array()
).to_dataarray()
ds2 = Dataset(
{"a": (("x", "y"), [[0]])}, coords={"x": [1], "y": [0.0001]}
).to_array()
).to_dataarray()

expected: dict[JoinOptions, Any] = {}
expected["outer"] = Dataset(
@@ -1101,7 +1101,7 @@ def test_concat_join_kwarg(self) -> None:

for join in expected:
actual = concat([ds1, ds2], join=join, dim="x")
assert_equal(actual, expected[join].to_array())
assert_equal(actual, expected[join].to_dataarray())

def test_concat_combine_attrs_kwarg(self) -> None:
da1 = DataArray([0], coords=[("x", [0])], attrs={"b": 42})
@@ -1224,7 +1224,7 @@ def test_concat_preserve_coordinate_order() -> None:

def test_concat_typing_check() -> None:
ds = Dataset({"foo": 1}, {"bar": 2})
da = Dataset({"foo": 3}, {"bar": 4}).to_array(dim="foo")
da = Dataset({"foo": 3}, {"bar": 4}).to_dataarray(dim="foo")

# concatenate a list of non-homogeneous types must raise TypeError
with pytest.raises(
16 changes: 8 additions & 8 deletions xarray/tests/test_dask.py
@@ -608,11 +608,11 @@ def test_to_dataset_roundtrip(self):
v = self.lazy_array

expected = u.assign_coords(x=u["x"])
self.assertLazyAndEqual(expected, v.to_dataset("x").to_array("x"))
self.assertLazyAndEqual(expected, v.to_dataset("x").to_dataarray("x"))

def test_merge(self):
def duplicate_and_merge(array):
return xr.merge([array, array.rename("bar")]).to_array()
return xr.merge([array, array.rename("bar")]).to_dataarray()

expected = duplicate_and_merge(self.eager_array)
actual = duplicate_and_merge(self.lazy_array)
@@ -1306,12 +1306,12 @@ def test_map_blocks_kwargs(obj):
assert_identical(actual, expected)


def test_map_blocks_to_array(map_ds):
def test_map_blocks_to_dataarray(map_ds):
with raise_if_dask_computes():
actual = xr.map_blocks(lambda x: x.to_array(), map_ds)
actual = xr.map_blocks(lambda x: x.to_dataarray(), map_ds)

# to_array does not preserve name, so cannot use assert_identical
assert_equal(actual, map_ds.to_array())
# to_dataarray does not preserve name, so cannot use assert_identical
assert_equal(actual, map_ds.to_dataarray())


@pytest.mark.parametrize(
@@ -1376,8 +1376,8 @@ def test_map_blocks_template_convert_object():
assert_identical(actual, template)

ds = da.to_dataset()
func = lambda x: x.to_array().isel(x=[1])
template = ds.to_array().isel(x=[1, 5, 9])
func = lambda x: x.to_dataarray().isel(x=[1])
template = ds.to_dataarray().isel(x=[1, 5, 9])
with raise_if_dask_computes():
actual = xr.map_blocks(func, ds, template=template)
assert_identical(actual, template)
4 changes: 2 additions & 2 deletions xarray/tests/test_dataarray.py
@@ -3801,7 +3801,7 @@ def test_to_dataset_split(self) -> None:
with pytest.raises(TypeError):
array.to_dataset("x", name="foo")

roundtripped = actual.to_array(dim="x")
roundtripped = actual.to_dataarray(dim="x")
assert_identical(array, roundtripped)

array = DataArray([1, 2, 3], dims="x")
@@ -3818,7 +3818,7 @@ def test_to_dataset_retains_keys(self) -> None:
array = DataArray([1, 2, 3], coords=[("x", dates)], attrs={"a": 1})

# convert to dateset and back again
result = array.to_dataset("x").to_array(dim="x")
result = array.to_dataset("x").to_dataarray(dim="x")

assert_equal(array, result)

6 changes: 3 additions & 3 deletions xarray/tests/test_dataset.py
@@ -4569,7 +4569,7 @@ def test_squeeze_drop(self) -> None:
selected = data.squeeze(drop=True)
assert_identical(data, selected)

def test_to_array(self) -> None:
def test_to_dataarray(self) -> None:
ds = Dataset(
{"a": 1, "b": ("x", [1, 2, 3])},
coords={"c": 42},
@@ -4579,10 +4579,10 @@ def test_to_dataset_split(self) -> None:
coords = {"c": 42, "variable": ["a", "b"]}
dims = ("variable", "x")
expected = DataArray(data, coords, dims, attrs=ds.attrs)
actual = ds.to_array()
actual = ds.to_dataarray()
assert_identical(expected, actual)

actual = ds.to_array("abc", name="foo")
actual = ds.to_dataarray("abc", name="foo")
expected = expected.rename({"variable": "abc"}).rename("foo")
assert_identical(expected, actual)

Expand Down
6 changes: 3 additions & 3 deletions xarray/tests/test_groupby.py
@@ -600,19 +600,19 @@ def test_groupby_grouping_errors() -> None:
with pytest.raises(
ValueError, match=r"None of the data falls within bins with edges"
):
dataset.to_array().groupby_bins("x", bins=[0.1, 0.2, 0.3])
dataset.to_dataarray().groupby_bins("x", bins=[0.1, 0.2, 0.3])

with pytest.raises(ValueError, match=r"All bin edges are NaN."):
dataset.groupby_bins("x", bins=[np.nan, np.nan, np.nan])

with pytest.raises(ValueError, match=r"All bin edges are NaN."):
dataset.to_array().groupby_bins("x", bins=[np.nan, np.nan, np.nan])
dataset.to_dataarray().groupby_bins("x", bins=[np.nan, np.nan, np.nan])

with pytest.raises(ValueError, match=r"Failed to group data."):
dataset.groupby(dataset.foo * np.nan)

with pytest.raises(ValueError, match=r"Failed to group data."):
dataset.to_array().groupby(dataset.foo * np.nan)
dataset.to_dataarray().groupby(dataset.foo * np.nan)


def test_groupby_reduce_dimension_error(array) -> None:
2 changes: 1 addition & 1 deletion xarray/tests/test_rolling.py
@@ -631,7 +631,7 @@ def test_rolling_construct(self, center: bool, window: int) -> None:
ds_rolling_mean = ds_rolling.construct("window", stride=2, fill_value=0.0).mean(
"window"
)
assert (ds_rolling_mean.isnull().sum() == 0).to_array(dim="vars").all()
assert (ds_rolling_mean.isnull().sum() == 0).to_dataarray(dim="vars").all()
assert (ds_rolling_mean["x"] == 0.0).sum() >= 0

@pytest.mark.parametrize("center", (True, False))