-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fill_value in shift #2470
fill_value in shift #2470
Conversation
Hello @max-sixty! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on December 27, 2018 at 22:49 Hours UTC |
xarray/core/dataarray.py
Outdated
ds = self._to_temp_dataset().shift(shifts=shifts, **shifts_kwargs) | ||
return self._from_temp_dataset(ds) | ||
return self._replace(variable=self.variable.shift( | ||
shifts=shifts, fill_value=fill_value, **shifts_kwargs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this a reasonable change for DataArray
operations that only change their data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is 100% equivalent -- I don't really care either way. But perhaps splitting this into two statements would be clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great -- I have only a few minor points.
xarray/core/dataarray.py
Outdated
ds = self._to_temp_dataset().shift(shifts=shifts, **shifts_kwargs) | ||
return self._from_temp_dataset(ds) | ||
return self._replace(variable=self.variable.shift( | ||
shifts=shifts, fill_value=fill_value, **shifts_kwargs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is 100% equivalent -- I don't really care either way. But perhaps splitting this into two statements would be clearer.
xarray/core/variable.py
Outdated
@@ -940,7 +940,11 @@ def _shift_one_dim(self, dim, count): | |||
keep = slice(None) | |||
|
|||
trimmed_data = self[(slice(None),) * axis + (keep,)].data | |||
dtype, fill_value = dtypes.maybe_promote(self.dtype) | |||
|
|||
if fill_value is None or fill_value is np.nan: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use dtypes.NA
as the default value for fill_value
, and then copy these lines from pad
to ensure that this works for arbitrary fill_values such as None
:
https://github.com/pydata/xarray/blob/master/xarray/core/variable.py#L1012-L1015
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw that, but it means that if someone passes np.nan
(the value, not the dtypes.NA
default) to a int dtype, I get -9223372036854775808
. Should we raise in that case? Or I'll see what I can do to coerce.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I'm pretty sure this would qualify as a NumPy bug :(.
I'm mostly concerned with duplicate logic that works slightly differently, so if you want to try doing this differently I'm open to it. It might actually make sense to replace most of this method to a call to Variable.pad_with_fill_value()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm mostly concerned with duplicate logic that works slightly differently
💯
Great, I'll have a look and report back
if fill_value is dtypes.NA and True: | ||
dtype, fill_value = dtypes.maybe_promote(self.dtype) | ||
else: | ||
dtype = self.dtype |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if filler is not compatible with self.dtype?
For example, feeding np.nan to an int array.
Probably it is a part of user responsibility and we do not need to take care of this, but I am just curious of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, NumPy should raise an error... But it may not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(this is the issue I'm looking at ref #2470 (comment)), good foresight @fujiisoup !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me (except for the redundant clause in the if
condition)
if fill_value is dtypes.NA and True: | ||
dtype, fill_value = dtypes.maybe_promote(self.dtype) | ||
else: | ||
dtype = self.dtype |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, NumPy should raise an error... But it may not.
I think this is in a decent place. If Let me know thoughts! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me (but the what's new note is now in the wrong place)
* master: DEP: drop python 2 support and associated ci mods (pydata#2637) TST: silence warnings from bottleneck (pydata#2638) revert to dev version DOC: fix docstrings and doc build for 0.11.1 Source encoding always set when opening datasets (pydata#2626) Add flake check to travis (pydata#2632) Fix dayofweek and dayofyear attributes from dates generated by cftime_range (pydata#2633) silence import warning (pydata#2635) fill_value in shift (pydata#2470) Flake fixed (pydata#2629) Allow passing of positional arguments in `apply` for Groupby objects (pydata#2413) Fix failure in time encoding for pandas < 0.21.1 (pydata#2630) Fix multiindex selection (pydata#2621) Close files when CachingFileManager is garbage collected (pydata#2595) added some logic to deal with rasterio objects in addition to filepaths (pydata#2589) Get 0d slices of ndarrays directly from indexing (pydata#2625) FIX Don't raise a deprecation warning for xarray.ufuncs.{angle,iscomplex} (pydata#2615) CF: also decode time bounds when available (pydata#2571)
whats-new.rst
for all changes andapi.rst
for new API (remove if this change should not be visible to users, e.g., if it is an internal clean-up, or if this is part of a larger project that will be documented later)Should we be more defensive around which fill_values can be passed? Currently, if the array and float values have incompatible dtypes, we don't preemtively warn or cast, apart from the case of
np.nan
, which then uses the default filler