DOC: Clarify DataFrame.where tries to maintain dtype of DataFrame when providing fill_value #48373

Closed
mroeschke opened this issue Sep 2, 2022 · 2 comments · Fixed by #48380
Labels: Docs, Dtype Conversions (Unexpected or buggy dtype conversions)

Comments

@mroeschke
Member

In [1]: psr = pd.Series(range(10), dtype="int16")

In [2]: fill_value = 100.0

In [3]: expect = psr.where(psr > 0, fill_value)

pandas 1.4.3:
In [4]: expect
Out[4]:
0    100.0
1      1.0
2      2.0
3      3.0
4      4.0
5      5.0
6      6.0
7      7.0
8      8.0
9      9.0
dtype: float64

pandas 1.5:
In [6]: expect
Out[6]:
0    100
1      1
2      2
3      3
4      4
5      5
6      6
7      7
8      8
9      9
dtype: int16

Bisected to

Author: @jbrockmendel
Date: Sun Jan 30 15:45:55 2022 -0800

REF: avoid upcast/downcast in Block.where (#45582)

Seems like an intentional change, but the `where` docs don't make clear whether the dtype of `fill_value` or the dtype of the DataFrame takes priority.

@mroeschke mroeschke added Bug Dtype Conversions Unexpected or buggy dtype conversions labels Sep 2, 2022
@mroeschke mroeschke added this to the 1.5 milestone Sep 2, 2022
@jbrockmendel
Member

intentional change bc the 100.0 can be cast to int16 losslessly
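
To illustrate the point about lossless casting, here is a minimal sketch that reuses the Series from the report. It assumes pandas 1.5 or later. The `100.5` fill value is an added example that is not part of the original report, and the exact floating dtype it produces may depend on the installed pandas and NumPy versions.

```python
import pandas as pd

psr = pd.Series(range(10), dtype="int16")

# 100.0 has no fractional part and fits in the int16 range, so
# pandas >= 1.5 can store it without changing the Series dtype.
lossless = psr.where(psr > 0, 100.0)

# 100.5 (an assumed counter-example) cannot be stored as an integer,
# so the result has to be upcast to a floating dtype.
lossy = psr.where(psr > 0, 100.5)

print(pd.__version__)
print("fill 100.0 ->", lossless.dtype)
print("fill 100.5 ->", lossy.dtype)
```

On pandas 1.4.3 the first result is `float64` instead, as shown in the report above.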

@mroeschke
Member Author

Okay changing to a documentation issue then.

@mroeschke mroeschke changed the title BUG(?): DataFrame[ints].where(fill_value=float) no longer upcasts results DOC: Clarify DataFrame.where tries to maintain dtype of DataFrame Sep 2, 2022
@mroeschke mroeschke added Docs and removed Bug labels Sep 2, 2022
@mroeschke mroeschke changed the title DOC: Clarify DataFrame.where tries to maintain dtype of DataFrame DOC: Clarify DataFrame.where tries to maintain dtype of DataFrame when providing fill_value Sep 2, 2022
@mroeschke mroeschke removed this from the 1.5 milestone Sep 2, 2022
rapids-bot bot pushed a commit to rapidsai/cudf that referenced this issue Sep 21, 2022
This PR introduces `pandas-1.5` support in `cudf`. The changes include:

- [x] Requires `group_keys` support in `groupby` for `dask_cudf` to work: #11659
- [x] Requires `zfill` updates to match `pandas-1.5` behavior: #11634
- [x] `where` API: Ability to check whether a scalar value fits into the existing dtype, similar to: pandas-dev/pandas#48373
- [x] Switches `ValueError` to `TypeError` when an unknown category is being set to a `CategoricalColumn`
- [x] Handles a breaking change in an `ArrowIntervalType`-related import that caused `cudf` to error on import.
- [x] Fix an issue with `IntervalColumn.to_pandas`.
- [x] Raises error when an object of `boolean` dtype is being set to a `NumericalColumn`.
- [x] Raises error when `pat` is None in `Series.str.startswith` & `Series.str.endswith`.
- [x] Add `IntervalDtype.to_pandas` with appropriate versioning.
- [x] Handle `get_window_bounds` signature changes.
- [x] Fix and version a bunch of pytests.

```text
branch-22.10:

== 4275 failed, 79837 passed, 2049 skipped, 1193 xfailed, 1923 xpassed, 6597 warnings, 4 errors in 1103.52s (0:18:23) ==
== 803 failed, 106 passed, 14 skipped, 14 xfailed, 324 warnings, 17 errors in 148.46s (0:02:28) ==

This PR:

== 84041 passed, 2049 skipped, 1199 xfailed, 1710 xpassed, 6599 warnings in 359.27s (0:05:59) ==
== 954 passed, 14 skipped, 7 xfailed, 3 xpassed, 580 warnings in 54.75s ==
```

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Ashwin Srinath (https://github.com/shwina)
  - Matthew Roeschke (https://github.com/mroeschke)
  - Mark Sadang (https://github.com/msadang)

URL: #11617