[FEA] Support for an additional parameter in is_integer, all_integer, is_float & all_float #5130
Comments
Why is this desired behavior?
@jrhemstad we're using these functions specifically to check whether we can typecast safely, i.e. the expected behavior from the Python side is to throw if a non-integer or non-float is detected, because we can't safely typecast it. On the other hand, we can safely carry over nulls without issue, so the value is irrelevant. Alternatively, we could return
I don't understand why this requires returning If you want to ensure all of the elements in a column are valid integers before typecasting, you run What am I missing?
I believe that currently
Ahhhh, now this makes sense. Okay.
So, in this PR: #5054, we have used
But
>>> import cudf
>>> import numpy as np
>>> import cudf._lib.strings.char_types as c
>>> s = cudf.Series(["10", np.nan, "1"])
>>> cudf.Series(c.is_integer(s._column)).all()
True
>>> s = cudf.Series(["abc", None, "1"])
>>> cudf.Series(c.is_integer(s._column))
0    False
1     null
2     True
dtype: bool
>>> s = cudf.Series(["10", None, "1"])
>>> cudf.Series(c.is_integer(s._column)).all()
True
>>> c.all_integers(s._column)
False
>>> cudf.Series(c.is_float(s._column))
0    True
1    null
2    True
dtype: bool
>>> cudf.Series(c.is_float(s._column)).all()
True
>>> c.all_floats(s._column)
False
So, if this is expected behavior of
Happy to be incorrect here then! Seems like we can just use
Apologies for the confusion, Keith & Jake. Shall we close this issue then, or wait for comments from @jrhemstad? cc: @davidwendt
@galipremsagar I think if we're good to go with using Maybe raise an issue that we should consider removing
Sure 👍 Closing this FEA.
Is your feature request related to a problem? Please describe.
The current APIs (is_integer, all_integer, is_float & all_float) return False if there are null values in the input. But since we differ from pandas in this respect and support null values in int & float columns, we'd like the above APIs to return True when there is a null in the input column.
Describe the solution you'd like
From the discussion we had here: #5054 (comment), it appears that we'd like an additional parameter to support this feature.
Describe alternatives you've considered
This is currently handled on the Python side by performing a dropna and then calling the above APIs.
Additional context
This request came up as part of #5054
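For readers without a GPU at hand, the null handling discussed in this thread can be sketched in plain Python. This is only an illustration, not the cudf or libcudf implementation: the helper names is_integer_elementwise, all_skipping_nulls and all_integer_after_dropna are hypothetical, None stands in for a null, and the regular expression is an assumed definition of "integer string". It contrasts an element-wise check that keeps nulls as nulls, a reduction that skips them, and the dropna-then-check workaround described under alternatives.

```python
import re
from typing import List, Optional

# Assumed definition of an "integer string": optional sign followed by digits.
# The real libcudf rules may differ; this is only for illustration.
_INT_RE = re.compile(r"[+-]?\d+")


def is_integer_elementwise(values: List[Optional[str]]) -> List[Optional[bool]]:
    """Check each element; a null (None) input yields a null (None) result."""
    return [None if v is None else _INT_RE.fullmatch(v) is not None for v in values]


def all_skipping_nulls(flags: List[Optional[bool]]) -> bool:
    """Reduce with all(), ignoring null entries instead of treating them as False."""
    return all(f for f in flags if f is not None)


def all_integer_after_dropna(values: List[Optional[str]]) -> bool:
    """The workaround from this issue: drop nulls first, then check what is left."""
    non_null = [v for v in values if v is not None]
    return all(_INT_RE.fullmatch(v) is not None for v in non_null)


if __name__ == "__main__":
    with_null = ["10", None, "1"]
    print(is_integer_elementwise(with_null))
    print(all_skipping_nulls(is_integer_elementwise(with_null)))
    print(all_integer_after_dropna(with_null))
```

In this sketch, the element-wise check followed by a null-skipping reduction gives the same answer as dropping nulls first, which matches the conclusion reached in the comments that no additional parameter is needed.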