[BUG] dask_cudf and cudf not able to cast to str dtype if using dict #8616

Closed
kshitizgupta21 opened this issue Jun 27, 2021 · 0 comments · Fixed by #8618
Labels: bug (Something isn't working), Python (Affects Python cuDF API.)

Describe the bug
Both dask_cudf and cudf raise a TypeError when .astype is given a dict that casts a column to the string dtype.

Steps/Code to reproduce bug

import dask_cudf
import cudf

df = cudf.DataFrame({'col_a': [1, 2, 4],
                     'col_b': ['11', '22', '44']})

ddf = dask_cudf.from_cudf(df, npartitions=2)

# Cast col_a to string and col_b to int
ddf.astype({'col_a': str,
            'col_b': int})
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-111-523edc1889ba> in <module>
----> 1 ddf.astype({'col_a': 'str', 'col_b' : int})

/opt/conda/lib/python3.8/site-packages/dask/dataframe/core.py in astype(self, dtype)
   2666             meta = self._meta_nonempty.astype(dtype)
   2667         else:
-> 2668             meta = self._meta.astype(dtype)
   2669         if hasattr(dtype, "items"):
   2670             set_unknown = [

/opt/conda/lib/python3.8/site-packages/cudf/core/dataframe.py in astype(self, dtype, copy, errors, **kwargs)
   1208             for col_name in current_cols:
   1209                 if col_name in dtype:
-> 1210                     result._data[col_name] = self._data[col_name].astype(
   1211                         dtype=dtype[col_name],
   1212                         errors=errors,

/opt/conda/lib/python3.8/site-packages/cudf/core/column/column.py in astype(self, dtype, **kwargs)
   1040             str,
   1041         }:
-> 1042             return self.as_string_column(dtype, **kwargs)
   1043         elif is_list_dtype(dtype):
   1044             if not self.dtype == dtype:

TypeError: as_string_column() got an unexpected keyword argument 'errors'
Calling astype directly on the cuDF DataFrame raises the same TypeError:

df.astype({'col_a': str,
           'col_b': int})
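
Reading the traceback, the dict branch of DataFrame.astype appears to pass errors= down to the column-level astype, which forwards its **kwargs to as_string_column, and that method does not accept the keyword. Below is a minimal pure-Python sketch of that mechanism. All three functions are hypothetical stand-ins written for illustration, not cuDF's actual code, and the tolerant variant at the end is only one way such a mismatch could be resolved (the change in #8618 may differ).

```python
# Hypothetical stand-ins that mimic the call chain shown in the traceback.
# None of these are cuDF's real implementations.

def as_string_column(values, dtype):
    # Accepts no extra keywords, like the method named in the TypeError.
    return [str(v) for v in values]


def as_string_column_tolerant(values, dtype, **kwargs):
    # Assumed alternative: accept and ignore extra keywords such as errors=.
    return [str(v) for v in values]


def column_astype(values, dtype, to_string=as_string_column, **kwargs):
    # Forwards every extra keyword to the string conversion.
    if dtype in ("str", str):
        return to_string(values, dtype, **kwargs)
    return [dtype(v) for v in values]


def frame_astype(data, dtypes, errors="raise", to_string=as_string_column):
    # Dict branch: errors= is passed down for each column listed in dtypes.
    result = {}
    for name, col in data.items():
        if name in dtypes:
            result[name] = column_astype(
                col, dtypes[name], to_string=to_string, errors=errors
            )
        else:
            result[name] = col
    return result


data = {"col_a": [1, 2, 4], "col_b": ["11", "22", "44"]}
spec = {"col_a": str, "col_b": int}

# The strict string conversion rejects the forwarded keyword.
try:
    frame_astype(data, spec)
    failure = None
except TypeError as exc:
    failure = str(exc)

# The tolerant variant lets the same call complete.
converted = frame_astype(data, spec, to_string=as_string_column_tolerant)
```

In this sketch the first call fails on col_a before col_b is ever touched, which matches the report: the int cast is not the problem, only the string cast reached through the dict path.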

Expected behavior
The cast should succeed, with col_a converted to string and col_b converted to int.

Environment details
cuDF v0.19.1, run through the NVTabular Merlin v0.5.3 NGC Merlin container.

kshitizgupta21 added the "Needs Triage" and "bug" labels on Jun 27, 2021.
shwina added the "Python" label and removed the "Needs Triage" label on Jun 28, 2021.
shwina self-assigned this on Jun 28, 2021.