Add multiple columns to dataframe: Object type error. #4247

Closed
iisri-vu opened this issue Feb 22, 2022 · 8 comments

@iisri-vu

iisri-vu commented Feb 22, 2022

System information

  • OS Platform and Distribution: Windows 10
  • Modin version (modin.__version__): 0.13.2
  • Python version: Python 3.8
  • Code we can use to reproduce:

cola = 1
colb = 2
colc = 5
columns= ['a', 'b', 'c']
df.loc[df['userid'].values == id, columns] = cola, colb, colc

Describe the problem

Running the code above raises an error:
KeyError(array(['a'], dtype='<U6'))

Source code / logs

@mvashishtha
Collaborator

mvashishtha commented Feb 22, 2022

@iisri-vu thank you for reporting the issue! I can't see how you initialize df in your example, but here's a small example that I think reproduces a similar error on the latest Modin version, 0.13.0+35.g47b8a1ce:

import modin.pandas as pd
df = pd.DataFrame({"a": [1]})
df.loc[:, "b"] = 2

I get a stack trace ending with KeyError: array(['b'], dtype='<U1'):

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Input In [58], in <module>
----> 1 df.loc[:, "b"] = 2

File ~/modin/modin/pandas/indexing.py:741, in _LocIndexer.__setitem__(self, key, item)
    739     self.qc = self.df._query_compiler
    740 else:
--> 741     row_lookup, col_lookup = self._compute_lookup(row_loc, col_loc)
    742     super(_LocIndexer, self).__setitem__(
    743         row_lookup,
    744         col_lookup,
   (...)
    748         ),
    749     )

File ~/modin/modin/pandas/indexing.py:868, in _LocIndexer._compute_lookup(self, row_loc, col_loc)
    858     if missing_mask.any():
    859         missing_labels = (
    860             # Converting `axis_loc` to maskable `np.array` to not fail
    861             # on masking non-maskable list-like
   (...)
    866             else axis_loc
    867         )
--> 868         raise KeyError(missing_labels)
    870 if isinstance(axis_lookup, pandas.Index) and not is_range_like(axis_lookup):
    871     axis_lookup = axis_lookup.values

KeyError: array(['b'], dtype='<U1')

In contrast, pandas adds the new column b, so the dataframe looks like:

   a  b
0  1  2
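
The difference between the two behaviors can be sketched with plain pandas. The helper `strict_lookup` below is hypothetical and is not Modin's code: it only mimics the pattern visible in the stack trace (build a mask of missing labels, then raise `KeyError` with them), assuming labels are resolved with `Index.get_indexer_for`. The last lines show the pandas behavior described above.

```python
import numpy as np
import pandas as pd


def strict_lookup(axis_labels, axis_loc):
    """Hypothetical helper: map labels to positions and reject unknown labels."""
    # Normalize a scalar or list-like of labels to a 1-D array so it can be masked.
    axis_loc = np.atleast_1d(np.asarray(axis_loc))
    positions = axis_labels.get_indexer_for(axis_loc)
    missing_mask = positions == -1  # -1 marks labels absent from the axis
    if missing_mask.any():
        raise KeyError(axis_loc[missing_mask])
    return positions


df = pd.DataFrame({"a": [1]})

# A strict lookup refuses the label "b" because the column does not exist yet.
try:
    strict_lookup(df.columns, "b")
except KeyError as err:
    missing = err.args[0]
    print("strict lookup rejects:", missing)

# pandas instead enlarges the frame when a single new column label is set.
df.loc[:, "b"] = 2
print(df)
```

A lookup of this kind is reasonable for reads, but a `.loc` assignment needs to treat an unknown column label as a request to create that column rather than as an error.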

This is a duplicate of #3764. Please follow that issue for the fix.

@mvashishtha
Collaborator

Duplicate of #3764

mvashishtha marked this as a duplicate of #3764 on Feb 22, 2022
@iisri-vu
Author

iisri-vu commented Feb 22, 2022

Thank you.
Is there a nightly build of Modin?

@mvashishtha
Collaborator

No, there's no nightly build. You'll have to pull the latest source code from GitHub after the fix.

@devin-petersohn
Collaborator

devin-petersohn commented Feb 22, 2022

@mvashishtha that's not completely accurate. @iisri-vu You can pip install directly from the latest GitHub source by following the instructions here: https://modin.readthedocs.io/en/stable/getting_started/installation.html#installing-from-the-github-master-branch

iisri-vu changed the title from "Add multiple columns to dataframe: list out of range error" to "Add multiple columns to dataframe: Object type error." on Feb 22, 2022
@iisri-vu
Author

There's still a bug in the newer version. We can't create multiple columns or add columns using this method.

@mvashishtha
Collaborator

@iisri-vu yes, I'm sorry, but the bug #3764 is not fixed yet. I've only closed this issue so that we can track progress for both bugs in a single thread.
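
Until #3764 is fixed, one approach that might sidestep the error is to create the columns with plain assignment first, so that `.loc` only writes to labels that already exist. The sketch below uses plain pandas so that it runs anywhere. It has not been verified against Modin, and the assumption that `df[name] = value` is unaffected by the bug is not confirmed in this thread. The sample `userid` values and the names `target_id` and `new_values` are made up for illustration.

```python
import pandas as pd

# Illustrative data; the original report does not show how df was built.
df = pd.DataFrame({"userid": [10, 20, 30]})
target_id = 20
new_values = {"a": 1, "b": 2, "c": 5}

# Step 1: create the columns up front with plain assignment (NaN placeholders).
for name in new_values:
    df[name] = float("nan")

# Step 2: every label now exists, so .loc only overwrites existing cells.
mask = df["userid"] == target_id
df.loc[mask, list(new_values)] = list(new_values.values())

print(df)
```

The placeholder columns are float, so the assigned values are stored as 1.0, 2.0 and 5.0; rows that do not match the mask keep NaN.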

@iisri-vu
Author

Thanks for the fast response. This library will be a great one when we have these features in place.
