pd.dataframe.at won't work in modin #4111

Closed
cinnqi opened this issue Jan 29, 2022 · 5 comments
Labels
bug 🦗 Something isn't working

Comments

@cinnqi

cinnqi commented Jan 29, 2022

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux
  • Modin version (modin.__version__): 0.13
  • Python version: 3.8
  • Code we can use to reproduce:

Describe the problem

I am trying to insert an object into a single cell of a data frame by using pd.dataframe.at(index,'row', object).
It works in pandas but not in Modin.

Source code / logs

  File "compare_modin.py", line 98, in <module>
    df_record.loc[i,'mainsnaks'] = mainsnaks
  File "/home/chengxi/.local/lib/python3.8/site-packages/modin/pandas/indexing.py", line 742, in __setitem__
    super(_LocIndexer, self).__setitem__(
  File "/home/chengxi/.local/lib/python3.8/site-packages/modin/pandas/indexing.py", line 412, in __setitem__
    item = self._broadcast_item(row_lookup, col_lookup, item, to_shape)
  File "/home/chengxi/.local/lib/python3.8/site-packages/modin/pandas/indexing.py", line 475, in _broadcast_item
    raise ValueError(
ValueError: could not broadcast input array from shape (631,) into shape (1, 1)

@devin-petersohn
Collaborator

Hi @cinnqi thanks for reporting!

It looks like the error happens at loc instead of at (although they are quite similar). In this line:

df_record.loc[i,'mainsnaks'] = mainsnaks

What are i and mainsnaks? It looks like i is a row label (maybe an integer) and mainsnaks is a 1D Series or array. If you could provide something I can run locally to reproduce the error, that would really help us. Thanks again for reporting!
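
The shapes in the error message are consistent with that reading. Below is a minimal sketch using NumPy only, not Modin's actual code path: the 631-element list is a hypothetical stand-in for mainsnaks, and (1, 1) is the shape of the single target cell reported in the traceback. It only illustrates why a 1D value of length 631 cannot be broadcast into one cell.

```python
import numpy as np

# Hypothetical stand-in for mainsnaks: a list of 631 dict objects.
mainsnaks = [{"property": f"P{n}"} for n in range(631)]

# Read as an array, the list becomes 1-D with one element per dict.
as_array = np.array(mainsnaks, dtype=object)

# A single cell selected by a (row label, column label) pair has shape (1, 1).
target_shape = (1, 1)

try:
    np.broadcast_to(as_array, target_shape)
    broadcast_ok = True
except ValueError:
    # NumPy refuses: 631 elements cannot be stretched or shrunk into 1 x 1.
    broadcast_ok = False

print(as_array.shape, broadcast_ok)
```

If the whole list is meant to be stored as one object in the cell, the value has to be treated as a scalar rather than as an array, which is the distinction the error points at.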

@cinnqi
Author

cinnqi commented Jan 30, 2022

Yes, I tried both at and loc, and neither of them worked. i is indeed a row label, and mainsnaks is a list of objects in the format below:

  [
      {
        "snaktype":"value",
        "property":"P1064",
        "datavalue":{
          "value":{
            "entity-type":"item",
            "numeric-id":3319112,
            "id":"Q3319112"
          },
          "type":"wikibase-entityid"
        },
        "datatype":"wikibase-item"
      },
      {
        "snaktype":"value",
        "property":"P31",
        "datavalue":{
          "value":{
            "entity-type":"item",
            "numeric-id":728937,
            "id":"Q728937"
          },
          "type":"wikibase-entityid"
        },
        "datatype":"wikibase-item"
      },
      ........
    ]

I am trying to insert mainsnaks into one specified dataframe cell. Actually, I don't understand why it works in pandas.
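
For reference, here is a minimal sketch of the pandas side of this, under stated assumptions: the frame, its `id` column, and the two shortened entries are hypothetical, since the real df_record and the full list are not shown in this thread. It assumes the target column has object dtype, in which case `.at` addresses exactly one cell and can store the list as a single Python object. It uses plain pandas and says nothing about how Modin 0.13 handles the same call.

```python
import pandas as pd

# Hypothetical frame; the real df_record is not shown in the thread.
df_record = pd.DataFrame({"id": ["Q1", "Q2"]})

# Create the target column with object dtype so a cell can hold any Python object.
df_record["mainsnaks"] = pd.Series([None, None], dtype=object)

# Shortened, hypothetical version of the list described above.
mainsnaks = [
    {"snaktype": "value", "property": "P1064", "datatype": "wikibase-item"},
    {"snaktype": "value", "property": "P31", "datatype": "wikibase-item"},
]

i = 0  # row label
# .at targets a single (row label, column label) cell.
df_record.at[i, "mainsnaks"] = mainsnaks

# Read the cell back; it holds the list as one object.
stored = df_record.at[i, "mainsnaks"]
print(type(stored), len(stored))
```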

@mvashishtha
Collaborator

@cinnqi thanks for describing the data. Could you please also add the line of your code that causes the error?

@devin-petersohn I think this might be another case of someone trying to use loc to append to a dataframe, as in #3764.

@devin-petersohn
Collaborator

@mvashishtha Yes, it does appear to be related.

anmyachev added the "bug 🦗 Something isn't working" label on Apr 21, 2022
@mvashishtha
Collaborator

@cinnqi because you haven't posted any more information, I'm closing this issue. Hopefully we can solve your bug by fixing #3764. Please comment here if you need more help.
