[FEA] Support list types in read_json #8335

Closed
randerzander opened this issue May 24, 2021 · 1 comment
Labels: cuIO (cuIO issue), libcudf (affects libcudf C++/CUDA code)

randerzander commented May 24, 2021

test.json:

{"id": 0, "val": [{"sub_id":"a"},{"sub_id": "b"}]}
{"id": 1, "val": [{"sub_id":"c"},{"sub_id":"c"}]}

In pandas, val becomes a list of dicts:

import pandas as pd
fn = 'test.json'

pd.read_json(fn, lines=True)
   id                                 val
0   0  [{'sub_id': 'a'}, {'sub_id': 'b'}]
1   1  [{'sub_id': 'c'}, {'sub_id': 'c'}]
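
For reference, the expected shape can be checked without pandas. The following is a minimal sketch using only the standard library json module; the helper name read_json_lines and the inlined TEST_JSON string are illustrative and not part of the original report. It parses the same two records line by line and shows that each element of val is a dict.

```python
import io
import json

# The two records from test.json, inlined so the sketch needs no file on disk.
TEST_JSON = (
    '{"id": 0, "val": [{"sub_id":"a"},{"sub_id": "b"}]}\n'
    '{"id": 1, "val": [{"sub_id":"c"},{"sub_id":"c"}]}\n'
)


def read_json_lines(handle):
    """Parse one JSON document per non-empty line (hypothetical helper)."""
    return [json.loads(line) for line in handle if line.strip()]


rows = read_json_lines(io.StringIO(TEST_JSON))
for row in rows:
    # Print the id and the Python type of each element in the val list.
    print(row["id"], [type(item).__name__ for item in row["val"]])
# 0 ['dict', 'dict']
# 1 ['dict', 'dict']
```

This is the nesting a list-of-struct column would need to preserve: a list per row, with one struct per list element.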

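In cudf, the file will load, but the list of JSON objects is parsed incorrectly: each row appears to be split at the comma between list elements, so val holds only the first element and a spurious sub_id column holds the remainder.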

import cudf
fn = 'test.json'
cudf.read_json(fn, lines=True)
   id              val sub_id
0   0  [{"sub_id":"a"}   b"}]
1   1  [{"sub_id":"c"}   c"}]
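
The output above is consistent with a reader that splits each line at every comma without tracking bracket nesting. The sketch below is only an illustration of that symptom, written as an assumption; it is not cudf's actual parser, and naive_parse_line is a hypothetical name. It is deliberately wrong, to show how a nesting-unaware split produces the same three columns.

```python
def naive_parse_line(line):
    """Deliberately naive: split at every comma, ignoring [] and {} nesting.

    Hypothetical illustration only, not how cudf is implemented.
    """
    body = line.strip()[1:-1]  # drop the outermost { and }
    row = {}
    for field in body.split(","):
        key, _, value = field.partition(":")  # split at the first colon
        key = key.strip().strip('{"')         # strip braces and quotes from the key
        value = value.strip()
        if value.startswith('"'):
            value = value[1:]                 # drop a leading quote, if any
        if value.endswith('"'):
            value = value[:-1]                # drop a trailing quote, if any
        row[key] = value
    return row


line0 = '{"id": 0, "val": [{"sub_id":"a"},{"sub_id": "b"}]}'
print(naive_parse_line(line0))
# {'id': '0', 'val': '[{"sub_id":"a"}', 'sub_id': 'b"}]'}
```

A correct reader has to track bracket depth (or use a real JSON tokenizer) so that the comma inside the list is not treated as a field separator.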

randerzander added the libcudf and cuIO labels May 24, 2021
beckernick added this to the IO Data Type Expansion milestone Aug 4, 2021

vuule commented Oct 8, 2021

Closing, as #8827 is filed for the same feature.

vuule closed this as completed Oct 8, 2021