Fix JSON parsing in json to struct function #8204
Comments
@cindyyuanjiang @revans2 I think this issue can be closed now? When reading JSON, we ask cuDF for strings for all primitive types and then cast to the primitive type in the plugin. Relevant code is in |
If the tests pass, then I am good with closing this. |
Ok, so there are still differences in how we handle types between from_json and json scan, so this is still valid. I will work on making these consistent. |
I think we will need this cuDF feature before we can resolve this issue: |
I propose that we make @revans2 Does that sound reasonable to you? |
That sounds good to me. |
The main part of the proposed fix is already covered by #10542, and follow-on issues have been filed for the remaining problems related to this. |
Is your feature request related to a problem? Please describe.
This is a follow-up to the JSON-to-struct function in #8174. The current implementation has cuDF do name and type inference when parsing JSON, which causes problems with parsing and error handling. In particular, when parsing ints, if the value is too large or the format is not exactly correct, cuDF returns a partial value (a best effort to get something out), but Spark is strict and returns a null.
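To make the mismatch concrete, here is a minimal plain-Python sketch. Both functions are hypothetical stand-ins, not cuDF or Spark code: `lenient_parse_int32` only imitates a "best effort" parser (the leading-digits and wrap-around behaviour is an assumption chosen for illustration), and `strict_parse_int32` imitates a parser that returns null on anything malformed or out of range.

```python
import re
from typing import Optional

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def lenient_parse_int32(text: str) -> Optional[int]:
    """Hypothetical best-effort parser: keeps whatever leading digits it finds."""
    match = re.match(r"\s*([+-]?[0-9]+)", text)
    if match is None:
        return None
    value = int(match.group(1))
    # Assumption for illustration: out-of-range values wrap around to 32 bits.
    return (value - INT32_MIN) % 2**32 + INT32_MIN


def strict_parse_int32(text: str) -> Optional[int]:
    """Hypothetical strict parser: any malformed or out-of-range input is null."""
    if re.fullmatch(r"-?[0-9]+", text) is None:
        return None
    value = int(text)
    return value if INT32_MIN <= value <= INT32_MAX else None


for raw in ["42", "12abc", "4294967297"]:
    print(raw, lenient_parse_int32(raw), strict_parse_int32(raw))
```

For well-formed, in-range input the two agree; for `12abc` or an oversized value the lenient version still produces a number while the strict one produces `None`, which is the kind of difference described above.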
Describe the solution you'd like
"We have put in a lot of work to make our cast code do the right thing when going from a string to an int, so if we ask CUDF to return String values instead of int values and then cast them afterwards, we get the parsing that we want."
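As a rough illustration of this "read as strings, cast afterwards" idea, the sketch below uses only Python's standard `json` module. It is not the plugin's code: `RawNumber`, `strict_cast_to_int32`, and `from_json_ints` are hypothetical names, and the casting rules (including treating quoted strings as null) are simplifying assumptions, not Spark's actual rules.

```python
import json
import re
from typing import Dict, List, Optional

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


class RawNumber(str):
    """Marks text that came from an unquoted JSON number token."""


def strict_cast_to_int32(text: str) -> Optional[int]:
    # Hypothetical strict cast: null unless the text is an in-range integer.
    if re.fullmatch(r"-?[0-9]+", text) is None:
        return None
    value = int(text)
    return value if INT32_MIN <= value <= INT32_MAX else None


def from_json_ints(doc: str, fields: List[str]) -> Dict[str, Optional[int]]:
    """Step 1: parse with numbers kept as text. Step 2: cast each field strictly."""
    nulls: Dict[str, Optional[int]] = {name: None for name in fields}
    try:
        # parse_int / parse_float receive the raw token text, so nothing is
        # converted (or truncated) by the parser itself.
        row = json.loads(doc, parse_int=RawNumber, parse_float=RawNumber)
    except json.JSONDecodeError:
        return nulls
    if not isinstance(row, dict):
        return nulls
    result = dict(nulls)
    for name in fields:
        raw = row.get(name)
        # Assumption of this sketch: only unquoted number tokens are cast.
        if isinstance(raw, RawNumber):
            result[name] = strict_cast_to_int32(raw)
    return result


print(from_json_ints('{"a": 12, "b": 99999999999, "c": 1.5}', ["a", "b", "c", "d"]))
```

Keeping the parser and the cast separate means all of the strictness lives in one place (the cast), which is the design choice the quote argues for.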