[FEA] Support reading JSON numeric values as timestamps and dates #4940
Comments
@andygrove @jlowe @revans2 Help to check:
Details: Spark does not support reading CSV numeric UNIX timestamps as timestamps
Spark read JSON string int and int Spark code:
Different behavior for
cuDF reads as
0 indicates the epoch
@res-life I will take a look at this later today.
@res-life
Although it should be moved to its own section to make it more visible. We hope that once cuDF finishes work on the new JSON parser, we can get some help from them to indicate whether a field was quoted.
Removing from 22.08. Will put it into a release once the cudf dependencies are resolved.
It's not planned for 22.12, so let me unassign myself.
I have failed to find a case showing that Spark CSV can read numeric values as timestamps, or any related config to enable this behavior. @andygrove Do you have any concerns about removing the word 'CSV'?
I am going to remove the
Is your feature request related to a problem? Please describe.
PR #4938 adds support for reading CSV and JSON strings as timestamps. It handles valid timestamps in a number of formats consisting of year, month, day, hours, minutes, seconds, and so on. However, it does not support parsing numeric values (UNIX timestamps) representing elapsed time since the epoch.
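To make the distinction concrete, here is a minimal Python sketch of what "reading a numeric value as a timestamp" could mean. It is illustrative only and is not the plugin's or Spark's implementation. The helper name `numeric_to_timestamp` is hypothetical, and the sketch assumes the number is a count of seconds since the epoch, interpreted in UTC. It also treats quoted values as not numeric, which is the distinction raised in the comments about knowing whether a field was quoted.

```python
import json
from datetime import datetime, timezone


def numeric_to_timestamp(value):
    """Hypothetical helper: interpret an unquoted JSON number as a UNIX timestamp.

    Assumption: the number counts seconds since 1970-01-01T00:00:00Z.
    Strings (quoted values) and other types are not converted here.
    """
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return None
    return datetime.fromtimestamp(value, tz=timezone.utc)


# "unquoted" is a JSON number, "quoted" is a JSON string holding digits.
record = json.loads('{"unquoted": 0, "quoted": "0", "later": 86400}')
parsed = {key: numeric_to_timestamp(val) for key, val in record.items()}

for key, ts in parsed.items():
    print(key, ts)
```

With these assumptions, the unquoted `0` maps to the epoch itself, `86400` maps to one day later, and the quoted `"0"` is left unconverted.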
Describe the solution you'd like
Add support in GpuJsonScan for reading UNIX timestamps.
Describe alternatives you've considered
None
Additional context
See existing tests that reference this issue