[FEA] Implement more efficient parsing of dates and timestamp from JSON and CSV #4969

Open
andygrove opened this issue Mar 16, 2022 · 1 comment
Labels
P1 Nice to have for release performance A performance related task/issue

Comments

@andygrove
Contributor

Is your feature request related to a problem? Please describe.
PR #4938 makes timestamp parsing more compatible with Spark when reading from JSON and CSV, but it is very expensive: it has to apply regular expressions and also perform multiple passes over the data with different cuDF formats.

Describe the solution you'd like
We should implement a custom kernel instead, so that each string is validated and converted in a single pass.
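
To make the idea concrete, here is a rough CPU-side sketch in Python of the per-row logic such a kernel might run: one left-to-right scan, no regular expressions, and no second pass. It is an illustration only, not the plugin's or cuDF's implementation. The function names are hypothetical, and the accepted layout (`yyyy-MM-dd`, optionally followed by a space or `T` and `HH:mm:ss[.f]` with up to six fractional digits, always UTC) is an assumption that covers far fewer formats than Spark does.

```python
from typing import Optional

MICROS_PER_DAY = 86_400_000_000


def _fixed_int(s: str, pos: int, width: int) -> int:
    """Read exactly `width` ASCII digits at `pos`; return -1 on any failure."""
    if pos + width > len(s):
        return -1
    value = 0
    for ch in s[pos:pos + width]:
        if not ("0" <= ch <= "9"):
            return -1
        value = value * 10 + (ord(ch) - ord("0"))
    return value


def _days_in_month(year: int, month: int) -> int:
    if month == 2:
        leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
        return 29 if leap else 28
    return 30 if month in (4, 6, 9, 11) else 31


def _days_from_civil(y: int, m: int, d: int) -> int:
    """Days since 1970-01-01 (proleptic Gregorian), integer arithmetic only."""
    if m <= 2:
        y -= 1
    era = y // 400
    yoe = y - era * 400
    doy = (153 * (m - 3 if m > 2 else m + 9) + 2) // 5 + d - 1
    doe = yoe * 365 + yoe // 4 - yoe // 100 + doy
    return era * 146097 + doe - 719468


def parse_timestamp_micros(s: str) -> Optional[int]:
    """Return microseconds since the epoch, or None if `s` is not valid."""
    # Date part: yyyy-MM-dd
    year, month, day = _fixed_int(s, 0, 4), _fixed_int(s, 5, 2), _fixed_int(s, 8, 2)
    if min(year, month, day) < 0 or s[4] != "-" or s[7] != "-":
        return None
    if not 1 <= month <= 12 or not 1 <= day <= _days_in_month(year, month):
        return None
    micros = _days_from_civil(year, month, day) * MICROS_PER_DAY
    if len(s) == 10:
        return micros

    # Time part: separator, then HH:mm:ss
    hour, minute, second = _fixed_int(s, 11, 2), _fixed_int(s, 14, 2), _fixed_int(s, 17, 2)
    if min(hour, minute, second) < 0 or s[10] not in " T" or s[13] != ":" or s[16] != ":":
        return None
    if hour > 23 or minute > 59 or second > 59:
        return None
    micros += ((hour * 60 + minute) * 60 + second) * 1_000_000
    if len(s) == 19:
        return micros

    # Optional fraction: '.' followed by 1 to 6 digits, scaled to microseconds
    digits = len(s) - 20
    if s[19] != "." or not 1 <= digits <= 6:
        return None
    frac = _fixed_int(s, 20, digits)
    if frac < 0:
        return None
    return micros + frac * 10 ** (6 - digits)
```

Because each row is parsed independently and only touches its own characters, the same logic could in principle be assigned one string per GPU thread, returning null for rows that fail validation.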

Describe alternatives you've considered
None

Additional context
None

@andygrove andygrove added feature request New feature or request ? - Needs Triage Need team to review and classify labels Mar 16, 2022
@sameerz sameerz added performance A performance related task/issue and removed feature request New feature or request ? - Needs Triage Need team to review and classify labels Mar 22, 2022
@res-life
Collaborator

res-life commented Mar 23, 2022

I'm interested in this. If nobody else picks it up, let me try.

@mattahrens mattahrens added the P1 Nice to have for release label May 11, 2022
@revans2 revans2 mentioned this issue Oct 27, 2022
38 tasks