[FEA] look into CSV support for multi-line and ignore leading and trailing space #130

Open
Tracked by #2063
revans2 opened this issue Jun 9, 2020 · 1 comment
Labels
feature request New feature or request P2 Not required for release SQL part of the SQL/Dataframe plugin

Comments

@revans2
Collaborator

revans2 commented Jun 9, 2020

Is your feature request related to a problem? Please describe.
The CSV options for multi-line records and for ignoring leading and trailing white space are not well supported at the moment. It would be great to look into what it would take to support them properly, or at least better than we do right now.

@revans2 revans2 added feature request New feature or request ? - Needs Triage Need team to review and classify SQL part of the SQL/Dataframe plugin labels Jun 9, 2020
@sameerz sameerz added P2 Not required for release and removed ? - Needs Triage Need team to review and classify labels Aug 18, 2020
@revans2
Collaborator Author

revans2 commented Apr 1, 2021

The white space handling should probably be split off from multi-line support.

CUDF has inconsistent white space handling, but if we can clean it up and get it working properly for strings, then we could use the CSV parser to pull everything back as strings and then clean those strings up with code similar to what we use for casting.
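
To make the "read everything as strings, then clean up and cast" idea concrete, here is a rough sketch in plain Python using only the standard library. It is not CUDF or plugin code. The helper names (`read_all_as_strings`, `strict_int`, `clean_and_cast`) and the two boolean flags are made up for illustration, and the choice to return `None` for values that fail to cast is an assumption rather than a statement of how the plugin behaves.

```python
import csv
import io
import re


def read_all_as_strings(text):
    """Parse CSV text and return every field as an unmodified string."""
    return list(csv.reader(io.StringIO(text)))


def strict_int(field):
    """Cast to int, rejecting surrounding white space (unlike plain int())."""
    if not re.fullmatch(r"[+-]?\d+", field):
        raise ValueError(f"not an integer: {field!r}")
    return int(field)


def clean_and_cast(field, cast, ignore_leading=True, ignore_trailing=True):
    """Trim white space according to the flags, then cast.

    Returns None when the cast fails (an assumption made for this sketch).
    """
    if ignore_leading:
        field = field.lstrip(" \t")
    if ignore_trailing:
        field = field.rstrip(" \t")
    try:
        return cast(field)
    except ValueError:
        return None


# Step 1: the parser only splits fields; white space is left untouched.
rows = read_all_as_strings("id,name\n 1 , alice \n2,bob\n x ,carol\n")
header, data = rows[0], rows[1:]

# Step 2: white space handling and casting happen afterwards, per column.
ids = [clean_and_cast(r[0], strict_int) for r in data]
names = [clean_and_cast(r[1], str) for r in data]

# With both flags off, the padded value no longer parses as an integer.
untrimmed = clean_and_cast(" 1 ", strict_int, ignore_leading=False, ignore_trailing=False)
```

The point of the split is that the parser never has to decide how white space interacts with each type. It only has to return strings faithfully, and the trimming rules live in one place alongside the cast.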

@revans2 revans2 mentioned this issue Apr 1, 2021
38 tasks
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this issue Nov 30, 2023
[auto-merge] bot-auto-merge-branch-22.04 to branch-22.06 [skip ci] [bot]