[FEA] Verify unescapedQuoteHandling with CSV reader #1524

Open
mythrocks opened this issue Jan 14, 2021 · 1 comment
Labels
audit_3.1.0 feature request New feature or request

Comments

@mythrocks
Collaborator

This arises from the audit of apache/spark@433ae9064f.

Spark 3.1 has changed the behaviour of the CSV reader. When it encounters an unescaped quote inside a value, it now decides how to continue parsing (for example, whether to stop at the delimiter) based on the value of the unescapedQuoteHandling option.

spark-rapids needs to ensure that reading CSV tables through the plugin will honour the settings for unescapedQuoteHandling.

More info in the JIRA: https://issues.apache.org/jira/browse/SPARK-33566
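A minimal sketch of how such a check might look, assuming a PySpark session that already has the plugin loaded. The sample rows and the helper names (`write_sample_csv`, `read_rows`, `compare_modes`) are hypothetical and not part of spark-rapids. The mode names are the values Spark documents for `unescapedQuoteHandling`, and `spark.rapids.sql.enabled` is used here to switch between the CPU and GPU paths.

```python
import os
import tempfile

# Values Spark 3.1 documents for the CSV option `unescapedQuoteHandling`.
UNESCAPED_QUOTE_MODES = [
    "STOP_AT_CLOSING_QUOTE",
    "BACK_TO_DELIMITER",
    "STOP_AT_DELIMITER",  # Spark's documented default
    "SKIP_VALUE",
    "RAISE_ERROR",
]

# Hypothetical sample data: rows 2 and 3 contain unescaped quotes.
SAMPLE_LINES = [
    'id,text,tail',
    '1,plain,x',
    '2,"bad"quote",y',
    '3,c"d,z',
]


def write_sample_csv(directory):
    """Write the sample rows to a CSV file and return its path."""
    path = os.path.join(directory, "unescaped_quotes.csv")
    with open(path, "w", newline="") as f:
        f.write("\n".join(SAMPLE_LINES) + "\n")
    return path


def read_rows(spark, path, mode, use_gpu):
    """Read the file with the given mode; return rows or the error's class name."""
    # Assumption: toggling this config at runtime selects the CPU or GPU path.
    spark.conf.set("spark.rapids.sql.enabled", str(use_gpu).lower())
    try:
        df = (spark.read
              .option("header", "true")
              .option("unescapedQuoteHandling", mode)
              .csv(path))
        return [tuple(row) for row in df.collect()]
    except Exception as exc:  # e.g. a parse failure under RAISE_ERROR
        return type(exc).__name__


def compare_modes(spark, path):
    """Return {mode: (cpu_result, gpu_result)} for every mode that differs."""
    mismatches = {}
    for mode in UNESCAPED_QUOTE_MODES:
        cpu = read_rows(spark, path, mode, use_gpu=False)
        gpu = read_rows(spark, path, mode, use_gpu=True)
        if cpu != gpu:
            mismatches[mode] = (cpu, gpu)
    return mismatches
```

With a suitable session, `compare_modes(spark, write_sample_csv(tempfile.mkdtemp()))` would return an empty dict if the plugin matches Spark for every mode. Any entries it returns are candidates for either a fix or a documented incompatibility.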

@mythrocks mythrocks added feature request New feature or request ? - Needs Triage Need team to review and classify labels Jan 14, 2021
@revans2
Collaborator

revans2 commented Jan 15, 2021

Just for information, our CSV reader does not really match Spark's all that closely. We should test it, but we might just end up documenting an incompatibility.

@sameerz sameerz added audit_3.1.0 and removed ? - Needs Triage Need team to review and classify labels Jan 26, 2021
@revans2 revans2 mentioned this issue Apr 1, 2021
38 tasks
tgravescs pushed a commit to tgravescs/spark-rapids that referenced this issue Nov 30, 2023