[FEA] Verify unescapedQuoteHandling with CSV reader #1524

mythrocks · 2021-01-14T20:06:06Z

This arises from the audit of apache/spark@433ae9064f.

Spark 3.1 has changed the behaviour of the CSV reader. It now decides whether to stop parsing at the delimiter based on the value of unescapedQuoteHandling.

spark-rapids needs to ensure that reading CSV tables through the plugin will honour the settings for unescapedQuoteHandling.

More info in the JIRA: https://issues.apache.org/jira/browse/SPARK-33566
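A minimal sketch of how such a check might look from PySpark, reading the same file once per mode with the plugin disabled and then enabled. The five mode names are the values Spark documents for `unescapedQuoteHandling`; the sample data, the helper names, and the assumption that the session was started with the RAPIDS plugin loaded (so that `spark.rapids.sql.enabled` can be toggled at runtime) are illustrative and not taken from this issue.

```python
import os

# Values accepted by the unescapedQuoteHandling CSV option in Spark 3.1
# (Spark documents STOP_AT_DELIMITER as the default).
UNESCAPED_QUOTE_MODES = [
    "STOP_AT_CLOSING_QUOTE",
    "BACK_TO_DELIMITER",
    "STOP_AT_DELIMITER",
    "SKIP_VALUE",
    "RAISE_ERROR",
]

# Hypothetical sample: the quoted field in row 1 contains an unescaped quote.
SAMPLE_CSV = 'id,text\n1,"a"b,c"\n2,plain\n'


def write_sample_csv(directory):
    """Write the sample file into `directory` and return its path."""
    path = os.path.join(directory, "unescaped_quotes.csv")
    with open(path, "w", newline="") as f:
        f.write(SAMPLE_CSV)
    return path


def read_per_mode(spark, path):
    """Read `path` once per mode; record the rows or the error type."""
    results = {}
    for mode in UNESCAPED_QUOTE_MODES:
        try:
            rows = (spark.read
                    .option("header", "true")
                    .option("unescapedQuoteHandling", mode)
                    .csv(path)
                    .collect())
            results[mode] = [tuple(r) for r in rows]
        except Exception as e:  # e.g. RAISE_ERROR may surface as a failure
            results[mode] = "error: " + type(e).__name__
    return results


def compare_cpu_gpu(spark, path):
    """Return the modes whose results differ with the plugin off vs. on.

    Assumes `spark` is a session with the RAPIDS plugin already loaded.
    """
    spark.conf.set("spark.rapids.sql.enabled", "false")
    cpu = read_per_mode(spark, path)
    spark.conf.set("spark.rapids.sql.enabled", "true")
    gpu = read_per_mode(spark, path)
    return {m: (cpu[m], gpu[m])
            for m in UNESCAPED_QUOTE_MODES if cpu[m] != gpu[m]}
```

An empty result from `compare_cpu_gpu` would suggest the plugin matches Spark for this input; any entry in it names a mode that would need either a fix or a documented incompatibility.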


revans2 · 2021-01-15T17:10:19Z

Just for information, our CSV does not really match Spark's all that closely. We should test it, but we might just end up documenting an incompatibility.


mythrocks added the "feature request" and "? - Needs Triage" labels · Jan 14, 2021

sameerz added the "audit_3.1.0" label and removed the "? - Needs Triage" label · Jan 26, 2021

revans2 mentioned this issue Apr 1, 2021

[BUG] Fix CSV Parsing #2063

Open

38 tasks

tgravescs pushed a commit to tgravescs/spark-rapids that referenced this issue Nov 30, 2023

Update submodule cudf to a2abdb1a9e4d6737bfcab85874589057afdbae6e (NV…

430b2e7

…IDIA#1524) Signed-off-by: spark-rapids automation <[email protected]>
