[BUG] fall back to CPU if columnNameOfCorruptRecord is in the CSV schema #2065
Labels
bug
Something isn't working
P0
Must have for release
reliability
Features to improve reliability or bugs that severely impact the reliability of the plugin
Describe the bug
Spark has an option for dealing with bad records while parsing. It is somewhat convoluted: you configure the name of a column that will hold corrupt data, and if Spark then sees that column name appear in the schema of the CSV data being read, it places any record it considers corrupt into that string column. I don't see a lot of value in having our code support this, but we should fall back to the CPU if we see it.
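A minimal sketch of the proposed check, in plain Python rather than the plugin's Scala. The helper name and its signature are hypothetical; the configuration names are Spark's real ones: the per-read `columnNameOfCorruptRecord` option overrides the session conf `spark.sql.columnNameOfCorruptRecord`, which defaults to `_corrupt_record`.

```python
# Hypothetical helper illustrating the fallback condition: the GPU read
# should be rejected when the configured corrupt-record column name
# appears in the user-supplied CSV schema.

DEFAULT_CORRUPT_RECORD_COL = "_corrupt_record"  # Spark's default

def should_fallback_to_cpu(schema_field_names, read_options, sql_conf):
    # Spark resolves the column name from the per-read option first,
    # then from spark.sql.columnNameOfCorruptRecord, then the default.
    corrupt_col = read_options.get(
        "columnNameOfCorruptRecord",
        sql_conf.get("spark.sql.columnNameOfCorruptRecord",
                     DEFAULT_CORRUPT_RECORD_COL),
    )
    return corrupt_col in schema_field_names
```

With no options set, a schema containing `_corrupt_record` would trigger the fallback, while an ordinary schema would stay on the GPU.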