[fix](csv reader) fix csv parser incorrect if enclosing line_delimiter (#38347)

The CSV reader parses data incorrectly when an enclosed field contains
the line_delimiter. For example, when line_delimiter is \n and enclose is ', the following data:
```
'aaaaaaaaaaaa
bbbb'
```
is parsed as two columns, `'aaaaaaaaaaaa` and `bbbb'`, rather
than as one column:
```
'aaaaaaaaaaaa
bbbb'
```
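
To make the expected behavior concrete, here is a minimal, self-contained sketch. It is not Doris code: `split_records` is a hypothetical helper that assumes a single-character line delimiter and enclose character and does not handle escapes. It only shows that a line delimiter between two enclose characters should be kept as data rather than end the record.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of Doris): split `data` into records on
// `line_delimiter`, treating delimiters inside an enclosed field as data.
std::vector<std::string> split_records(const std::string& data, char line_delimiter,
                                       char enclose) {
    std::vector<std::string> records;
    std::string current;
    bool in_enclose = false;
    for (char c : data) {
        if (c == enclose) {
            // Entering or leaving an enclosed field; keep the enclose char.
            in_enclose = !in_enclose;
            current.push_back(c);
        } else if (c == line_delimiter && !in_enclose) {
            // A delimiter outside any enclosure ends the current record.
            records.push_back(current);
            current.clear();
        } else {
            current.push_back(c);
        }
    }
    if (!current.empty()) {
        records.push_back(current);
    }
    return records;
}
```

Under these assumptions, calling `split_records("'aaaaaaaaaaaa\nbbbb'\n", '\n', '\'')` yields a single record that still contains the embedded `\n`.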

This happens because the CSV reader does not reset the result when the
closing enclose is not matched within the current `output_buf_read`, so
the line is truncated at the wrong position.
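
The failure mode described above can be sketched with a simplified model. This is an illustration under stated assumptions, not the Doris implementation: `find_record_end`, `kNeedMoreData`, and the `reset_on_unclosed` flag are hypothetical names, the delimiter and enclose are single characters, and a `std::string::npos` sentinel stands in for the `nullptr` result. The scan keeps a tentative result pointing at the next line delimiter; if the buffer runs out while still inside an enclosure, that tentative result is stale unless it is reset.

```cpp
#include <cstddef>
#include <string>

// Sentinel meaning "no complete record in this buffer yet; read more data".
// It plays the role that a nullptr result plays in the commit.
const std::size_t kNeedMoreData = std::string::npos;

// Simplified model (hypothetical, not Doris code). Returns the offset of the
// line delimiter that ends the first record in `buf`, or kNeedMoreData.
// `reset_on_unclosed` toggles the reset that this commit adds.
std::size_t find_record_end(const std::string& buf, char line_delimiter, char enclose,
                            bool reset_on_unclosed) {
    std::size_t idx = 0;
    bool in_enclose = false;
    // Tentative result: offset of the next line delimiter at or after idx.
    std::size_t result = buf.find(line_delimiter, idx);
    std::size_t bound = (result == kNeedMoreData) ? buf.size() : result + 1;
    while (true) {
        for (; idx < bound; ++idx) {
            if (buf[idx] == enclose) in_enclose = !in_enclose;
        }
        if (!in_enclose) {
            // The delimiter (if any) lies outside every enclosure.
            return result;
        }
        if (idx != buf.size()) {
            // The delimiter was enclosed; move the bound to the next one.
            result = buf.find(line_delimiter, idx);
            bound = (result == kNeedMoreData) ? buf.size() : result + 1;
        } else {
            // Buffer exhausted inside an enclosure. Without the reset, the
            // stale tentative result would be reported as a line end.
            if (reset_on_unclosed) result = kNeedMoreData;
            return result;
        }
    }
}
```

In this model, a buffer holding only `'aaaaaaaaaaaa\n` returns offset 13 when `reset_on_unclosed` is false (the incorrect truncation) and `kNeedMoreData` when it is true, so the caller reads more data before deciding where the line ends.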

Co-authored-by: Xin Liao <[email protected]>
2 people authored and dataroaring committed Jul 29, 2024
1 parent a63832c commit 32102f7
Showing 1 changed file with 5 additions and 0 deletions.
@@ -160,6 +160,11 @@ void EncloseCsvLineReaderContext::_on_pre_match_enclose(const uint8_t* start, si
         if (_idx != _total_len) {
             len = update_reading_bound(start);
         } else {
+            // It needs to set the result to nullptr for matching enclose may not be read
+            // after reading the output buf.
+            // Therefore, if the result is not set to nullptr,
+            // the parser will consider reading a line as there is a line delimiter.
+            _result = nullptr;
             break;
         }
     } while (true);