qsv fmt --ascii should not use " as the quote character #1074

Closed
LemmingAvalanche opened this issue Jun 27, 2023 · 0 comments · Fixed by #1075

Is your feature request related to a problem? Please describe.

qsv fmt --ascii uses the ASCII unit separator (U+001F) as the
field delimiter and the ASCII record separator (U+001E) as the record
terminator. The point is to use metacharacters that don’t appear in
regular ASCII text: these characters belong neither to the printable
character subset nor to the whitespace control codes.
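
To make the "neither printable nor whitespace" claim concrete, here is a small Python sketch (illustrative only, not part of qsv). The helper name `is_regular_text_char` and the constants are made up for this example; it relies on `string.printable`, which already includes the ASCII whitespace characters.

```python
import string

US = "\x1f"   # ASCII unit separator
RS = "\x1e"   # ASCII record separator
SUB = "\x1a"  # ASCII substitute

def is_regular_text_char(ch):
    """True if ch is printable ASCII or ASCII whitespace (space, \\t, \\n, \\r, \\v, \\f).

    string.printable = digits + letters + punctuation + whitespace.
    str.isspace() is deliberately avoided: Python treats U+001C..U+001F
    as whitespace, which is not the sense meant here.
    """
    return ch in string.printable

for name, ch in [("US", US), ("RS", RS), ("SUB", SUB), ("comma", ","), ("quote", '"')]:
    print(name, is_regular_text_char(ch))
```

The three control codes fall outside the set, while `,` and `"` fall inside it.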

But it still uses " as the quote character. That means that any field
containing a " must itself be quoted (with the embedded quotes doubled)
using this printable character.

Example CSV:

"he""ll""o","world"
"goodbye","world"

Formatting that with --ascii (output shown with the escape codes \x1f and \x1e in place of the raw unit and record separators):

$ cargo run --features="feature_capable" --bin=qsv fmt --ascii test-ascii.txt
"he""ll""o"\x1fworld\x1egoodbye\x1fworld\x1e

So you’ve avoided the \n and , metacharacters. But the regular-text
metacharacter " is still there.
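
The same quoting behaviour can be reproduced outside qsv. The sketch below uses Python's standard `csv` module as a stand-in (an assumption for illustration; qsv itself is written in Rust and this is not its code), and the function name `format_ascii` is hypothetical.

```python
import csv
import io

def format_ascii(rows, quotechar='"'):
    """Write rows with US (0x1F) between fields and RS (0x1E) after each record."""
    buf = io.StringIO(newline="")
    writer = csv.writer(
        buf,
        delimiter="\x1f",
        lineterminator="\x1e",
        quotechar=quotechar,
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that contain a metacharacter
    )
    writer.writerows(rows)
    return buf.getvalue()

rows = [['he"ll"o', "world"], ["goodbye", "world"]]
out = format_ascii(rows)
# The first field contains the quote character, so it is quoted and its
# embedded quotes are doubled; the other fields are written as-is.
print(repr(out))
```

Only the field that contains `"` ends up quoted, which matches the output shown above.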

Without " as the quote character you would be able to use any kind of
printable/whitespace Unicode text without having to quote any of the data.

Describe the solution you'd like

Use another non-whitespace control code as the quote character. For
example “Substitute” (U+001A).

Describe alternatives you've considered

You can already achieve this by adding --quote=$'\x1A' (Bash ANSI-C quoting, which passes the raw U+001A byte):

cargo run --features="feature_capable" --bin=qsv fmt --ascii --quote=$'\x1A' test-ascii.txt
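
As a rough illustration of what the workaround buys you, the following Python sketch (standard `csv` module as a stand-in, not qsv's implementation; `write_ascii` and `read_ascii_unquoted` are hypothetical names) writes the same rows with U+001A as the quote character and reads them back with plain string splitting.

```python
import csv
import io

US, RS, SUB = "\x1f", "\x1e", "\x1a"

def write_ascii(rows, quotechar=SUB):
    """Write rows with US between fields, RS after records, SUB as quote char."""
    buf = io.StringIO(newline="")
    csv.writer(
        buf,
        delimiter=US,
        lineterminator=RS,
        quotechar=quotechar,
        quoting=csv.QUOTE_MINIMAL,
    ).writerows(rows)
    return buf.getvalue()

def read_ascii_unquoted(text):
    """Split on RS and US. Only valid when no field needed quoting,
    i.e. when the data contains none of US, RS, or SUB."""
    if text.endswith(RS):
        text = text[:-1]
    return [record.split(US) for record in text.split(RS)]

rows = [['he"ll"o', "world"], ["goodbye", "world"]]
encoded = write_ascii(rows)
print(repr(encoded))
print(read_ascii_unquoted(encoded) == rows)
```

Because none of the three metacharacters occurs in ordinary text, the `"` characters pass through untouched and a reader needs no quote handling for such data.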

Additional context

_

LemmingAvalanche added a commit to LemmingAvalanche/qsv that referenced this issue Jun 27, 2023
`qsv fmt --ascii` uses the ASCII unit separator (U+001F) as the
record *delimiter* and the ASCII record separator (U+001E) as the record
*terminator*. The point is to use metacharacters that don’t appear in
regular ASCII text; these characters are neither part of the printable
character subset, nor the whitespace control codes.

But it still uses `"` as the quote character. That means that data that
contains quotes will need to be quoted using this printable character.

All metacharacters (delimiter, terminator, and quote) being non-whitespace
control codes would enable you to write out any regular text (printable
character or whitespace) without “quotations”.

Use the control code ASCII “Substitute” (U+001A) as the quote character.

Fixes: dathere#1074