Not all escape character sequences are highlighted #4

geekley · 2022-04-13T23:37:23Z

Reporting upstream from: MrOrz/vscode-gettext#25

Please highlight escape sequences other than \" in strings as constant.character.escape.po as well.

https://github.com/textmate/gettext.tmbundle/blob/master/Syntaxes/Gettext.tmLanguage#L108
https://github.com/textmate/gettext.tmbundle/blob/master/Syntaxes/Gettext.tmLanguage#L159
https://github.com/textmate/gettext.tmbundle/blob/master/Syntaxes/Gettext.tmLanguage#L210

At minimum, the most important one (\n) should be highlighted, but the spec says strings follow the C syntax, so ideally every C escape sequence would be covered.
Escape sequences can be misleading, as the \x0False example below shows, so it's important that they're represented accurately.

PO spec from: https://www.gnu.org/software/gettext/manual/html_node/PO-Files.html

Each of untranslated-string and translated-string respects the C syntax for a character string, including the surrounding quotes and embedded backslashed escape sequences.

As listed by: https://en.cppreference.com/w/c/language/escape
also https://docs.microsoft.com/en-us/cpp/c-language/escape-sequences?view=msvc-170#escape-sequences-1
literal escapes: \' \" \? \\ \a \b \f \n \r \t \v
octal regex: \\[0-7]{1,3} e.g.: \0 \77 \123
hex regex: \\x[0-9A-Fa-f]+ e.g.: \x0 \xabc. Note that "\x0False" parses as "\x0Fa" + "lse", not "\x0" + "False".
unicode regex (since C99): \\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8} e.g.: \u000a \U0000000D
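
To make the list above concrete, here is a rough Python sketch that combines those regexes into one pattern and shows what it would match. The names `C_ESCAPE` and `find_escapes` are made up for illustration, and this is not the grammar's actual rule. It only demonstrates the matching behaviour, including the misleading `\x0False` case.

```python
import re

# Hypothetical combined pattern assembled from the regexes listed above.
# re.VERBOSE allows the whitespace and comments inside the pattern.
C_ESCAPE = re.compile(
    r"""\\(?:
        ['"?\\abfnrtv]        # literal (simple) escapes
      | [0-7]{1,3}            # octal: one to three octal digits
      | x[0-9A-Fa-f]+         # hex: greedy, takes every hex digit that follows
      | u[0-9A-Fa-f]{4}       # lowercase u plus exactly 4 hex digits
      | U[0-9A-Fa-f]{8}       # uppercase U plus exactly 8 hex digits
    )""",
    re.VERBOSE,
)


def find_escapes(text: str) -> list[str]:
    """Return every C-style escape sequence found in text, in order."""
    return [m.group(0) for m in C_ESCAPE.finditer(text)]


if __name__ == "__main__":
    # The hex escape swallows "Fa", leaving only "lse" as plain text.
    print(find_escapes(r"\x0False"))
    print(find_escapes(r"a\nb\"c\77 \u000a"))
```

Because the hex alternative is greedy, a highlighter built on a pattern like this would colour `\x0Fa` as the escape, which is exactly why accurate highlighting helps here.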

Also, a single \ can appear at the end of a line to form a line-continuation, even inside string literals.
https://en.cppreference.com/w/c/language/translation_phases

Whenever backslash appears at the end of a line (immediately followed by the newline character), both backslash and newline are deleted, combining two physical source lines into one logical source line. This is a single-pass operation: a line ending in two backslashes followed by an empty line does not combine three lines [sic] into one.

https://docs.microsoft.com/en-us/cpp/c-language/c-string-literals?view=msvc-170#remarks

When a backslash appears at the end of a line, it is always interpreted as a line-continuation character.
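
As a rough illustration of the C rule quoted above, the following Python sketch deletes each backslash-newline pair in a single pass. The function name `splice_lines` is made up for this example, and it models C translation phase 2 only. Whether gettext tools treat PO files the same way is an untested assumption.

```python
def splice_lines(source: str) -> str:
    """Delete each backslash that is immediately followed by a newline,
    together with that newline (a sketch of C line splicing).

    str.replace scans left to right without re-examining its own output,
    so this is a single pass, as the C rule describes.
    """
    return source.replace("\\\n", "")


if __name__ == "__main__":
    # Two physical lines become one logical line.
    print(repr(splice_lines('"abc\\\ndef"')))
    # Two backslashes followed by an empty line: only one splice happens.
    print(repr(splice_lines("a\\\\\n\nb")))
```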

Btw, I haven't tested any of these in actual gettext tools, I'm just making assumptions based on the spec. I can't confirm whether these actually work (e.g. \ at end of line or the unicode \u \U sequences, or anything else). Testing to confirm would be better.
