Fix escaping of - in ms writer.
In 5132f1e we added `-` to
the list of characters needing backslash escaping, to accommodate
a change in groff man's behavior, described here:
https://lwn.net/Articles/947941/

This change also caused `-` to be escaped in ms output, but that
is wrong; `\-` in ms is a Unicode minus sign.

To fix this, we add a Boolean parameter to `escapeString` in
Text.Pandoc.Writers.Roff that determines whether `-` is to
be escaped.  (NB: This is not an exported function in the API.)
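
The effect of the new parameter can be sketched in a few lines. This is a simplified illustration, not the code in Text.Pandoc.Writers.Roff: it works on `String` rather than `Text`, the names `escapeSketch`, `sketchEscapes`, `manEscape` and `msEscape` are hypothetical, and the escape table holds only three of the entries from `standardEscapes`.

```haskell
-- Simplified sketch; NOT pandoc's actual implementation.
-- All names here are hypothetical and chosen for illustration only.

-- A tiny stand-in for the standardEscapes table (three entries only).
sketchEscapes :: [(Char, String)]
sketchEscapes =
  [ ('\\', "\\[rs]")
  , ('~', "\\[ti]")
  , ('^', "\\[ha]")
  ]

-- | Escape a string for roff. If the first argument is True, a hyphen
-- is written as \- (wanted for man); if False, it is left unescaped
-- (wanted for ms).
escapeSketch :: Bool -> String -> String
escapeSketch escapeHyphen = concatMap escapeChar
  where
    -- The guard fails when escapeHyphen is False, so '-' falls
    -- through to the general table lookup below and stays as-is.
    escapeChar '-' | escapeHyphen = "\\-"
    escapeChar c = maybe [c] id (lookup c sketchEscapes)

-- The two writers then differ only in the flag they pass.
manEscape, msEscape :: String -> String
manEscape = escapeSketch True
msEscape = escapeSketch False
```

With this sketch, `manEscape "e-mail"` yields `e\-mail` while `msEscape "e-mail"` returns the input unchanged; other escapes, such as `~` to `\[ti]`, apply in both modes.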

Closes #10536.
jgm committed Jan 14, 2025
1 parent 0a2e8dd commit c9460c8
Showing 5 changed files with 21 additions and 17 deletions.
1 change: 0 additions & 1 deletion src/Text/Pandoc/RoffChar.hs
@@ -34,7 +34,6 @@ standardEscapes =
   , ('`', "\\[ga]")
   , ('^', "\\[ha]")
   , ('~', "\\[ti]")
-  , ('-', "\\-")
   , ('\\', "\\[rs]")
   , ('@', "\\[at]") -- because we use @ as a table and math delimiter
   , ('\x2026', "\\&...") -- because u2026 doesn't render on tty
6 changes: 3 additions & 3 deletions src/Text/Pandoc/Writers/Man.hs
@@ -86,9 +86,9 @@ pandocToMan opts (Pandoc meta blocks) = do
         Just tpl -> renderTemplate tpl context

 escString :: WriterOptions -> Text -> Text
-escString opts = escapeString (if writerPreferAscii opts
-                                  then AsciiOnly
-                                  else AllowUTF8)
+escString opts = escapeString True (if writerPreferAscii opts
+                                       then AsciiOnly
+                                       else AllowUTF8)

 -- | Return man representation of notes.
 notesToMan :: PandocMonad m => WriterOptions -> [[Block]] -> StateT WriterState m (Doc Text)
2 changes: 1 addition & 1 deletion src/Text/Pandoc/Writers/Ms.hs
@@ -92,7 +92,7 @@ pandocToMs opts (Pandoc meta blocks) = do

 escapeStr :: WriterOptions -> Text -> Text
 escapeStr opts =
-  escapeString (if writerPreferAscii opts then AsciiOnly else AllowUTF8)
+  escapeString False (if writerPreferAscii opts then AsciiOnly else AllowUTF8)

 -- In PDFs we need to escape parentheses and backslash.
 -- In PDF we need to encode as UTF-16 BE.
11 changes: 8 additions & 3 deletions src/Text/Pandoc/Writers/Roff.hs
@@ -74,13 +74,18 @@ combiningAccentsMap = Map.fromList combiningAccents
 essentialEscapes :: Map.Map Char Text
 essentialEscapes = Map.fromList standardEscapes

--- | Escape special characters for roff.
-escapeString :: EscapeMode -> Text -> Text
-escapeString e = Text.concat . escapeString' e . Text.unpack
+-- | Escape special characters for roff. If the first parameter is
+-- True, escape @-@ as @\-@, as required by current versions of groff man;
+-- otherwise leave it unescaped, as needed for ms.
+escapeString :: Bool -> EscapeMode -> Text -> Text
+escapeString escapeHyphen e = Text.concat . escapeString' e . Text.unpack
   where
     escapeString' _ [] = []
     escapeString' escapeMode ('\n':'.':xs) =
       "\n\\&." : escapeString' escapeMode xs
+    -- see #10533; we need to escape hyphens as \- in man but not in ms:
+    escapeString' escapeMode ('-':xs) | escapeHyphen =
+      "\\-" : escapeString' escapeMode xs
     escapeString' escapeMode (x:xs) =
       case Map.lookup x essentialEscapes of
         Just s -> s : escapeString' escapeMode xs
18 changes: 9 additions & 9 deletions test/writer.ms
@@ -134,7 +134,7 @@ Here\[cq]s a regular paragraph.
 In Markdown 1.0.0 and earlier.
 Version 8.
 This line turns into a list item.
-Because a hard\-wrapped line in the middle of a paragraph looked like a list
+Because a hard-wrapped line in the middle of a paragraph looked like a list
 item.
 .PP
 Here\[cq]s one with a bullet.
@@ -149,7 +149,7 @@ Block Quotes
 .pdfhref O 1 "Block Quotes"
 .pdfhref M "block-quotes"
 .LP
-E\-mail style:
+E-mail style:
 .QS
 .LP
 This is a block quote.
@@ -197,7 +197,7 @@ Code:
 .IP
 .nf
 \f[C]
-\-\-\-\- (should be four hyphens)
+---- (should be four hyphens)

 sub status {
 print \[dq]working\[dq];
@@ -605,7 +605,7 @@ Code block:
 .IP
 .nf
 \f[C]
-<!\-\- Comment \-\->
+<!-- Comment -->
 \f[]
 .fi
 .LP
@@ -695,7 +695,7 @@ LaTeX
 .IP \[bu] 3
 @223@
 .IP \[bu] 3
-@p@\-Tree
+@p@-Tree
 .IP \[bu] 3
 Here\[cq]s some display math:
 .EQ
@@ -765,7 +765,7 @@ Left paren: (
 .PP
 Right paren: )
 .PP
-Greater\-than: >
+Greater-than: >
 .PP
 Hash: #
 .PP
@@ -775,7 +775,7 @@ Bang: !
 .PP
 Plus: +
 .PP
-Minus: \-
+Minus: -
 .HLINE
 .SH 1
 Links
@@ -925,7 +925,7 @@ In a list?
 .IP \[bu] 3
 It should.
 .LP
-An e\-mail address: \c
+An e-mail address: \c
 .pdfhref W -D "mailto:nobody%40nowhere.net" -A "\c" \
 -- "nobody\[at]nowhere.net"
 \&
@@ -937,7 +937,7 @@ Blockquoted: \c
 \&
 .QE
 .LP
-Auto\-links should not occur here: \f[CR]<http://example.com/>\f[R]
+Auto-links should not occur here: \f[CR]<http://example.com/>\f[R]
 .IP
 .nf
 \f[C]
