Org parser fails with "/italics in quotes/" #2513

Closed
conklech opened this issue Nov 12, 2015 · 3 comments · Fixed by #2525

Comments

@conklech
Contributor

The following text is valid Org markup: "/text/". It represents a quotation mark in roman, the word "text" in italics, and a quotation mark in roman. Pandoc parses that literal string correctly on its own, but fails to parse it inside surrounding text: the text X "/text/" X, or even just the quoted word surrounded by spaces, produces output with slashes rather than italics.

Let me know if this report isn't sufficiently clear. I may take some time to poke at the parser to see if there's a simple fix. Org-mode's own pretty-printing parser is kind of touchy with markup and special characters; for example the text /"text"/, i.e. with the quotation marks italicized, is apparently not valid Org markup.

@tarleb
Collaborator

tarleb commented Nov 12, 2015

Pandoc tries to mostly follow Emacs' Org-Mode parser in what it recognizes as markup. As you noted, this has its downsides: Org-Mode is a little weird in that regard.

As to the italics in quotes issue: I'll need a few more details, as I wasn't able to reproduce this yet. Do you have some example code I could use? Also, just to be on the safe side: did you make sure that pandoc knows that the input file is in Org format (e.g. by specifying --from org on the command line)?

@conklech
Contributor Author

Ah, I missed a necessary condition. This only happens with the --smart flag. I don't have a library build of pandoc handy, so here are some command-line test cases:

$ pandoc --version
pandoc 1.15.1

$ echo "\"/test/\"" | pandoc --from org --to native --smart
[Para [Quoted DoubleQuote [Emph [Str "test"]]]]

$ echo "X\"/test/\"" | pandoc --from org --to native --smart
[Para [Str "X",Quoted DoubleQuote [Emph [Str "test"]]]]

$ echo " \"/test/\"" | pandoc --from org --to native --smart
[Para [Quoted DoubleQuote [Str "/test/"]]]

$ echo "\"/test/\" X" | pandoc --from org --to native --smart
[Para [Quoted DoubleQuote [Emph [Str "test"]],Space,Str "X"]]

So the necessary condition seems to be whitespace before the opening quotation mark (example 3). As a result, the bug is usually triggered when parsing whole files.

As I mentioned, the org parser (or at least whatever powers the pretty-printing in emacs) accepts this format. It's occasionally necessary, and there's apparently no workaround.

(For the record: [/test/] is not accepted by either org-mode or pandoc. You need spaces on both sides. So an org footnote just containing Id., which is terribly common in legal prose, must be typed as [fn:: /Id./ ], with a space at the end. Pandoc will disregard the extra space at the end.)
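
The border rules described in this thread can be sketched in a few lines of Python. This is only a simplified model, not Org's or pandoc's actual code. The three character sets are assumptions, modelled on what I believe were the Org 8.x defaults for org-emphasis-regexp-components, and italic_spans is a hypothetical helper that ignores nesting, line limits, and the other emphasis markers.

```python
# Assumed character sets (modelled on Org 8.x defaults; not taken from this thread).
PRE_CHARS = " \t('\"{"             # may appear right before the opening slash
POST_CHARS = "- \t.,:!?;'\")}\\"   # may appear right after the closing slash
FORBIDDEN_BORDER = " \t\r\n,\"'"   # may not be the first or last char of the body


def italic_spans(text):
    """Return the bodies of /.../ spans accepted by this simplified rule."""
    spans = []
    i = 0
    while i < len(text):
        # An opening slash must be at the start or follow a permitted "pre" char.
        if text[i] == "/" and (i == 0 or text[i - 1] in PRE_CHARS):
            j = text.find("/", i + 2)  # body must contain at least one character
            while j != -1:
                body = text[i + 1:j]
                post_ok = j + 1 == len(text) or text[j + 1] in POST_CHARS
                border_ok = (body[0] not in FORBIDDEN_BORDER
                             and body[-1] not in FORBIDDEN_BORDER)
                if post_ok and border_ok:
                    spans.append(body)
                    i = j  # resume scanning after the closing slash
                    break
                j = text.find("/", j + 1)
        i += 1
    return spans


print(italic_spans('"/text/"'))       # quotes outside the italics
print(italic_spans('/"text"/'))       # quotes inside the italics
print(italic_spans('[/test/]'))       # bracket directly before the slash
print(italic_spans('[fn:: /Id./ ]'))  # footnote with trailing space
```

Under these assumptions, [/test/] is rejected because a bracket may not precede the opening slash, /"text"/ is rejected because a quotation mark may not sit on the border of the body, and [fn:: /Id./ ] is accepted because a space follows the closing slash.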

@tarleb
Collaborator

tarleb commented Nov 13, 2015

Now I see it. There is an error in the way the parser state is updated, closely related to #2504. Should be easy to fix.

Thanks for the report!

tarleb added a commit to tarleb/pandoc that referenced this issue Nov 13, 2015
Smart quotes, ellipses, and dashes should behave like normal quotes,
single dashes, and dots with respect to text markup parsing.  The parser
state was not updated properly in all cases, which has been fixed.

Thanks to @conklech for reporting this issue.

This fixes jgm#2513.
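
To illustrate how a stale parser state can produce exactly the four results reported above, here is a toy scanner in Python. It is a sketch under stated assumptions, not pandoc's implementation: the names emphasized, last_pre_pos, and last_str_pos are hypothetical, and the rule "emphasis may start only if no pre-character has been recorded yet or the last one ended right here" is an assumption chosen to match the observed behaviour.

```python
def emphasized(text, smart, fixed=False):
    """Return the bodies of /.../ spans recognised by this toy scanner.

    smart -- model the smart-quote code path (as with --smart)
    fixed -- if True, the smart-quote path also updates the state
    """
    found = []
    last_pre_pos = None  # position just after the latest char allowed before emphasis
    last_str_pos = None  # position just after the latest ordinary text char
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "/":
            after_pre = last_pre_pos is None or last_pre_pos == i
            after_str = last_str_pos == i
            close = text.find("/", i + 2)
            if after_pre and not after_str and close != -1:
                found.append(text[i + 1:close])
                i = close + 1
                continue
            last_str_pos = i + 1      # a literal slash counts as ordinary text
        elif ch == '"':
            if not smart or fixed:
                last_pre_pos = i + 1  # plain-quote path records its position
            # Buggy smart path: the quote is consumed but the state stays stale.
        elif ch in " \t":
            last_pre_pos = i + 1      # whitespace is always a valid pre-character
        else:
            last_str_pos = i + 1
        i += 1
    return found


for case in ['"/test/"', 'X"/test/"', ' "/test/"', '"/test/" X']:
    print(repr(case), emphasized(case, smart=True))
print(emphasized(' "/test/"', smart=True, fixed=True))
```

In this model only the third case fails: the leading space records a pre-character position, the smart quote then advances the input without refreshing it, and the slash no longer sits directly after a recorded pre-character. Updating the state in the smart-quote path (fixed=True) makes the case behave like the plain-quote one.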
jgm closed this as completed in #2525 Nov 13, 2015