Page-break in other output formats than LaTeX #1934
Comments
Correct, pandoc's internal document model does not currently contain anything corresponding to a page break, so there is no way to convert these. In principle a PageBreak element could be added. It's also possible to work around this deficiency using pandoc filters.
|
A PageBreak element would be great, but I'd be happy to use a filter in the meantime. However, I'm not sure what's entailed in doing so. How would I generate a DOCX with forced page breaks using a filtering mechanism? |
@CodeGnome : see this thread for some hints on setting up a filter for pagebreaks in docx output: |
@CodeGnome If your page breaks happen to be prior to a given heading level, you can just set the page break before property for that heading style. |
I am also voting for this feature to be added: many formats have something corresponding to a page break (even CSS has properties like page-break-*). |
Hi, I'm just looking through the code in the hope of adding the page break and some other features, and I found, well… Has @jgm noticed the two-year-old pull request? |
+++ Hi-Angel [Aug 01 15 08:28 ]:
Adding a NewPage element to the definition and builder is trivial. |
If a pull request adding support for NewPage was submitted (including support in every reader and writer), would it be accepted? |
Yes, I'd accept it if it's of good quality. Note, it requires a breaking change in pandoc-types. How do you propose to treat output formats with Would it make sense, perhaps, to render it as a
which could at least be intercepted in filters? |
I'll follow whatever recommendation you give :-) If your code snippet means empty div with a Maybe the writer could even add a inline style attribute with No need to wait for this before pushing your breaking change. To be honest, I won't look into it before at least a few weeks but it's definitely something that is on my business' road-map. |
Putting a class on an empty div won't work (or at least be portable). http://www.w3schools.com/cssref/pr_print_pageba.asp
I recently found the page-break-avoid property. I applied it to |
MDN states on
I guess with a little bit of CSS hackery, the |
OK, that's good to know. So implementing a page break in +++ Gavin S [Oct 14 16 11:44 ]:
|
Would definitely like to see this. And really would like to see printed html handle this too, but that's probably out of scope for pandoc. |
\newpage or \pagebreak except for PDF output files.
Some observations on how different formats handle page breaks: From the perspective of HTML/CSS, page breaking is about layout, not structure, and is thus implemented in CSS (with the In some restructured-text processors, a pagebreak can apparently also be achieved by a block level directive. On the other hand, in more imperative document models (ODT, docx, etc), pagebreak usually seems to be an inline element. The pandoc AST already has inline Finally, from the perspective of markdown, I would probably use something like this:
|
I would like to see this implemented. I just tried to write a filter for pandoc to use page breaks for md to ODT, but had no success. (I used the source on Google Groups, as mentioned above.) |
Muse format also has pagebreaks: http://amusewiki.org/library/manual#toc7 |
btw, iA Writer pagebreak syntax is:
which produces:
which webkit-based browsers seem to understand. |
another nice workaround:
|
thanks for this! i went down this rabbit hole today. it was my first foray into haskell and i'm pleased to say that i am now standing next to a completely bald yak¹. here's what happened: the problem:
the solution:
the implementation:
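-- Pandoc JSON filter: swaps a paragraph that contains only "\newpage"
-- for a raw OpenXML page break, which the docx writer passes through.
-- Written against the String-based pandoc-types API; newer pandoc-types
-- releases use Text instead, which needs OverloadedStrings and a Text annotation.
import Text.Pandoc.JSON
pagebreakXml :: String
pagebreakXml = "<w:p><w:r><w:br w:type=\"page\"/></w:r></w:p>"
pagebreakBlock :: Block
pagebreakBlock = RawBlock (Format "openxml") pagebreakXml
-- Every other block is returned unchanged.
blockSwapper :: Block -> Block
blockSwapper (Para [Str "\\newpage"]) = pagebreakBlock
blockSwapper blk = blk
main :: IO ()
main = toJSONFilter blockSwapper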
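The Haskell filter above targets docx only. As a rough sketch of the same idea for more than one output format, here is a JSON filter in Python using only the standard library (pandoc JSON filters can be written in any language). Assumptions, none of which come from this thread: the JSON layout with a top-level "blocks" key (pandoc 1.18 and later), the format names "html" and "html5", the CSS-based HTML markup, and the helper names `is_marker` and `walk`. The OpenXML string is the one used above.

```python
#!/usr/bin/env python3
"""Sketch of a pandoc JSON filter: turn \\newpage markers into page breaks."""
import json
import sys

# Raw markup per output format. The OpenXML string comes from this thread;
# the HTML entry (CSS page-break-after) is an illustrative assumption.
_HTML_BREAK = ("html", '<div style="page-break-after: always;"></div>')
PAGEBREAKS = {
    "docx": ("openxml", '<w:p><w:r><w:br w:type="page"/></w:r></w:p>'),
    "html": _HTML_BREAK,
    "html5": _HTML_BREAK,
}

MARKERS = {"\\newpage", "\\pagebreak"}


def is_marker(block):
    """True for a raw TeX block, or a one-element paragraph, holding a marker."""
    t, c = block.get("t"), block.get("c")
    if t == "RawBlock":
        fmt, text = c
        return fmt in ("tex", "latex") and text.strip() in MARKERS
    if t == "Para" and len(c) == 1:
        inline = c[0]
        if inline.get("t") == "Str":
            return inline["c"] in MARKERS
        if inline.get("t") == "RawInline":
            fmt, text = inline["c"]
            return fmt in ("tex", "latex") and text.strip() in MARKERS
    return False


def walk(node, target):
    """Recursively copy the AST, replacing marker blocks for known targets."""
    if isinstance(node, list):
        return [walk(item, target) for item in node]
    if isinstance(node, dict):
        if "t" in node and target in PAGEBREAKS and is_marker(node):
            fmt, text = PAGEBREAKS[target]
            return {"t": "RawBlock", "c": [fmt, text]}
        return {key: walk(value, target) for key, value in node.items()}
    return node


def main():
    # pandoc passes the output format name as the first argument.
    target = sys.argv[1]
    doc = json.load(sys.stdin)
    doc["blocks"] = walk(doc["blocks"], target)  # assumes pandoc >= 1.18 JSON
    json.dump(doc, sys.stdout)


if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

Saved under a hypothetical name such as `pagebreak.py` and made executable, it would be run as `pandoc input.md --filter ./pagebreak.py -o output.docx`. For formats not listed in `PAGEBREAKS` the document is passed through unchanged, so PDF/LaTeX output keeps its native `\newpage`.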
|
The Lua filters repository has a pagebreak filter which converts raw |
I wanted to note that Epub3 supports page breaks as well, although for possibly different use cases.
This is nice for preserving information about page numbers (e.g. for citations, printing, or accessibility such as audio cues) without interfering with the document layout. It supports both inline and block page breaks.
<p>
…
<span role="doc-pagebreak" id="pg24" aria-label="24"/>
…
</p>
<div role="doc-pagebreak" id="pg24">24</div>
Some notes:
My personal preference is for formfeed chars to be interpreted as page breaks, at least in markdown. I use the |
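Nothing in this thread says pandoc's markdown reader treats form feeds as page breaks, so one possible workaround is to preprocess the source before pandoc sees it. The sketch below, with a hypothetical helper name `formfeeds_to_newpage`, rewrites each form feed character as a `\newpage` line set off by blank lines. It only produces the marker; a filter (for docx and similar) or LaTeX/PDF output is still needed to turn that marker into an actual break.

```python
import re

FORM_FEED = "\f"


def formfeeds_to_newpage(text, marker="\\newpage"):
    """Replace each form feed (and adjacent newlines) with a marker paragraph."""
    # A function is used as the replacement so the backslash in the marker
    # is inserted literally instead of being parsed as a regex escape.
    return re.sub(r"\n*\f\n*", lambda _match: f"\n\n{marker}\n\n", text)
```

The function would be applied to the file contents, and the result handed to pandoc in place of the original text.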
This might be somewhat related. Pagebreaks seem to be automatically supported in markdown->pdf in terms of H1s being recognized as new section headers, using:
Also, when markdown->epub the same section headers H1 are recognized and page breaks are implemented. All fine and dandy. I'm wondering if it is possible somehow to have H2s recognized as section breaks as well. The main reason is because I need to have both H1 and H2 act as section breaks (page breaks). Ok, I've worked through these issues, and here is how I've dealt with them, so far: I've added
That seems to take care of the epub side. If anyone has additional suggestions/options especially for the latex/pdf side, that would be great, but otherwise I've got it working. |
Try the same thing with |
@jgm Excellent! It also suppresses a page break if an H2 directly follows an H1, which is what I want. I can't seem to do that with Epub/CSS, but an extra page is less of an issue in an ebook, whereas one has to pay for each page in print.
Here is documentation of the various section commands that can be used with package titlesec. http://tug.ctan.org/tex-archive/macros/latex/contrib/titlesec/titlesec.pdf |
This still does not work for pandoc export to docx! |
Had to introduce page breaks to html files that are being converted to .docx, ended up with this script in Lua:
function Para (el)
  if #el.content == 1 and el.content[1].text == "Pagebreak" then
    return pandoc.RawBlock('openxml', '<w:p><w:r><w:br w:type="page"/></w:r></w:p>')
  end
end
return {
  {Para = Para}
}
Given the following input:
<html>
<body>
<p>Page 1</p>
<p>Pagebreak</p>
<p>Page 2</p>
<p>Pagebreak</p>
<p>Page 3</p>
</body>
</html>
It can be used like this:
|
Hi there, Can the support of It would be great to be able to convert DocBook to LaTeX without losing this info. |
Hi,
The effects are beautiful, but I must always post-process it by hand with |
There's no native AST element corresponding to a page break. |
The R package |
Pagebreaks Don't Work for Most Output Formats
I have a Markdown file that is supposed to have pagebreaks between certain sections. However, Pandoc 1.10.1 isn't honoring the \newpage or \pagebreak commands when rendering RTF, DOCX, or ODT formatted files. The commands I'm using to invoke pandoc are:
PDF Seems to Work
However, the PDF format (which requires a slightly different invocation because it doesn't respect the -t flag) seems to respect the pagebreak requests. For example: