Introduce skiptitlepage for docx output #9069

Closed
tdewin opened this issue Sep 8, 2023 · 5 comments

Comments

@tdewin

tdewin commented Sep 8, 2023

Describe your proposed improvement and the problem it solves.
Currently a title page is always generated from the metadata. I am generating a custom "whitepaper" whose first page is custom graphics, but I would still like to keep the metadata set in the document properties. I tried looking at Lua filters, but they only modify the AST, not the docx writer.

If I knew Haskell I would try to make the edit myself, but as far as I can tell the fix should be something like:

Some switch called skiptitlepage
pandoc/src/Text/Pandoc/Writers/Docx.hs Line : 764

  let notIncludeMetaPage = lookupMetaBool "skiptitlepage" meta

Then where the doc is generated, make meta only the toc if skipped

pandoc/src/Text/Pandoc/Writers/Docx.hs Line : 812

  -- when skiptitlepage is set, keep only the toc in front of the body
  meta' <- if notIncludeMetaPage
    then return (map Elem toc)
    else return (title ++ subtitle ++ authors ++ date ++ abstract ++ map Elem toc)

  return (meta' ++ doc', notes', comments')

Describe alternatives you've considered.

  • Unzipping the docx after generation, removing the title block with sed, and zipping it again
  • Changing the style, but the data would still be rendered
  • A Lua filter on blocks, but it starts at the very first item of the AST, after the title block
@jgm
Owner

jgm commented Sep 8, 2023

Relevant: #7256, #2928.
In general, it would be nice if we could make the reference.docx behave more like a template, so that you could remove the title and author.

@tdewin
Author

tdewin commented Sep 8, 2023

Thanks for creating pandoc and responding so quickly

I have worked around this by not supplying any metadata at all, then unzipping the docx, overwriting core.xml manually, and zipping it again. It is pretty crude, e.g. you need to escape & as &amp; etc.

OUTFILE="pandoc-output.docx"
REMASTER="${OUTFILE%.*}-remaster"
# extract the generated docx into a working folder
unzip -o -qq "$OUTFILE" -d "$REMASTER"
PREVPATH=$(pwd)
cd "$REMASTER" || exit 1

cat <<'EOF' > docProps/core.xml
<?xml version="1.0" encoding="UTF-8"?>
<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:dcmitype="http://purl.org/dc/dcmitype/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <dc:title>title</dc:title>
    <dc:creator>author1;author2</dc:creator>
    <dc:description>description</dc:description>
    <dc:language>en-us</dc:language>
    <cp:keywords>some, key, words</cp:keywords>
    <dcterms:created xsi:type="dcterms:W3CDTF">2023-01-01T00:00:00Z</dcterms:created>
    <dcterms:modified xsi:type="dcterms:W3CDTF">2023-01-01T00:00:00Z</dcterms:modified>
</cp:coreProperties>
EOF

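# write the new docx next to the original rather than inside the extracted
# folder (assumes OUTFILE is a relative path)
zip -r "$PREVPATH/$REMASTER.docx" *
cd "$PREVPATH"

#for safety
#rm -rf "$REMASTER"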
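
For the escaping mentioned above, here is a minimal sketch of a helper that could be run on each value before it goes into core.xml. The name xml_escape is made up for this example and is not part of pandoc or of the script above. It only handles the five XML special characters.

```shell
# Hypothetical helper (not part of pandoc): escape the five XML special
# characters so a value can be placed safely inside core.xml.
# The & rule must run first, otherwise the entities added by the later
# rules would be escaped a second time.
xml_escape() {
  printf '%s' "$1" | sed \
    -e 's/&/\&amp;/g' \
    -e 's/</\&lt;/g' \
    -e 's/>/\&gt;/g' \
    -e 's/"/\&quot;/g' \
    -e "s/'/\&apos;/g"
}

# Example: build one element of core.xml from an unescaped title.
printf '<dc:title>%s</dc:title>\n' "$(xml_escape 'Research & Development')"
# prints: <dc:title>Research &amp; Development</dc:title>
```

To use escaped values inside the heredoc above, the delimiter would have to be unquoted (EOF rather than 'EOF') so that shell variables are expanded.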

Similarly, I removed the TOC completely and instead include it with the include-files Lua filter https://github.com/pandoc-ext/include-files/blob/main/include-files.lua together with this segment https://gist.github.com/tdewin/e8cb8bc35a0fbabb630ff3f4a033ab77

Hopefully this is useful for somebody out there

@ghost

ghost commented Dec 2, 2023

One option for this specific issue would be for the Docx Writer to use "title-meta", "author-meta" fields like HTML and LaTeX do. That is, have the Docx writer set the document metadata from the "title-meta" etc. fields, and set the title block from "title", "author" fields, setting "title-meta" etc. automatically from "title" if present. That way, if we want to set the metadata without generating the title block, we can set the "title-meta" fields and omit "title". This doesn't look like it would be a huge change (though I can barely read Haskell, so I could be missing something), and would make the behavior consistent with other writers.

@jgm
Owner

jgm commented Dec 2, 2023

Still, you'd have to set title-meta, etc. manually, which doesn't seem ideal.
Btw, the reason for title-meta in HTML and other formats is that sometimes there are restrictions on what formatting can go in these fields.

@aditivin

Thank you, very helpful! The openxml TOC format worked for me
