Extract rendered data from Docusaurus and generate PDF, the hard way
To see the result, download the generated PDF from the artifacts section of this repository's GitHub Actions runs.
This project uses method 1 (see below) to generate PDFs, so you must have Prince installed on your local machine.
Install Prince first.
Then run one of the following commands to generate a PDF:
# Generate a PDF from a specific site under the `docs` scope
npx docusaurus-prince-pdf -u https://docusaurus.io/docs
# Narrow the scope to `/docs/cli/`
npx docusaurus-prince-pdf -u https://docusaurus.io/docs/cli
# Custom working (output) directory
npx docusaurus-prince-pdf -u https://openbayes.com/docs --dest ./pdf-output
# Custom output file name
npx docusaurus-prince-pdf -u https://openbayes.com/docs --output docs.pdf
To generate a PDF from a local Docusaurus instance, first build and serve the site locally:
# Build the site (or: yarn build / pnpm build)
npm run build
# Serve the built site locally (or: yarn serve / pnpm serve)
npm run serve
# Generate PDF from local Docusaurus instance
npx docusaurus-prince-pdf -u http://localhost:4000/docs # Change port to your serving port
You can also run this program with the Docker image:
docker run --rm -it --init \
-v $(pwd)/pdf:/app/pdf \
openbayes/docusaurus-prince-pdf \
-u https://docusaurus.io/docs/
If you need support for East Asian languages such as Chinese and Japanese, mount your custom fonts directory into the container:
docker run --rm -it --init \
-v $(pwd)/pdf:/app/pdf \
-v $(pwd)/fonts:/root/.fonts \
openbayes/docusaurus-prince-pdf \
-u https://docusaurus.io/docs/
You can also run this program inside GitHub Actions:
jobs:
  build:
    steps:
      # prerequisites...
      - name: Install Prince
        run: |
          curl https://www.princexml.com/download/prince-14.2-linux-generic-x86_64.tar.gz -O
          tar zxf prince-14.2-linux-generic-x86_64.tar.gz
          cd prince-14.2-linux-generic-x86_64
          yes "" | sudo ./install.sh
      - name: Build PDF
        run: npx docusaurus-prince-pdf -u https://docusaurus.io/docs/
      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: result
          # The output filename can be specified with the --output option
          path: pdf/docusaurus.io-docs.pdf
          if-no-files-found: error
      # ...other steps
You can also skip the Prince installation and run the prebuilt Docker image:
jobs:
  build:
    steps:
      # prerequisites...
      - name: Build PDF
        # No -it here: GitHub Actions steps do not have a TTY attached
        run: docker run --rm -v $(pwd)/pdf:/app/pdf openbayes/docusaurus-prince-pdf -u https://docusaurus.io/docs/
      # ...other steps
To run from source, you need to have Bun installed first. This also lets you run the latest code on your local machine:
bun run index.ts -u http://localhost:4000/docs
- `--url` (`-u`): Base URL. Should be the `baseUrl` of the Docusaurus instance (e.g. https://docusaurus.io/docs/)
- `--selector` (`-s`): CSS selector to find the link of the next page
- `--dest` (`-d`): Working directory. Defaults to `./pdf`
- `--file` (`-f`): Change default list output filename
- `--output` (`-o`): Change PDF output filename
- `--include-index`: Include passed URL in generated PDF
- `--prepend`: Prepend additional pages, split with comma
- `--append`: Append additional pages, split with comma
- `--prince-args`: Additional options for Prince, e.g. `--prince-args="--page-size='210mm 297mm'"` or `--prince-args "\\-\\-page\\-size='210mm 297mm'"`
- `--prince-docker`: Use external Prince docker image to generate PDF. See https://github.com/sparanoid/docker-prince for more info
- `--list-only`: Fetch list without generating PDF
- `--pdf-only`: Generate PDF without fetching list. Ensure list exists
- `--cookie`: Specify the cookie with the domain part, e.g. `--cookie="token=123456; domain=example.com;"`
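The `--prepend` and `--append` options take comma-separated pages. As a rough illustration of what that could mean for the final page list, here is a minimal TypeScript sketch. It is not taken from this package's source: `splitPages` and `assembleList` are hypothetical helpers, and both the whitespace trimming and the ordering (prepended pages, then crawled pages, then appended pages) are assumptions.

```typescript
// Hypothetical helper: split a comma-separated option value into page URLs.
// Trimming whitespace and dropping empty entries are assumptions of this sketch.
function splitPages(value?: string): string[] {
  if (!value) return [];
  return value
    .split(",")
    .map((page) => page.trim())
    .filter((page) => page.length > 0);
}

// Hypothetical helper: build the final list in the assumed order
// prepended pages -> crawled pages -> appended pages.
function assembleList(crawled: string[], prepend?: string, append?: string): string[] {
  return [...splitPages(prepend), ...crawled, ...splitPages(append)];
}

// Usage: two extra pages placed in front of one crawled page.
const finalList = assembleList(
  ["https://example.com/docs/a"],
  "https://example.com/cover, https://example.com/legal",
);
console.log(finalList.join("\n"));
```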
Like mr-pdf, this package follows the "next" pagination links on the generated Docusaurus site, collects the pages in a list, and then passes the list to Prince to generate the PDF.
You can specify the CSS selector if you're using a custom Docusaurus theme:
npx docusaurus-prince-pdf -u https://openbayes.com/ --selector 'nav.custom-pagination-item--next > a'
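The crawl described above can be sketched in a few lines of TypeScript. This is a minimal illustration under stated assumptions, not the package's actual implementation: `collectPages` and `FindNext` are hypothetical names, the lookup is injected so the sketch stays independent of any HTTP or HTML-parsing library (a real crawler would fetch each page and apply the `--selector` CSS selector, most likely asynchronously), and excluding the start URL by default is a guess based on the description of `--include-index`.

```typescript
// Hypothetical lookup: returns the URL behind the "next" pagination link
// of a page, or null when the page has no such link (the last page).
type FindNext = (url: string) => string | null;

// Follow "next" links from startUrl and return the pages in reading order.
// includeStart mirrors the idea behind --include-index; leaving the start URL
// out by default is an assumption of this sketch.
function collectPages(startUrl: string, findNext: FindNext, includeStart = false): string[] {
  const seen: Record<string, boolean> = { [startUrl]: true };
  const pages: string[] = includeStart ? [startUrl] : [];
  let current = findNext(startUrl);
  // Stop at the last page, or when a link points back to an already visited page.
  while (current !== null && !seen[current]) {
    seen[current] = true;
    pages.push(current);
    current = findNext(current);
  }
  return pages;
}

// Usage with an in-memory stand-in for a site: each page maps to its "next" page.
const site: Record<string, string | null> = {
  "/docs": "/docs/intro",
  "/docs/intro": "/docs/cli",
  "/docs/cli": null,
};
const crawled = collectPages("/docs", (url) => site[url] ?? null);
console.log(crawled.join("\n"));
```

The `seen` map guards against pagination loops, so a theme whose last page links back to the first one would not make the crawl run forever.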
Here is a comparison of the two methods of generating PDFs from Docusaurus.
Method 1: Prince (used in this project)
The good:
- Best font subsetting support
- Text can be selected, copied, and pasted correctly
- Fancy Table of Contents
The bad:
- The watermark on the first page of the generated PDF makes it hard to use in CI/CD environments
- Doesn't work with some CSS syntax (e.g. `mask-image`)
- Doesn't work with some HTML features (e.g. `srcset`)
- Commercial license is expensive ($3,800)
The ugly:
- None
Method 2: mr-pdf (not used in this project)
The good:
- Free and open-source
- Works with Docusaurus sites
- CI/CD friendly
- Based on Puppeteer, so it works with most modern CSS syntax (e.g. `mask-image`)
The bad:
- Doesn't work well with system Dark Mode. You will get a dark background in the generated PDF when you have `respectPrefersColorScheme` enabled in your Docusaurus instance. This is not an issue in CI/CD environments, though
- No Table of Contents
The ugly:
- Based on Puppeteer, so the text cannot be copied or searched correctly
- Link anchors (links starting with `#`) are not handled well
Usage:
npx mr-pdf --initialDocURLs="https://openbayes.com/docs/" --paginationSelector=".pagination-nav__item--next > a" --contentSelector="article"
MIT