A tiny Python PDF generation library which is intentionally not based on wkhtmltopdf. Instead, it drives Chromium through the Playwright library (Microsoft's browser automation library, comparable to Google's Puppeteer).
pip install git+https://github.com/lennyerik/pyhtmltopdf.git
Or, for development:
git clone https://github.com/lennyerik/pyhtmltopdf.git
cd pyhtmltopdf
pip install -e ".[dev]"
Simple use cases can use the from_file, from_url and from_string functions (or their async-enabled variants afrom_file, afrom_url and afrom_string):
from pyhtmltopdf import from_file

# Convert input.html to output.pdf
from_file(
    "./input.html",
    "output.pdf",
    render_options={
        "margin": {
            "top": "3cm",
            "left": "2cm",
            "right": "2cm",
            "bottom": "3cm",
        },
    },
)
from pyhtmltopdf import from_url

# We can also write the file ourselves
# (the with block makes sure the file is closed afterwards)
with open("output.pdf", "wb") as out_file:
    out_file.write(from_url(
        "https://example.com/",
        header_html="<p style=\"font-size: 12pt;\">This is a demo header</p>",
        footer_html="<p style=\"font-size: 12pt;\">Page No: <span class=\"pageNumber\"></span></p>",
        render_options={
            "margin": {
                "top": "2cm",
                "bottom": "2cm",
            }
        },
    ))
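from_string is listed above but not demonstrated. The following is a minimal sketch of one way it might be used: a small HTML document is assembled from Python data and handed to from_string. build_report_html and render_report are hypothetical helper names, and the call assumes that from_string takes the HTML string first and the output path second, in the same way as from_file and from_url.

```python
from html import escape


def build_report_html(title: str, rows: list[tuple[str, str]]) -> str:
    """Build a small standalone HTML document from plain Python data."""
    body = "".join(
        f"<tr><td>{escape(name)}</td><td>{escape(value)}</td></tr>"
        for name, value in rows
    )
    return (
        "<!DOCTYPE html><html><body>"
        f"<h1>{escape(title)}</h1><table>{body}</table>"
        "</body></html>"
    )


def render_report(title: str, rows: list[tuple[str, str]], output_path: str) -> None:
    """Render the report to a PDF file (hypothetical helper)."""
    # Imported here so that build_report_html stays usable without a browser.
    from pyhtmltopdf import from_string

    # Assumption: HTML string first, output path second.
    from_string(build_report_html(title, rows), output_path)
```

Calling render_report("Totals", [("Apples", "3")], "report.pdf") would then be expected to write report.pdf, provided a usable Chromium is available.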
If you already have a Chromium-based browser installed, you can add executable_path to the launch_options like so:
from pyhtmltopdf import from_file

from_file(
    "./input.html",
    "output.pdf",
    launch_options={
        # This example uses Brave as the chromium-based browser
        "executable_path": "/usr/bin/brave",
    },
    render_options={
        "margin": {
            "top": "3cm",
            "left": "2cm",
            "right": "2cm",
            "bottom": "3cm",
        },
    },
)
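Hard-coding a browser path only works on machines where that path exists. As a rough sketch, the path could instead be looked up on PATH and launch_options built from the result. find_browser, build_launch_options and the candidate names are hypothetical and not part of pyhtmltopdf; only the executable_path key comes from the example above.

```python
import shutil
from typing import Optional, Sequence

# Hypothetical list of executable names to try, in order of preference.
CANDIDATES = ("chromium", "chromium-browser", "google-chrome", "brave")


def find_browser(candidates: Sequence[str] = CANDIDATES) -> Optional[str]:
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


def build_launch_options(executable_path: Optional[str]) -> dict:
    """Build a launch_options dict, leaving it empty if no browser was found."""
    if executable_path is None:
        return {}
    return {"executable_path": executable_path}
```

The result can be passed along as launch_options=build_launch_options(find_browser()). An empty dict leaves the choice of browser to the library's default behaviour.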
If you want to generate multiple PDFs, the class-based API is faster, since it spins up only one Chromium instance:
from pyhtmltopdf import HTMLToPDFConverter

with HTMLToPDFConverter() as converter:
    converter.from_url(
        "https://example.com/",
        "output.pdf",
    )

# Or, alternatively
converter = HTMLToPDFConverter(launch_options={
    # Launch options are passed in here
    "executable_path": "/usr/bin/brave",
})
converter.init()
converter.from_url(
    "https://example.com/",
    "output.pdf",
)
converter.finish()
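To illustrate the multi-document case, here is a minimal sketch that converts a batch of HTML files with a single converter. plan_jobs and convert_all are hypothetical helpers, and the sketch assumes the converter exposes from_file with the same (input, output) arguments as the top-level function.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def plan_jobs(html_files: Iterable[str], output_dir: str) -> List[Tuple[str, str]]:
    """Pair each HTML file with a PDF path of the same base name in output_dir."""
    return [
        (str(html_file), str(Path(output_dir) / (Path(html_file).stem + ".pdf")))
        for html_file in html_files
    ]


def convert_all(jobs: Iterable[Tuple[str, str]]) -> None:
    """Convert every (input, output) pair using one Chromium instance."""
    # Imported here so that plan_jobs stays usable without a browser.
    from pyhtmltopdf import HTMLToPDFConverter

    with HTMLToPDFConverter() as converter:
        for input_path, output_path in jobs:
            # Assumption: the converter method mirrors the top-level from_file.
            converter.from_file(input_path, output_path)
```

A call such as convert_all(plan_jobs(["pages/a.html", "pages/b.html"], "out")) would then reuse the same browser for both files.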
Or even asynchronously:
import asyncio

from pyhtmltopdf import AHTMLToPDFConverter


async def main():
    async with AHTMLToPDFConverter() as converter:
        await converter.from_url(
            "https://example.com/",
            "output.pdf",
        )

    # Or, alternatively
    converter = AHTMLToPDFConverter(launch_options={
        # Launch options are passed in here
        "executable_path": "/usr/bin/brave",
    })
    await converter.init()
    await converter.from_url(
        "https://example.com/",
        "output.pdf",
    )
    await converter.finish()


# await is only valid inside a coroutine, so run everything through asyncio
asyncio.run(main())
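The main reason to use the asynchronous API is to overlap several conversions. The following sketch shows one way to run a batch with a cap on how many conversions are in flight at once. convert_concurrently is a hypothetical helper, not part of pyhtmltopdf; it accepts any coroutine function taking (source, output_path), so nothing about the library's internals is assumed.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, Tuple


async def convert_concurrently(
    convert: Callable[[str, str], Awaitable[object]],
    jobs: Iterable[Tuple[str, str]],
    limit: int = 4,
) -> int:
    """Run convert(source, output_path) for every job, at most `limit` at a time.

    Returns the number of jobs that were run.
    """
    semaphore = asyncio.Semaphore(limit)

    async def run_one(source: str, output_path: str) -> None:
        # The semaphore keeps the number of simultaneous conversions bounded.
        async with semaphore:
            await convert(source, output_path)

    tasks = [run_one(source, output_path) for source, output_path in jobs]
    await asyncio.gather(*tasks)
    return len(tasks)
```

Inside an async with AHTMLToPDFConverter() block, this could be called as await convert_concurrently(converter.from_url, jobs). Whether a single converter instance tolerates overlapping calls is not stated here, so limit=1 is the cautious choice until that has been checked.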
All from_x functions have the following parameters:
- file_path / url / string: The input HTML file path / URL / HTML string to process
- output_path: An optional output path to save the PDF to. Defaults to None
- header_html: An optional HTML string for the page header. Defaults to ""
- footer_html: An optional HTML string for the page footer. Defaults to ""
- render_options: Can be any of these PDF rendering options
Additionally, the top-level from_x functions as well as the constructors of the HTMLToPDFConverter and AHTMLToPDFConverter classes take the launch_options argument, which can be any of these launch options.
To format the code, install with dev dependencies and run
black .
To run the unit tests, install with dev dependencies and execute
python -m unittest test
You can also set a specific browser:
CHROMIUM=/usr/bin/brave python -m unittest test
pyhtmltopdf uses an up-to-date version of Chromium, which enables features such as flexbox that wkhtmltopdf's old version of WebKit does not support. Furthermore, the current status of the wkhtmltopdf project is questionable:
- the version of Qt it uses has been unsupported since 2015
- it requires a patch to Qt in order to enable full header and footer support
- the version of WebKit it uses is over 4 years old