Failed to split text with non ASCII Character #1852
Comments
Could be relevant
One more detail: this only happens when the character is at the end.
Hi! Thanks for this bug report. I’ve tried to render the HTML sample you’ve given, but I can’t reproduce this bug. Here’s some Python code that should correspond to your issue:

    from weasyprint import HTML
    HTML(string='<h3 style="display: inline-block;"> Some title\u2028</h3>').write_pdf()

Does it crash for you?
Hello
I’ve tried to use your HTML, but it doesn’t crash for me. Could you please share an HTML file (instead of pasting it in a comment)?
Please find the HTML file attached (inside the zip file).
It’s now fixed and tested!
Version 59.0
------------

Released on 2023-05-11.

This version also includes the changes from unstable b1 version listed below.

Bug fixes:

* `#1864 <https://github.com/Kozea/WeasyPrint/issues/1864>`_: Handle overflow for svg and symbol tags in SVG images
* `#1867 <https://github.com/Kozea/WeasyPrint/pull/1867>`_: Remove duplicate compression of attachments
* `d0ad5c1 <https://github.com/Kozea/WeasyPrint/commit/d0ad5c1>`_: Override use tag children instead of drawing their references
* `93df1a5 <https://github.com/Kozea/WeasyPrint/commit/93df1a5>`_: Don’t resize the same image twice when the --dpi option is set
* `#1874 <https://github.com/Kozea/WeasyPrint/pull/1874>`_: Drawn underline and overline behind text

Version 59.0b1
--------------

Released on 2023-04-14.

**This version is experimental, don't use it in production. If you find bugs, please report them!**

Command-line API:

* The ``--optimize-size`` option and its short equivalent ``-O`` have been deprecated. To activate or deactivate different size optimizations, you can now use:

  * ``--uncompressed-pdf``,
  * ``--optimize-images``,
  * ``--full-fonts``,
  * ``--hinting``,
  * ``--dpi <resolution>``, and
  * ``--jpeg-quality <quality>``.

* A new ``--cache-folder <folder>`` option has been added to store temporary data in the given folder on the disk instead of keeping them in memory.

Python API:

* Global rendering options are now given in ``**options`` instead of dedicated parameters, with slightly different names. It means that the signature of the ``HTML.render()``, ``HTML.write_pdf()`` and ``Document.write_pdf()`` has changed. Here are the steps to port your Python code to v59.0:

  1. Use named parameters for these functions, not positioned parameters.
  2. Rename some of the parameters:

     * ``image_cache`` becomes ``cache`` (see below),
     * ``identifier`` becomes ``pdf_identifier``,
     * ``variant`` becomes ``pdf_variant``,
     * ``version`` becomes ``pdf_version``,
     * ``forms`` becomes ``pdf_forms``.

* The ``optimize_size`` parameter of ``HTML.render()``, ``HTML.write_pdf()`` and ``Document()`` has been removed and will be ignored. You can now use the ``uncompressed_pdf``, ``full_fonts``, ``hinting``, ``dpi`` and ``jpeg_quality`` parameters that are included in ``**options``.
* The ``cache`` parameter can be included in ``**options`` to replace ``image_cache``. If it is a dictionary, this dictionary will be used to store temporary data in memory, and can be even shared between multiple documents. If it’s a folder Path or string, WeasyPrint stores temporary data in the given temporary folder on disk instead of keeping them in memory.

New features:

* `#1853 <https://github.com/Kozea/WeasyPrint/pull/1853>`_, `#1854 <https://github.com/Kozea/WeasyPrint/issues/1854>`_: Reduce PDF size, with financial support from Code & Co.
* `#1824 <https://github.com/Kozea/WeasyPrint/issues/1824>`_, `#1829 <https://github.com/Kozea/WeasyPrint/pull/1829>`_: Reduce memory use for images
* `#1858 <https://github.com/Kozea/WeasyPrint/issues/1858>`_: Add an option to keep hinting information in embedded fonts

Bug fixes:

* `#1855 <https://github.com/Kozea/WeasyPrint/issues/1855>`_: Fix position of emojis in justified text
* `#1852 <https://github.com/Kozea/WeasyPrint/issues/1852>`_: Don’t crash when line can be split before trailing spaces
* `#1843 <https://github.com/Kozea/WeasyPrint/issues/1843>`_: Fix syntax of dates in metadata
* `#1827 <https://github.com/Kozea/WeasyPrint/issues/1827>`_, `#1832 <https://github.com/Kozea/WeasyPrint/pull/1832>`_: Fix word-spacing problems with nested tags

Documentation:

* `#1841 <https://github.com/Kozea/WeasyPrint/issues/1841>`_: Add a paragraph about unsupported calc() function

Version 58.1
------------

Released on 2023-03-07.

Bug fixes:

* `#1815 <https://github.com/Kozea/WeasyPrint/issues/1815>`_: Fix bookmarks coordinates
* `#1822 <https://github.com/Kozea/WeasyPrint/issues/1822>`_, `#1823 <https://github.com/Kozea/WeasyPrint/pull/1823>`_: Fix vertical positioning for absolute replaced elements

Documentation:

* `#1814 <https://github.com/Kozea/WeasyPrint/pull/1814>`_: Fix broken link pointing to samples

Version 58.0
------------

Released on 2023-02-17.

This version also includes the changes from unstable b1 version listed below.

Bug fixes:

* `#1807 <https://github.com/Kozea/WeasyPrint/issues/1807>`_: Don’t crash when out-of-flow box is split in out-of-flow parent
* `#1806 <https://github.com/Kozea/WeasyPrint/issues/1806>`_: Don’t crash when fixed elements aren’t displayed yet in aborted line
* `#1809 <https://github.com/Kozea/WeasyPrint/issues/1809>`_: Fix background drawing for out-of-the-page transformed boxes

Version 58.0b1
--------------

Released on 2023-02-03.

**This version is experimental, don't use it in production. If you find bugs, please report them!**

New features:

* `#61 <https://github.com/Kozea/WeasyPrint/issues/61>`_, `#1796 <https://github.com/Kozea/WeasyPrint/pull/1796>`_: Support PDF forms, with financial support from Personalkollen
* `#1173 <https://github.com/Kozea/WeasyPrint/issues/1173>`_: Add style for form fields

Bug fixes:

* `#1777 <https://github.com/Kozea/WeasyPrint/issues/1777>`_: Detect JPEG/MPO images as normal JPEG files
* `#1771 <https://github.com/Kozea/WeasyPrint/pull/1771>`_: Improve SVG gradients

Version 57.2
------------

Released on 2022-12-23.

Bug fixes:

* `0f2e377 <https://github.com/Kozea/WeasyPrint/commit/0f2e377>`_: Print annotations with PDF/A
* `0e9426f <https://github.com/Kozea/WeasyPrint/commit/0e9426f>`_: Hide annotations with PDF/UA
* `#1764 <https://github.com/Kozea/WeasyPrint/issues/1764>`_: Use reference instead of stream for annotation appearance stream
* `#1783 <https://github.com/Kozea/WeasyPrint/pull/1783>`_: Fix multiple font weights for @font-face declarations

Version 57.1
------------

Released on 2022-11-04.

Dependencies:

* `#1754 <https://github.com/Kozea/WeasyPrint/pull/1754>`_: Pillow 9.1.0 is now needed

Bug fixes:

* `#1756 <https://github.com/Kozea/WeasyPrint/pull/1756>`_: Fix rem font size for SVG images
* `#1755 <https://github.com/Kozea/WeasyPrint/issues/1755>`_: Keep format when transposing images
* `#1753 <https://github.com/Kozea/WeasyPrint/issues/1753>`_: Don’t use deprecated ``read_text`` function when ``files`` is available
* `#1741 <https://github.com/Kozea/WeasyPrint/issues/1741>`_: Generate better manpage
* `#1747 <https://github.com/Kozea/WeasyPrint/issues/1747>`_: Correctly set target counters in pages’ absolute elements
* `#1748 <https://github.com/Kozea/WeasyPrint/issues/1748>`_: Always set font size when font is changed in line
* `2b05137 <https://github.com/Kozea/WeasyPrint/commit/2b05137>`_: Fix stability of font identifiers

Documentation:

* `#1750 <https://github.com/Kozea/WeasyPrint/pull/1750>`_: Fix documentation spelling

Version 57.0
------------

Released on 2022-10-18.

This version also includes the changes from unstable b1 version listed below.

New features:

* `a4fc7a1 <https://github.com/Kozea/WeasyPrint/commit/a4fc7a1>`_: Support image-orientation

Bug fixes:

* `#1739 <https://github.com/Kozea/WeasyPrint/issues/1739>`_: Set baseline on all flex containers
* `#1740 <https://github.com/Kozea/WeasyPrint/issues/1740>`_: Don’t crash when currentColor is set on root svg tag
* `#1718 <https://github.com/Kozea/WeasyPrint/issues/1718>`_: Don’t crash with empty bitmap glyphs
* `#1736 <https://github.com/Kozea/WeasyPrint/issues/1736>`_: Always use the font’s vector variant when possible
* `eef8b4d <https://github.com/Kozea/WeasyPrint/commit/eef8b4d>`_: Always set color and state before drawing
* `#1662 <https://github.com/Kozea/WeasyPrint/issues/1662>`_: Use a stable key to store stream fonts
* `#1733 <https://github.com/Kozea/WeasyPrint/issues/1733>`_: Don’t remove attachments when adding internal anchors
* `3c4fa50 <https://github.com/Kozea/WeasyPrint/commit/3c4fa50>`_, `c215697 <https://github.com/Kozea/WeasyPrint/commit/c215697>`_, `d275dac <https://github.com/Kozea/WeasyPrint/commit/d275dac>`_, `b04bfff <https://github.com/Kozea/WeasyPrint/commit/b04bfff>`_: Fix many bugs related to PDF/UA structure

Performance:

* `dfccf1b <https://github.com/Kozea/WeasyPrint/commit/dfccf1b>`_: Use faces as fonts dictionary keys
* `0dc12b6 <https://github.com/Kozea/WeasyPrint/commit/0dc12b6>`_: Cache add_font to avoid calling get_face too often
* `75e17bf <https://github.com/Kozea/WeasyPrint/commit/75e17bf>`_: Don’t call process_whitespace twice on many children
* `498d3e1 <https://github.com/Kozea/WeasyPrint/commit/498d3e1>`_: Optimize __missing__ functions

Documentation:

* `863b3d6 <https://github.com/Kozea/WeasyPrint/commit/863b3d6>`_: Update documentation of installation on macOS with Homebrew

Version 57.0b1
--------------

Released on 2022-09-22.

**This version is experimental, don't use it in production. If you find bugs, please report them!**

New features:

* `#1704 <https://github.com/Kozea/WeasyPrint/pull/1704>`_: Support PDF/UA, with financial support from Novareto
* `#1454 <https://github.com/Kozea/WeasyPrint/issues/1454>`_: Support variable fonts

Bug fixes:

* `#1058 <https://github.com/Kozea/WeasyPrint/issues/1058>`_: Fix bullet position after page break, with financial support from OpenZeppelin
* `#1707 <https://github.com/Kozea/WeasyPrint/issues/1707>`_: Fix footnote positioning in multicolumn layout, with financial support from Code & Co.
* `#1722 <https://github.com/Kozea/WeasyPrint/issues/1722>`_: Handle skew transformation with only one parameter
* `#1715 <https://github.com/Kozea/WeasyPrint/issues/1715>`_: Don’t crash when images are truncated
* `#1697 <https://github.com/Kozea/WeasyPrint/issues/1697>`_: Don’t crash when attr() is used in text-decoration-color
* `#1695 <https://github.com/Kozea/WeasyPrint/pull/1695>`_: Include language information in PDF metadata
* `#1612 <https://github.com/Kozea/WeasyPrint/issues/1612>`_: Don’t lowercase letters when capitalizing text
* `#1700 <https://github.com/Kozea/WeasyPrint/issues/1700>`_: Fix crash when rendering footnote with repagination
* `#1667 <https://github.com/Kozea/WeasyPrint/issues/1667>`_: Follow EXIF metadata for image rotation
* `#1669 <https://github.com/Kozea/WeasyPrint/issues/1669>`_: Take care of floats when removing placeholders
* `#1638 <https://github.com/Kozea/WeasyPrint/issues/1638>`_: Use the original box when breaking waiting children
I have an HTML element in my document which contains the following:
<h3 style="display: inline-block;"> Some title </h3>
The special character at the end of the title is LS, the line separator character (U+2028, written \u2028 in Python).
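To double-check which character is involved, here is a small sketch using only the Python standard library. It looks up the character in the Unicode database and shows its UTF-8 encoding; the variable names are illustrative and not part of WeasyPrint.

```python
import unicodedata

# The character reported at the end of the title.
LS = "\u2028"

# Name and general category as recorded in the Unicode Character Database.
ls_name = unicodedata.name(LS)
ls_category = unicodedata.category(LS)

# UTF-8 encoding of the character alone, and of the title text around it.
ls_utf8 = LS.encode("utf-8")
title_bytes = (" Some title" + LS + " ").encode("utf-8")

print(ls_name, ls_category, ls_utf8, title_bytes)
```

The three bytes `\xe2\x80\xa8` are simply the UTF-8 form of U+2028, so seeing them in an encoded string does not by itself indicate a decoding problem.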
I got the stack trace of the error, which you can find right below.
After digging into inlines.py, I found that the function split_text_box is not able to get the right text from the element. At one point it is able to encode it to UTF-8, which results in the string b' Some title\xe2\x80\xa8 ', but it cannot complete the parsing.
I was able to work around the issue by setting style="display: block;". So somehow the line break from the block display prevents that character from being rendered. But I have many elements with inline styles. Is there a way to fix this without changing the styling of my entire HTML document? Something like escaping non-printable Unicode characters?
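For anyone stuck on an affected version, one possible stopgap is to preprocess the HTML string before handing it to WeasyPrint. The sketch below is an assumption-laden workaround, not something WeasyPrint provides: `replace_unicode_separators` is a hypothetical helper that swaps U+2028 (and U+2029, the paragraph separator) for a plain space. Note that this also discards the forced line break those characters would otherwise produce.

```python
import re

# U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR.
_SEPARATORS = re.compile("[\u2028\u2029]")


def replace_unicode_separators(html: str, replacement: str = " ") -> str:
    """Hypothetical helper: replace Unicode line/paragraph separators.

    This is a plain string substitution; it does not parse the HTML.
    """
    return _SEPARATORS.sub(replacement, html)


source = '<h3 style="display: inline-block;"> Some title\u2028</h3>'
cleaned = replace_unicode_separators(source)
print(repr(cleaned))

# The cleaned string can then be rendered as usual, for example:
# from weasyprint import HTML
# HTML(string=cleaned).write_pdf("out.pdf")
```

According to the rest of this thread, the crash itself was later fixed upstream, so upgrading is the preferable route where possible.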
The WeasyPrint version that I have is 52.5. I tried with v57.1 and got the same issue.