ocrmypdf produces wrong page size #1360

Open
femifrak opened this issue Jul 23, 2024 · 3 comments

@femifrak

In contrast to
ocrmypdf in.pdf out.pdf
the command
ocrmypdf --force-ocr in.pdf out.pdf
produces an output page size (115 × 200 mm) that differs from the input (A5, 148 × 210 mm).

I've been using pikepdf 8.14.0, ocrmypdf 16.4.1 / Tesseract OCR-hOCR 5.4.1.

in.pdf
out.pdf
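
To confirm the size difference independently of a PDF viewer, the page boxes of both files can be read with pikepdf. The following is a minimal sketch, not part of the original report: the helper names `box_size_mm` and `page_sizes_mm` are made up for illustration, and it assumes only the MediaBox matters (it ignores CropBox, /Rotate and /UserUnit, any of which could also affect the displayed size).

```python
MM_PER_POINT = 25.4 / 72  # one PDF point is 1/72 inch


def box_size_mm(box):
    """Return (width, height) in mm for a PDF box [x0, y0, x1, y1] in points."""
    x0, y0, x1, y1 = (float(v) for v in box)
    width = round(abs(x1 - x0) * MM_PER_POINT, 1)
    height = round(abs(y1 - y0) * MM_PER_POINT, 1)
    return (width, height)


def page_sizes_mm(path):
    """List the MediaBox size of every page of a PDF, in millimetres."""
    # Imported lazily so box_size_mm can be used without pikepdf installed.
    import pikepdf

    with pikepdf.open(path) as pdf:
        return [box_size_mm(page.mediabox) for page in pdf.pages]


# Hypothetical usage with the two attached files:
#   for name in ("in.pdf", "out.pdf"):
#       print(name, page_sizes_mm(name))
```

If the report holds, the first call would be expected to list pages of about 148 × 210 mm and the second pages of about 115 × 200 mm.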

@femifrak
Author

femifrak commented Jul 24, 2024

The behaviour that may change the output page size first appeared in version 16.1 (16.0.4 does not show this bug).

This is true for both renderers (sandwich and hocr).

in.pdf is a simple file without text, but the same effect occurs in PDFs with text: pages are cut off in the middle of the text.
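
For anyone trying to reproduce this with both renderers, the command lines could be generated as in the sketch below. It is an illustration only: `repro_commands` and the output file names are hypothetical, and the sketch only builds the argument lists using ocrmypdf's `--force-ocr` and `--pdf-renderer` options; it does not run anything itself.

```python
def repro_commands(src="in.pdf"):
    """Build one ocrmypdf --force-ocr command line per PDF renderer."""
    commands = []
    for renderer in ("hocr", "sandwich"):
        out = f"out-{renderer}.pdf"  # hypothetical output name, one per renderer
        commands.append(
            ["ocrmypdf", "--force-ocr", "--pdf-renderer", renderer, src, out]
        )
    return commands


# To actually run them (requires ocrmypdf on PATH):
#   import subprocess
#   for cmd in repro_commands():
#       subprocess.run(cmd, check=True)
```

Running the same commands under 16.0.4 and 16.1 or later, then comparing the page sizes of the outputs, should show whether the regression appears in a given environment.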

@Jmuccigr
Contributor

Is it perhaps this bug? #1181

@femifrak
Author

In my case it was sufficient to use --redo-ocr instead of --force-ocr. --redo-ocr does not have that issue.
