Document ability to remove all OCR

ocrmypdf · Jan 24, 2024 · cca04fd
1 parent 75bf8e4
commit cca04fd
Showing 1 changed file with 14 additions and 0 deletions.
diff --git a/docs/cookbook.rst b/docs/cookbook.rst
@@ -245,6 +245,20 @@ if all you want to is to apply image processing or PDF/A conversion.
     the case. Use ``--tesseract-non-ocr-timeout`` to control the timeout
     for non-OCR operations, if needed.
 
+Remove all text or OCR from my PDF
+----------------------------------
+
+This is getting ridiculous, but OCRmyPDF can completely strip all textual
+information from a PDF and reconstruct it as a "bag of images" PDF.
+
+.. code-block:: bash
+
+    ocrmypdf --tesseract-timeout 0 --force-ocr input.pdf output.pdf
+
+Why would you want to do this? Perhaps you have a PDF where OCR
+fails to produce useful results, and you just want to get rid of all OCR information.
+This command also removes OCR generated by third-party tools.
+
 Optimize images without performing OCR
 --------------------------------------