dotnest/VNocr


VNocr

OCR script for Visual Novels/general text on images

How it works (requirements/installation)

  1. xfce4-screenshooter
    takes a screenshot of a selected region and saves it to ~/ocr.png
    swap in your own screenshot program if it doesn't work for you
  2. tesseract
    processes that image and writes the text it finds to ~/ocr.txt
    install tesseract (I used sudo apt install tesseract-ocr on Xubuntu)
    then download the trained models for the language you need and put them in the tessdata directory (it was /usr/share/tesseract-ocr/4.00/tessdata for me)
  3. tr
    cleans up the text output by tesseract
  4. xclip
    copies the text to the clipboard, where it is picked up by Yomichan,
    which opens a popup with that text where you can look up word definitions
    make sure to check "Enable native popups when copying Japanese text" in the Yomichan options
    (with that option on, be careful when copying large texts that may contain kana/kanji)
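
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the actual ocr_script: the function names `clean_ocr_text` and `ocr_region`, the default language `jpn`, and the exact set of characters that `tr` deletes (spaces, newlines and the form feed tesseract may append) are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the pipeline described above; not the real ocr_script.

# Step 3: clean up tesseract output.
# Assumption: "cleaning up" means deleting spaces, newlines and form feeds,
# which is usually what you want for Japanese text.
clean_ocr_text() {
    tr -d ' \n\f'
}

# Steps 1-4 chained together. Optional first argument: tesseract language code.
ocr_region() {
    local img="$HOME/ocr.png"
    local base="$HOME/ocr"      # tesseract appends ".txt" to this base name
    local lang="${1:-jpn}"      # assumed default; use the model you installed

    # 1. Let the user select a region and save it to ~/ocr.png
    xfce4-screenshooter -r -s "$img" || return 1

    # 2. Run OCR; the result is written to ~/ocr.txt
    tesseract "$img" "$base" -l "$lang" || return 1

    # 3 + 4. Clean the text and put it on the clipboard
    clean_ocr_text < "$base.txt" | xclip -selection clipboard
}
```

Calling `ocr_region` (or `ocr_region eng` for English) from a hotkey-bound script would then run the whole chain. If you use a different screenshot program, only the first command inside `ocr_region` needs to change, as long as it still writes the image to the same path.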

Usage

  • Bind ocr_script to a hotkey
  • Press the hotkey
  • Select a region with text
  • The text extracted by OCR is copied to the clipboard

Additional links

Tesseract Docs - Improving Recognition Quality
StackOverflow - Remove background text and noise from an image using image processing with OpenCV
StackOverflow - How to remove background noise in image without damaging text?
StackOverflow - Background image cleaning for OCR
