LanyOCR

A general-purpose OCR tool that detects and recognizes English text in images, built on a combination of EasyOCR and PaddleOCR.

LanyOCR automatically merges text boxes into lines, even for rotated text.
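
To illustrate what merging text boxes into lines involves, here is a minimal sketch for the simple axis-aligned case. It is not LanyOCR's merger: the `Box` type, the `merge_into_lines` function, and both thresholds are hypothetical, and rotated text would need boxes to be compared along their own text direction rather than along the image axes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    """Axis-aligned text box (hypothetical type, not part of LanyOCR)."""
    x1: float
    y1: float
    x2: float
    y2: float
    text: str = ""


def vertical_overlap(a: Box, b: Box) -> float:
    """Fraction of the shorter box's height that the two boxes share."""
    shared = min(a.y2, b.y2) - max(a.y1, b.y1)
    shorter = min(a.y2 - a.y1, b.y2 - b.y1)
    return max(0.0, shared) / shorter if shorter > 0 else 0.0


def merge_into_lines(boxes: List[Box], min_overlap: float = 0.5,
                     max_gap_ratio: float = 1.0) -> List[Box]:
    """Greedily join boxes that sit on the same row and are close together."""
    lines: List[Box] = []
    # Visit boxes left to right so each line grows in reading order.
    for box in sorted(boxes, key=lambda b: b.x1):
        for line in lines:
            gap = box.x1 - line.x2          # horizontal distance to the line's end
            height = line.y2 - line.y1      # used to scale the allowed gap
            same_row = vertical_overlap(line, box) >= min_overlap
            if same_row and gap <= max_gap_ratio * height:
                # Extend the line so it covers the new box.
                line.x2 = max(line.x2, box.x2)
                line.y1 = min(line.y1, box.y1)
                line.y2 = max(line.y2, box.y2)
                line.text = f"{line.text} {box.text}".strip()
                break
        else:
            # No existing line accepted the box, so it starts a new one.
            lines.append(Box(box.x1, box.y1, box.x2, box.y2, box.text))
    return sorted(lines, key=lambda line: line.y1)
```

Calling `merge_into_lines` on two neighbouring word boxes from the same row and one box further down the page yields two line boxes, ordered top to bottom.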

Getting Started

Install dependencies

pip install lanyocr

Run example

PYTHONPATH=. python detect.py --merge_rotated_boxes true --merge_vertical true --image_path images/example1.jpg

Faster version (slightly less accurate)

PYTHONPATH=. python detect.py --merge_rotated_boxes true --merge_vertical true --merge_boxes_inference true --image_path images/example1.jpg

Switch to a different recognizer

PYTHONPATH=. python detect.py --merge_rotated_boxes true --merge_vertical true --recognizer_name paddleocr_en_mobile --image_path images/example1.jpg

Recognize other languages

PYTHONPATH=. python detect.py --merge_rotated_boxes true --merge_vertical true --recognizer_name paddleocr_french_mobile --image_path images/french_example1.jpg

The output image is written to outputs/output.jpg.

Supported Languages

  • English: paddleocr_en_server, paddleocr_en_mobile
  • French: paddleocr_french_mobile
  • Latin: paddleocr_latin_mobile
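
To compare recognizers on the same image, the example commands can be generated from a script. The sketch below only assembles flags that already appear in this README. The helper names `build_detect_command` and `run_all` are hypothetical, and the script assumes it is run from the repository root, where `detect.py` lives.

```python
import os
import subprocess
import sys
from typing import List

# Recognizer names as listed under Supported Languages.
RECOGNIZERS = [
    "paddleocr_en_server",
    "paddleocr_en_mobile",
    "paddleocr_french_mobile",
    "paddleocr_latin_mobile",
]


def build_detect_command(image_path: str, recognizer_name: str) -> List[str]:
    """Assemble a detect.py invocation using the flags from the examples."""
    return [
        sys.executable, "detect.py",
        "--merge_rotated_boxes", "true",
        "--merge_vertical", "true",
        "--recognizer_name", recognizer_name,
        "--image_path", image_path,
    ]


def run_all(image_path: str) -> None:
    """Run detect.py once per recognizer (hypothetical convenience helper)."""
    env = dict(os.environ, PYTHONPATH=".")  # mirrors the PYTHONPATH=. prefix
    for name in RECOGNIZERS:
        print("Running", name)
        subprocess.run(build_detect_command(image_path, name), env=env, check=True)


# Example: run_all("images/example1.jpg")
```

Because every run writes to the same output path, copy the output image aside between runs if you want to keep each result.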

Note: OpenCV cannot render some Unicode characters correctly. If the output image shows wrong characters, read the recognized text lines from the console log instead.
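
One way to handle this in your own visualization code is to check whether a recognized line is plain printable ASCII before drawing it, and to send it to the console otherwise. This is a sketch under the assumption that drawing goes through `cv2.putText`, whose built-in Hershey fonts cover little beyond printable ASCII. The helper names are hypothetical and not part of LanyOCR.

```python
from typing import Iterable, List, Tuple


def is_printable_ascii(text: str) -> bool:
    """True if every character lies in the printable ASCII range (space to '~')."""
    return all(" " <= ch <= "~" for ch in text)


def split_for_display(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate lines that are safe to draw from those better printed to the console."""
    drawable: List[str] = []
    console_only: List[str] = []
    for line in lines:
        if is_printable_ascii(line):
            drawable.append(line)
        else:
            console_only.append(line)
    return drawable, console_only
```

The check is deliberately conservative: it may route a line to the console that a particular font could have drawn, but it never draws one that would come out garbled.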

Validate accuracy

Download the ICDAR 2015 dataset

bash datasets/download_icdar2015.sh

Run the benchmark

python benchmark.py
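
For readers unfamiliar with how text detection is usually scored, the sketch below computes intersection over union (IoU) between predicted and ground-truth boxes and counts a prediction as correct at an IoU of 0.5 or more, a threshold commonly used for ICDAR 2015 detection. It is an illustration only: it uses axis-aligned boxes for brevity (ICDAR 2015 annotations are quadrilaterals), the function names are hypothetical, and it is not necessarily what `benchmark.py` measures.

```python
from typing import List, Set, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0])
    inter_h = min(a[3], b[3]) - max(a[1], b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def precision_recall(pred: List[Box], truth: List[Box],
                     threshold: float = 0.5) -> Tuple[float, float]:
    """Match each prediction to at most one unused ground-truth box."""
    matched: Set[int] = set()
    hits = 0
    for p in pred:
        best_index, best_score = None, threshold
        for i, t in enumerate(truth):
            if i in matched:
                continue  # each ground-truth box can be claimed only once
            score = iou(p, t)
            if score >= best_score:
                best_index, best_score = i, score
        if best_index is not None:
            matched.add(best_index)
            hits += 1
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(truth) if truth else 0.0
    return precision, recall
```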

Online API

You can try LanyOCR for free on RapidAPI.

To Do

[x] Abstract Class/Interface for each component
    [x] LanyOcrDetector: outputs locations of text boxes
    [x] LanyOcrMerger: merges text boxes into text lines
    [x] LanyOcrRecognizer: converts text boxes/lines into text
    [x] LanyOcrAngleClassifier: estimates the angle of a text box/line

[ ] Multi-language support
    [x] French
    [x] Latin
    [ ] German

[ ] Inference using multiple models to improve accuracy
    [ ] Add interface to support voting policy

[ ] Expose flags to configure each component in OCR pipeline
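
The four component interfaces ticked off in this list could look roughly like the abstract base classes below. Only the class names come from this README. Every method name and signature, the `TextBox` type, and the `run_pipeline` function are assumptions made for illustration, and the real LanyOCR classes may differ.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any, List, Tuple

Point = Tuple[float, float]


@dataclass
class TextBox:
    """Hypothetical container: four corner points plus recognized text."""
    corners: List[Point]
    text: str = ""


class LanyOcrDetector(ABC):
    @abstractmethod
    def detect(self, image: Any) -> List[TextBox]:
        """Output the locations of text boxes in the image."""


class LanyOcrMerger(ABC):
    @abstractmethod
    def merge(self, boxes: List[TextBox]) -> List[TextBox]:
        """Merge text boxes into text lines."""


class LanyOcrAngleClassifier(ABC):
    @abstractmethod
    def estimate_angle(self, image: Any, box: TextBox) -> float:
        """Estimate the angle of a text box or line, in degrees."""


class LanyOcrRecognizer(ABC):
    @abstractmethod
    def recognize(self, image: Any, box: TextBox, angle_degrees: float) -> str:
        """Convert a text box or line into text."""


def run_pipeline(image: Any, detector: LanyOcrDetector, merger: LanyOcrMerger,
                 angle_classifier: LanyOcrAngleClassifier,
                 recognizer: LanyOcrRecognizer) -> List[TextBox]:
    """Assumed ordering: detect, merge, estimate angle, then recognize."""
    lines = merger.merge(detector.detect(image))
    for line in lines:
        angle = angle_classifier.estimate_angle(image, line)
        line.text = recognizer.recognize(image, line, angle)
    return lines
```

Keeping each stage behind its own interface means a component, for example the recognizer, can be swapped without touching the other stages.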

Known issues

[ ] Visualization step: some small text is drawn in the wrong direction

License

This project is licensed under the MIT License.

Credits

Special thanks to the authors and developers of the EasyOCR and PaddleOCR projects.