OCR in the Wild: Cloud and Edge Implementations

dacquaviva/OCR

Overview

This repository demonstrates implementations of Optical Character Recognition (OCR) for text in the wild, designed for both cloud and edge deployment scenarios. Our goal is to provide flexible, efficient OCR solutions that can be adapted to various use cases and hardware constraints.

OCR Process

Our OCR system employs a two-step process for accurate text recognition in natural scenes:

  1. Text Detection: This step identifies and locates all instances of text within an image. It produces bounding boxes around detected text areas.

  2. Text Recognition: After detection, the identified text regions are cropped and batched. These batches are then passed through a recognition model that converts the image of text into machine-readable characters.
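The batching step can be sketched as follows. This is a minimal illustration, not the repository's code: the `Box` type, the 48-pixel target height, and the aspect-ratio grouping strategy are all assumptions chosen to show why batching crops of similar shape keeps padding small.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned text box in pixel coordinates (hypothetical format)."""
    x1: int
    y1: int
    x2: int
    y2: int

    @property
    def aspect_ratio(self) -> float:
        # Width divided by height; guard against zero-height boxes.
        return (self.x2 - self.x1) / max(self.y2 - self.y1, 1)


def batch_boxes(boxes: List[Box], batch_size: int) -> List[List[Box]]:
    """Group boxes of similar aspect ratio so each batch needs little padding."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ordered = sorted(boxes, key=lambda b: b.aspect_ratio)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


def padded_width(batch: List[Box], target_height: int = 48) -> int:
    """Width each crop is padded to once resized to target_height."""
    return max(round(target_height * b.aspect_ratio) for b in batch)


# Four detected regions with aspect ratios 10, 2, 6 and 1.
boxes = [Box(0, 0, 200, 20), Box(10, 40, 58, 64), Box(5, 80, 125, 100), Box(0, 120, 30, 150)]
batches = batch_boxes(boxes, batch_size=2)
widths = [padded_width(batch) for batch in batches]
```

Sorting before chunking puts the two narrow crops in one batch and the two wide crops in another, so the narrow batch is not padded out to the width of the widest region in the image.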

Repository Structure

This repository is divided into two main sections:

  1. Cloud OCR: A scalable, high-performance OCR system designed for cloud deployment.
  2. Edge OCR: A lightweight, efficient OCR model suitable for deployment on edge devices.

Cloud OCR Implementation

Our cloud-based OCR system uses pre-trained PaddleOCR models (from the PaddlePaddle project), converted to ONNX format for improved performance and compatibility. The models are served by Triton Inference Server, and the pipeline is exposed through a Flask API.

Key features of the cloud implementation:

  • Uses state-of-the-art PaddleOCR models for text detection and recognition
  • ONNX-format models for cross-platform compatibility
  • Triton Inference Server for scalable model serving
  • Flask API for easy integration with various applications
  • Designed for deployment in cloud environments
  • Efficient batching of detected text regions for recognition

For more details, see the Cloud OCR README.
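As a rough illustration of how a batch of crops might reach Triton, the sketch below builds the path and JSON body for Triton's HTTP inference endpoint, which follows the KServe v2 protocol. The model name `text_recognition`, the input name `x`, and the tensor shape are hypothetical; the real names come from the model configuration in the Triton model repository, and the Flask service in this repository may well use a client library instead of raw JSON.

```python
import json
from typing import List, Tuple


def build_infer_request(
    model_name: str,
    input_name: str,
    batch: List[List[float]],
    shape: List[int],
) -> Tuple[str, str]:
    """Return (URL path, JSON body) for a KServe v2 inference call."""
    # The protocol accepts tensor data as a flat, row-major list.
    flat = [value for item in batch for value in item]
    expected = 1
    for dim in shape:
        expected *= dim
    if len(flat) != expected:
        raise ValueError(f"shape {shape} needs {expected} values, got {len(flat)}")
    body = {
        "inputs": [
            {"name": input_name, "shape": shape, "datatype": "FP32", "data": flat}
        ]
    }
    return f"/v2/models/{model_name}/infer", json.dumps(body)


# Two tiny stand-in "crops" of 1 x 2 x 3 values each (hypothetical sizes).
path, body = build_infer_request(
    "text_recognition", "x", [[0.0] * 6, [1.0] * 6], shape=[2, 1, 2, 3]
)
```

The body would be sent with an HTTP POST to the Triton host at `path`. Putting all crops in one request lets the server run them as a single batch.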

Edge OCR Implementation

Our edge-based OCR system is designed to run efficiently on resource-constrained devices. It uses custom, lightweight model architectures for both detection and recognition that prioritize speed and low memory footprint without significantly compromising accuracy.

Key features of the edge implementation:

  • Custom CNN architectures optimized for edge devices
  • Uses only convolutional operations for wide hardware compatibility
  • Text detection model designed for quick region proposals
  • Text recognition model trained with CTC loss for efficient sequence learning
  • Configurable for different input sizes and character sets
  • Easily convertible to ONNX and TensorRT for further optimization
  • Optimized pipeline for efficient processing of detected text regions

For more details, see the Edge OCR README.
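A recognition model trained with CTC loss emits one score vector per time step, and those vectors have to be collapsed into a string at inference time. The sketch below shows greedy CTC decoding under stated assumptions: class index 0 is the blank symbol and classes 1..N map to the characters of `charset` in order. The decoder actually used in this repository may differ (for example, beam search or a different blank index).

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank class


def ctc_greedy_decode(logits: Sequence[Sequence[float]], charset: str) -> str:
    """Decode one sequence: argmax per time step, merge repeats, drop blanks."""
    decoded: List[str] = []
    previous = BLANK
    for step in logits:
        best = max(range(len(step)), key=step.__getitem__)
        # Emit a character only when it is not blank and not a repeat
        # of the class chosen at the previous time step.
        if best != BLANK and best != previous:
            decoded.append(charset[best - 1])
        previous = best
    return "".join(decoded)


# Per-step scores over (blank, 'a', 'b'); the argmax path is 0,1,1,0,1,2,2.
scores = [
    [0.9, 0.05, 0.05],
    [0.1, 0.8, 0.1],
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.1, 0.1, 0.8],
]
text = ctc_greedy_decode(scores, charset="ab")  # "aab"
```

The blank between the two runs of `a` is what allows the doubled letter to survive the merge step.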

Comparison

| Feature | Cloud OCR | Edge OCR |
| --- | --- | --- |
| Deployment Target | High-performance cloud servers | Resource-constrained edge devices |
| Model Source | Pre-trained PaddleOCR models | Custom-trained lightweight models (widely compatible ops) |
| Text Detection | Advanced detection model | YOLOv8 |
| Text Recognition | Advanced LSTM-based model | Compact CNN with CTC loss |
| Inference Server | Triton Inference Server | Direct inference |
| Primary Advantages | High accuracy, scalability, centralized deployment | Low latency, minimal resource usage, privacy, offline operation, low cost |

Getting Started

To get started with either the cloud or edge implementation:

  1. Clone this repository:
    git clone https://github.com/dacquaviva/OCR.git
    
  2. Navigate to either the cloud or edge directory:
    cd OCR/cloud
    # or
    cd OCR/edge
    
  3. Follow the setup instructions in the respective README files.
