easyocr.Reader.readtext(...) in rare occasions returns bounding box with float coordinates, and not int #1307

Open
ScheiBig opened this issue Sep 12, 2024 · 2 comments

Comments

@ScheiBig

I'm using EasyOCR to build a very simple banknote recognizer for a university project, with Python and OpenCV. If I understand the provided examples correctly, the code below:

import cv2
import numpy as np
import easyocr

cap = cv2.VideoCapture(...)
reader = easyocr.Reader(["en"])
did_read, frame = cap.read()

# some frame preprocessing if necessary - cropping to area of interest, adding filters and thresholding
processed_frame = frame # placeholder for the preprocessed frame

read_txts = reader.readtext(processed_frame)

should produce a list of results, each of which I type-hint in my code as:

eOcr_res = tuple[
	tuple[tuple[int, int], tuple[int, int], tuple[int, int], tuple[int, int]], # bounding box
	str, # label
	float # confidence (0.0 .. 1.0)
]

Of course, the actual result uses lists instead of tuples for the bounding box. Tuples allow slightly better type-checking, since a list cannot be type-hinted with a fixed length, but this doesn't really matter.

What does matter is that it should be possible to use this output directly to draw the result on the image using OpenCV:

for read_txt in read_txts:
	box, txt, conf = read_txt
	box = np.array(box)
	cv2.putText(
		frame,
		txt,
		tuple(box.max(axis= 0)),
		cv2.FONT_HERSHEY_SIMPLEX, # font face, required before the font scale
		0.75,
		(0, 255, 0),
		1
	)
	cv2.drawContours(
		frame,
		[box],
		-1,
		(0, 255, 0),
		2
	)

However, on some rare occasions the snippet above throws in cv2.drawContours with the error message:
cv2.error: OpenCV(4.10.0) D:\a\opencv-python\opencv-python\opencv\modules\imgproc\src\drawing.cpp:2504: error: (-215:Assertion failed) npoints > 0 in function 'cv::drawContours'

On closer inspection in the debugger, it seems that when the error occurs, reader.readtext(...) returns a result in which the bounding box points are of type float rather than the expected int (int points are returned more than 99% of the time):
[image: debugger screenshot]

Of course, this can be fixed in user code, which for the snippet above would be:

box = np.array(box, dtype= np.int_)

However, I feel that either the examples in this repository's readme.md and on the site https://www.jaided.ai/easyocr/tutorial/ are misleading, since they suggest that only int values can be expected in the bounding box component of the output, or there is some rare bug that results in non-integer output.
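For anyone hitting the same error, here is a minimal sketch of a slightly more defensive workaround. It rounds every corner to a plain Python int instead of truncating, so a coordinate like 110.7 becomes 111 rather than 110. The helper name `to_int_box` and the sample box are hypothetical and only for illustration; nothing here is part of EasyOCR.

```python
def to_int_box(box):
    """Round each (x, y) corner of a bounding box to plain Python ints.

    Hypothetical helper, not part of EasyOCR. Accepts any sequence of
    four (x, y) pairs whose values are ints or floats.
    """
    return [[int(round(float(x))), int(round(float(y)))] for x, y in box]


# Sample box mixing int and float corners (made-up values for illustration).
mixed_box = [[10, 20], [110.7, 20.2], [110.7, 60.9], [10, 60]]
int_box = to_int_box(mixed_box)
```

The rounded box could then be wrapped as np.array(int_box, dtype=np.int32) before drawing. As far as I know, cv2.drawContours expects 32-bit integer points, and a fixed-width dtype avoids depending on what np.int_ maps to on a given platform and NumPy version.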

@ScheiBig
Author

Sorry, I forgot to specify: I'm using EasyOCR version 1.7.1 with Python 3.12.6 on Windows 11.

@daniellovera

Bounding boxes returned as polygons can have float coordinates.

From the API documentation - "Return horizontal_list, free_list - horizontal_list is a list of regtangular text boxes. The format is [x_min, x_max, y_min, y_max]. free_list is a list of free-form text boxes. The format is [[x1,y1],[x2,y2],[x3,y3],[x4,y4]]."
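To make the two formats in that quote concrete, here is a small sketch that converts a rectangular box in [x_min, x_max, y_min, y_max] form into the four-corner form. The helper name `rect_to_corners` is hypothetical, and the corner order (top-left, top-right, bottom-right, bottom-left) is an assumption on my part, not something the quoted documentation states.

```python
def rect_to_corners(rect):
    """Convert [x_min, x_max, y_min, y_max] into four [x, y] corners.

    Hypothetical helper, not part of EasyOCR. The corner order
    (top-left, top-right, bottom-right, bottom-left) is an assumption.
    """
    x_min, x_max, y_min, y_max = rect
    return [[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]]


# Made-up rectangular box for illustration.
horizontal_box = [10, 110, 20, 60]
corners = rect_to_corners(horizontal_box)
```

Since the inputs here are ints, the corners stay ints. A free-form box has no such guarantee, which would be consistent with the occasional float coordinates reported above.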
