Daemon type OCR #242

marcincichocki · 2021-10-13T17:55:08Z

When I started, I left out daemon type OCR for a few reasons:

However, since the other features are completed, I thought I would revisit this idea. Here are my thoughts after a few hours of testing:

Language data is available for free on GitHub (Apache 2.0 license).
Loading a language takes a while (300ms), so the question is how to distribute it and how to load it so the app stays responsive.
Fragment preprocessing got a big upgrade: by extracting the blue channel from the source I was able to clear all the noise, borders and other junk that would just clutter the image. This must be implemented for the daemon fragment.
I also noticed that sharp is quite slow (150ms); I might need to see how much time it takes to process images in production.
Text recognition is surprisingly easy: most data from the registry was easy to decode, and the text seems correct (I can't validate it for some exotic languages, but it looks good).
It's super fast, and with worker support the performance impact will be negligible.
In the event of failure, the daemon could be marked as UNKNOWN or UNDETECTED.
The daemon type would always be mapped to its id.
I need a list of every daemon and screenshots of them in every language so I can be sure they work.
I also need to know what to do when the wrong OCR language is selected.
With the data in place, there has to be some use for it. As a first step I could add it to the viewer; later on maybe create a sequence sort based on type.
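
To make the blue channel idea concrete, here is a minimal sketch that works on a raw RGBA pixel buffer without any library. The function name `extractBlueChannel` and the threshold step are assumptions for illustration, not the code used in the app; sharp also offers `extractChannel('blue')`, which may be the simpler route in practice.

```typescript
// Hypothetical helper (not from the project): keeps only the blue channel
// of a raw RGBA buffer and thresholds it into a black/white mask.
// The threshold value is an assumption and would need tuning on real screenshots.
function extractBlueChannel(rgba: Uint8Array, threshold = 128): Uint8Array {
  if (rgba.length % 4 !== 0) {
    throw new Error('Expected RGBA data (4 bytes per pixel)');
  }

  const mask = new Uint8Array(rgba.length / 4);

  for (let i = 0; i < mask.length; i++) {
    // Byte layout per pixel is R, G, B, A, so blue sits at offset 2.
    const blue = rgba[i * 4 + 2] ?? 0;
    mask[i] = blue >= threshold ? 255 : 0;
  }

  return mask;
}

// Usage: a pure red pixel is dropped, a mostly blue pixel is kept.
const pixels = new Uint8Array([255, 0, 0, 255, 0, 0, 200, 255]);
const mask = extractBlueChannel(pixels);
```

The output is a single-channel image with one byte per pixel, which is the kind of clean input an OCR engine tends to handle well.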

Overall there is lots to do.
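
For the id mapping and the failure case, a rough sketch could look like this. The registry contents, the id strings and the `resolveDaemonId` name are all hypothetical; the real registry would come from the game data for every supported language.

```typescript
const UNKNOWN = 'UNKNOWN';

// Hypothetical registry: OCR language code -> (localized daemon name -> daemon id).
// Only two illustrative English entries are listed here.
const registry = new Map<string, Map<string, string>>([
  [
    'eng',
    new Map([
      ['ICEPICK', 'DAEMON_ICEPICK'],
      ['MASS VULNERABILITY', 'DAEMON_MASS_VULNERABILITY'],
    ]),
  ],
]);

// OCR output often carries stray whitespace and inconsistent casing,
// so normalize it before the lookup.
function normalize(text: string): string {
  return text.trim().toUpperCase().replace(/\s+/g, ' ');
}

// Resolves recognized text to a daemon id, falling back to UNKNOWN when the
// language has no registry or the text matches no known daemon name.
function resolveDaemonId(ocrText: string, lang: string): string {
  return registry.get(lang)?.get(normalize(ocrText)) ?? UNKNOWN;
}
```

With this shape, a wrong OCR language would most likely surface as a run of UNKNOWN results, which could be one signal for prompting the user to check the language setting.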

marcincichocki added this to the v2.3.0 milestone Oct 13, 2021

marcincichocki closed this as completed in 3683dac Oct 21, 2021
