A synthetic data generator for text recognition.
A fork from https://github.com/Belval/TextRecognitionDataGenerator - with addtional Thai script support.
Generating text image samples to train an OCR software. Now supporting non-latin text! For a more thorough tutorial see the official documentation.
I use Archlinux so I cannot tell if it works on Windows yet.
Python 3.X
OpenCV 4 (Works with 3.2, probably works with 2.4)
Pillow
Numpy
Requests
BeautifulSoup
tqdm
You can simply use pip install -r requirements.txt
too.
- Add
--font
to use only one font for all the generated images (Thank you @JulienCoutault!) - Add
--fit
and--margins
for finer layout control - Change the text orientation using the
-or
parameter - Change the space width using the
-sw
parameter - Specify text color range using
-tc '#000000,#FFFFFF'
, please note that the quotes are necessary - Explicit alignment when using
-al
with fixed width (0: Left, 1: Center, 2: Right) - Add support for Simplified and Traditional Chinese
python run.py -w 5 -f 64
You get 1,000 randomly generated images with random text on them like:
What if you want random skewing? Add -k
and -rk
(python run.py -w 5 -f 64 -k 5 -rk
)
You can also add distorsion to the generated text with -d
and -do
But scanned document usually aren't that clear are they? Add -bl
and -rbl
to get gaussian blur on the generated image with user-defined radius (here 0, 1, 2, 4):
Maybe you want another background? Add -b
to define one of the three available backgrounds: gaussian noise (0), plain white (1), quasicrystal (2) or picture (3).
When using picture background (3). A picture from the pictures/ folder will be randomly selected and the text will be written on it.
Or maybe you are working on an OCR for handwritten text? Add -hw
! (Experimental)
It uses a Tensorflow model trained using this excellent project by Grzego.
The project does not require TensorFlow to run if you aren't using this feature
The text is chosen at random in a dictionary file (that can be found in the dicts folder) and drawn on a white background made with Gaussian noise. The resulting image is saved as [text]_[index].jpg
There are a lot of parameters that you can tune to get the results you want, therefore I recommand checking out python run.py -h
for more informations.
It is simple! Just do python run.py -l cn -c 1000 -w 5
!
Unfortunately I do not speak Chinese so you may have to edit texts/cn.txt
to include some meaningful words instead of random glyphs.
Here are examples of what I could make with it:
Traditional:
Simplified:
Yes, the script picks a font at random from the fonts directory.
fonts/latin | English, French, Spanish, German |
fonts/cn | Chinese |
Simply add / remove fonts until you get the desired output.
If you want to add a new non-latin language, the amount of work is minimal.
- Create a new folder with your language two-letters code
- Add a .ttf font in it
- Edit
run.py
to add an if statement inload_fonts()
- Add a text file in
dicts
with the same two-letters code - Run the tool as you normally would but add
-l
with your two-letters code
It only supports .ttf for now.
- Intel Core i7-4710HQ @ 2.50Ghz + SSD (-c 1000 -w 1)
-t 1
: 363 img/s-t 2
: 694 img/s-t 4
: 1300 img/s-t 8
: 1500 img/s
- AMD Ryzen 7 1700 @ 4.0Ghz + SSD (-c 1000 -w 1)
-t 1
: 558 img/s-t 2
: 1045 img/s-t 4
: 2107 img/s-t 8
: 3297 img/s
- Create an issue describing the feature you'll be working on
- Code said feature
- Create a pull request
If anything is missing, unclear, or simply not working, open an issue on the repository.
- Better background generation
- Better handwritten text generation
- More customization parameters (mostly regarding background)
- -b 4 use color background ใช้พื้นหลังสี
- -b 5 random (0-4) สุ่ม (0-4) (0: Gaussian Noise, 1: Plain white, 2: Quasicrystal, 3: Pictures, 4: Random color)
- -bm rndInList random color in colorList สุ่มสีอักษรในรายการ //rnd: Random RGB 0 to 255, rndInList:
- -bm rnd สุ่มสีอักษร RGB (0 ถึง 255) *เมื่อใช้ Background color ควรใช้ -bm
- -tc rndInList สุ่มสีอักษรใน list //rnd: Random RGB 0 to 255, rndInList: random color in colorList
- -tc rnd คือ Random สีอักษร RGB (0 ถึง 255)
- -na 3 คือ โหมดตั้งชื่อไฟล์เป็นเลข และเขียนไฟล์ Report.csv ให้แสดงรายละเอียดแต่ละภาพ
- -d 3 random distorsion (0-3) (0: None (Default), 1: Sine wave, 2: Cosine wave)
- -rbs Ture สุ่ม random_blur และ random_skew
docker-compose up --build
Result will be in ./result