Purpose
Create training or validation data for machine learning algorithms that involve handwriting, such as OCR.
Some applications must recognize not just single digits or letters, but whole numbers and words, and
sometimes they need a dataset of specific words or ranges of numbers. This simple script generates such
a dataset.
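As a rough illustration of the idea (a sketch, not the actual code in simulation.py), a multi-character sample can be built by placing single-character images side by side. The example below uses tiny hand-made 2x2 "glyphs" in place of real handwritten digit images; the function name `compose_word`, the `glyphs` dictionary, and the `gap` parameter are all hypothetical and chosen only for this illustration.

```python
def compose_word(text, glyphs, gap=1, background=0):
    """Place the glyph image for each character of `text` side by side.

    Each glyph is a list of rows of pixel values, and all glyphs must
    share the same height. `gap` background columns separate neighbours.
    """
    if not text:
        raise ValueError("text must contain at least one character")
    images = [glyphs[ch] for ch in text]
    height = len(images[0])
    if any(len(img) != height for img in images):
        raise ValueError("all glyphs must have the same height")

    rows = []
    for r in range(height):
        row = []
        for i, img in enumerate(images):
            if i > 0:
                # Insert blank columns between consecutive characters.
                row.extend([background] * gap)
            row.extend(img[r])
        rows.append(row)
    return rows


# Toy 2x2 glyphs standing in for real handwritten digit images.
glyphs = {
    "1": [[0, 9], [0, 9]],
    "7": [[9, 9], [0, 9]],
}
image = compose_word("17", glyphs)
for row in image:
    print(row)
```

With real data, each glyph would typically be an image array sampled at random from the examples of the matching class, so the same label yields a different image every time.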
Usage
To generate clean images of handwritten dates (day/month/year) from the MNIST dataset, try:
python simulation.py --dir ~/Desktop/mnist --data date --num 10
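For readers curious how date labels might be produced before any images are assembled, here is a minimal sketch under stated assumptions. It is not taken from simulation.py: the helper `random_date_label`, the DD/MM/YYYY format with "/" separators, and the 1900 to 2099 range are all assumptions made for illustration. It samples valid calendar dates and extracts the digit sequence that would need to be rendered.

```python
import random
from datetime import date, timedelta


def random_date_label(rng, start=date(1900, 1, 1), end=date(2099, 12, 31)):
    """Return a random valid calendar date formatted as DD/MM/YYYY.

    The format and the default range are assumptions for this sketch.
    """
    span_days = (end - start).days
    day = start + timedelta(days=rng.randint(0, span_days))
    return day.strftime("%d/%m/%Y")


rng = random.Random(0)  # fixed seed so runs are repeatable
labels = [random_date_label(rng) for _ in range(10)]

# Digits to look up in a digit dataset, with separators dropped.
digit_sequences = [[int(c) for c in label if c.isdigit()] for label in labels]

for label, digits in zip(labels, digit_sequences):
    print(label, digits)
```

Sampling from real calendar dates, rather than drawing day, month, and year independently, avoids impossible labels such as 31/02.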
For noisier images, try:
python simulation.py --dir ~/Desktop/mnist --speckle_noise --resize --underline_noise --data date --num 10
To generate first and last names, try:
python simulation.py --dir ~/Desktop/mnist --speckle_noise --resize --underline_noise --spacing 0.7 --data name --num 10
For an explanation of the command-line arguments, run:
python simulation.py -h