Create training or validation data for machine learning algorithms that involve handwriting, such as OCR.
Some applications involve recognizing not just single digits or letters, but numbers and words. Sometimes
a dataset of specific words or ranges of numbers is needed. This simple script can be used to generate such
a dataset.
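As a rough illustration of the idea (not the code in simulation.py), a multi-digit image can be built by picking one sample per digit and placing the samples side by side. The function name `compose_number` and the random stand-in arrays are hypothetical; real MNIST images would take the place of `digit_images` and `labels`.

```python
import numpy as np


def compose_number(digit_images, labels, number, rng):
    """Build one image of `number` from randomly chosen samples of each digit."""
    tiles = []
    for ch in str(number):
        # Indices of all samples whose label matches the current digit.
        candidates = np.flatnonzero(labels == int(ch))
        tiles.append(digit_images[rng.choice(candidates)])
    # Place the chosen 28x28 tiles next to each other, left to right.
    return np.hstack(tiles)


# Stand-in data (assumption): three random 28x28 "samples" per digit 0-9.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(10), 3)
digit_images = rng.integers(0, 256, size=(labels.size, 28, 28), dtype=np.uint8)

image = compose_number(digit_images, labels, 2024, rng)
print(image.shape)  # (28, 112)
```

Each generated image can then be saved together with its label (here the string "2024") to form a training or validation example.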
To generate clean images of handwritten dates (day/month/year) from the MNIST dataset, try
python simulation.py --dir ~/Desktop/mnist --data date --num 10
For noisier images, try
python simulation.py --dir ~/Desktop/mnist --speckle_noise --resize --underline_noise --data date --num 10
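What the noise flags do internally is not described here, so the following is only a sketch of what speckle and underline noise commonly mean, under the assumption that speckle noise is multiplicative Gaussian noise and that underline noise is a horizontal line drawn near the bottom of the image. The function names, `sigma`, and the line position are hypothetical and are not taken from simulation.py.

```python
import numpy as np


def add_speckle_noise(image, rng, sigma=0.3):
    """Multiplicative noise: each pixel is perturbed in proportion to its own value."""
    img = image.astype(np.float64) / 255.0
    noisy = img + img * rng.normal(0.0, sigma, size=img.shape)
    return (np.clip(noisy, 0.0, 1.0) * 255).astype(np.uint8)


def add_underline(image, row_from_bottom=3, thickness=2, value=255):
    """Draw a horizontal line near the bottom, like a form field underline."""
    out = image.copy()  # leave the input untouched
    bottom = out.shape[0] - row_from_bottom
    out[bottom - thickness:bottom, :] = value
    return out


# Stand-in "clean" image (assumption): a bright block on a black background.
rng = np.random.default_rng(0)
clean = np.zeros((28, 84), dtype=np.uint8)
clean[8:20, 10:70] = 200

noisy = add_underline(add_speckle_noise(clean, rng))
```

Because the speckle term is multiplied by the pixel value, black background pixels stay black and only the strokes are perturbed.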
To generate first and last names, try
python simulation.py --dir ~/Desktop/mnist --speckle_noise --resize --underline_noise --spacing 0.7 --data name --num 10
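The exact meaning of the --spacing value is best checked with the help output. As one plausible reading, the sketch below assumes it is the fraction of a character's width by which each successive character is advanced, so that values below 1 make neighbouring characters overlap. The function `compose_with_spacing` and the random stand-in tiles are hypothetical.

```python
import numpy as np


def compose_with_spacing(tiles, spacing=1.0):
    """Place equally sized tiles left to right, advancing by width * spacing."""
    height, width = tiles[0].shape
    step = max(1, int(round(width * spacing)))
    total_width = step * (len(tiles) - 1) + width
    canvas = np.zeros((height, total_width), dtype=np.uint8)
    for i, tile in enumerate(tiles):
        x = i * step
        region = canvas[:, x:x + width]  # view into the canvas
        # Keep the brighter pixel where strokes from neighbouring tiles overlap.
        np.maximum(region, tile, out=region)
    return canvas


# Stand-in character images (assumption): four random 28x28 tiles.
rng = np.random.default_rng(0)
tiles = [rng.integers(0, 256, size=(28, 28), dtype=np.uint8) for _ in range(4)]

tight = compose_with_spacing(tiles, spacing=0.7)
loose = compose_with_spacing(tiles, spacing=1.0)
print(tight.shape, loose.shape)  # (28, 88) (28, 112)
```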
Get an explanation of the command-line arguments using
python simulation.py -h