Generate handwriting training data by composing characters: numbers using MNIST or names using EMNIST

sjfleming/composable_MNIST

Generate images by composing MNIST digits or EMNIST letters

Purpose

Create training or validation data for machine learning algorithms that involve handwriting, such as OCR.
Some applications involve recognizing not just single digits or letters, but whole numbers and words. Sometimes a dataset of specific words or ranges of numbers is needed. This simple script can be used to generate such a dataset.

Usage

To generate clean images of handwritten day/month/year from the MNIST dataset, try

python simulation.py --dir ~/Desktop/mnist --data date --num 10

Example image: year13.png
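
To illustrate what "composing" characters means, here is a minimal sketch of joining single-character images side by side into one wider image. It is not taken from simulation.py: the function names `compose_glyphs` and `blank_glyph`, the `gap` parameter, and the use of plain nested lists in place of real 28x28 MNIST arrays are all assumptions made for illustration.

```python
def compose_glyphs(glyphs, gap=2, background=0):
    """Join equally tall glyph images left to right, separated by `gap` blank columns."""
    if not glyphs:
        raise ValueError("need at least one glyph")
    height = len(glyphs[0])
    if any(len(g) != height for g in glyphs):
        raise ValueError("all glyphs must have the same height")
    rows = []
    for y in range(height):
        row = []
        for i, glyph in enumerate(glyphs):
            if i > 0:
                # Blank columns between neighbouring characters.
                row.extend([background] * gap)
            row.extend(glyph[y])
        rows.append(row)
    return rows


def blank_glyph(value, size=28):
    """Hypothetical stand-in for a 28x28 MNIST digit: a constant-valued square."""
    return [[value] * size for _ in range(size)]


# Six stand-in glyphs, e.g. the six digits of a date.
image = compose_glyphs([blank_glyph(v) for v in (10, 20, 30, 40, 50, 60)], gap=2)
print(len(image), len(image[0]))  # 28 178
```

With real data, each stand-in glyph would be replaced by a randomly chosen MNIST image of the required digit.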

For noisier images, try

python simulation.py --dir ~/Desktop/mnist --speckle_noise --resize --underline_noise --data date --num 10

Example image: year13_noisy.png
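
The sketch below shows one plausible way such noise could be applied to a composed image. It is an assumption, not the script's actual implementation: "speckle" is interpreted here as setting a small fraction of pixels to a random intensity, and "underline" as a single horizontal line near the bottom edge. The function names and parameters (`add_speckles`, `add_underline`, `fraction`, `row_from_bottom`) are hypothetical.

```python
import random


def add_speckles(image, fraction=0.02, rng=None):
    """Return a copy with roughly `fraction` of pixels set to a random intensity."""
    rng = rng or random.Random()
    noisy = [row[:] for row in image]  # copy so the input stays untouched
    height, width = len(noisy), len(noisy[0])
    for _ in range(int(fraction * height * width)):
        y = rng.randrange(height)
        x = rng.randrange(width)
        noisy[y][x] = rng.randint(0, 255)
    return noisy


def add_underline(image, row_from_bottom=2, intensity=255):
    """Return a copy with one full-width horizontal line drawn near the bottom."""
    noisy = [row[:] for row in image]
    y = len(noisy) - 1 - row_from_bottom
    noisy[y] = [intensity] * len(noisy[y])
    return noisy


# A blank 28x60 canvas stands in for a composed clean image.
clean = [[0] * 60 for _ in range(28)]
noisy = add_underline(add_speckles(clean, fraction=0.02, rng=random.Random(0)))
```

Passing a seeded `random.Random` makes the noise reproducible, which helps when regenerating a validation set.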

To generate first and last names, try

python simulation.py --dir ~/Desktop/mnist --speckle_noise --resize --underline_noise --spacing 0.7 --data name --num 10

Get an explanation of the command line arguments using

python simulation.py -h
