A deep SNN approach to the OCR of Chinese calligraphy
CC-OCR is a novel deep convolutional Siamese Neural Network (SNN) architecture that finishes training in one day on a single RTX 2080 Ti graphics card while reaching an accuracy of 95.0% on a set of more than 6000 Chinese characters. The algorithm (i) trains with small numbers of samples, achieving few-/one-shot learning; (ii) recognizes a new character that is not present in the training set; and (iii) adds the new character to the dictionary, so the dictionary is continually expanded.
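Points (ii) and (iii) can be illustrated with a minimal sketch of a dictionary lookup over embeddings. This is not the CC-OCR code: the class `CharacterDictionary`, its methods, the distance threshold, and the two-dimensional vectors are all hypothetical stand-ins. In a real Siamese setup the embeddings would come from the trained network branch. The sketch only assumes that characters are compared by Euclidean distance between embeddings and that anything farther than a threshold from every known entry is treated as new.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class CharacterDictionary:
    """Hypothetical store with one reference embedding per character."""

    def __init__(self, threshold):
        # Assumed hyperparameter: maximum distance that counts as a match.
        self.threshold = threshold
        self.entries = {}  # label -> reference embedding

    def add(self, label, embedding):
        """Register a character from a single example (one-shot)."""
        self.entries[label] = list(embedding)

    def recognize(self, embedding):
        """Return the closest known label, or None if nothing is close enough."""
        best_label, best_dist = None, float("inf")
        for label, ref in self.entries.items():
            dist = euclidean(embedding, ref)
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label if best_dist <= self.threshold else None

    def recognize_or_add(self, embedding, new_label):
        """Recognize a character; if it is unknown, add it under new_label."""
        label = self.recognize(embedding)
        if label is None:
            self.add(new_label, embedding)
            return new_label, True
        return label, False


# Toy embeddings standing in for the output of a trained SNN branch.
dictionary = CharacterDictionary(threshold=0.5)
dictionary.add("永", [0.0, 1.0])
dictionary.add("和", [1.0, 0.0])

known = dictionary.recognize([0.1, 0.9])
added = dictionary.recognize_or_add([5.0, 5.0], "九")
again = dictionary.recognize([5.0, 5.1])
```

Because a new entry needs only one reference embedding, expanding the dictionary in this sketch requires no retraining. Whether CC-OCR stores one or several references per character is not stated here.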
Download the sample dataset, unzip it, and run the run.py file to start the program.
The sample dataset is a small extract of the full dataset. Use it to step through the code quickly.
We will soon publish the corresponding research article.