font size is assumed to be 80 pixels, n_te = 1188. A
document (.doc) containing all the letters of the test
sample is generated and printed, and the printed pages
are then scanned at a resolution of 300 dpi. As a result,
the images are of lower quality than in the previous
case (see Fig. 11).
Figure 11: Example letter from the input image
The results are presented in Table 4:
Table 4: The results of the two methods.
                SA        CNN
Quality, Q      0.95538   0.94696
Refusal rate    0.01263   0
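As a rough illustration of how the two rows of Table 4 could be computed, the following Python sketch derives both figures from a list of predictions in which a refusal is encoded as None. It assumes that Q is the share of all test letters classified correctly and that the refusal rate is the share of test letters left unclassified. These definitions, the None encoding and the function name are illustrative assumptions, not taken from the paper.

```python
from typing import Optional, Sequence, Tuple


def quality_and_refusal_rate(
    y_true: Sequence[str],
    y_pred: Sequence[Optional[str]],
) -> Tuple[float, float]:
    """Return (quality, refusal_rate) over the whole test sample.

    Assumption: a refusal is encoded as None and counts as "not correct".
    """
    n = len(y_true)
    if n != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if n == 0:
        raise ValueError("the test sample is empty")
    refused = sum(1 for p in y_pred if p is None)
    correct = sum(1 for t, p in zip(y_true, y_pred) if p is not None and p == t)
    return correct / n, refused / n


# Toy sample of four letters: two correct, one wrong, one refusal.
q, r = quality_and_refusal_rate(["a", "b", "c", "d"], ["a", "b", "x", None])
```

Under the same assumptions, the SA refusal rate of 0.01263 with n_te = 1188 corresponds to about 15 refused letters (0.01263 × 1188 ≈ 15.0).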
6.8 Analysis of Experiments
The experiments show that the quality of the proposed
method is not worse than that of the selected baseline
algorithm, and that the method refuses to classify only
a small proportion of letters; this proportion increases
as image quality deteriorates.
7 CONCLUSIONS
This paper proposes a formalization of the concept of
"grapheme", namely a mathematical model of the grapheme.
On the basis of this model, a method is proposed for
generating the features used to construct the algorithm
for classifying images of letters (that is, a measure of
similarity between mathematical models of graphs is
defined). An algorithm for recognizing text in an image
is also proposed.
The advantages of the proposed letter recognition
method are: independence from the size and type of font
and the type of lettering; extraction of a general
structure (the mathematical model of the grapheme) for
letters, which is sufficient to recognize letters in new
fonts; and interpretability of the features.
The disadvantages of the method are the presence of
classification refusals and the dependence of the
recognition quality on the quality of the image
binarization.
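To illustrate the binarization step on which the recognition quality depends, the sketch below implements Otsu's global thresholding in plain Python. This section does not state which binarization method is used, so the choice of Otsu's method, the 8-bit grayscale input and the function names are assumptions made for illustration only.

```python
from typing import List, Sequence


def otsu_threshold(pixels: Sequence[int]) -> int:
    """Return the threshold t (0..255) maximizing the between-class variance.

    Assumption: `pixels` holds 8-bit gray values; values <= t form one
    class and values > t the other.
    """
    hist = [0] * 256
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of their gray values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, no valid split at this t
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(image: List[List[int]]) -> List[List[bool]]:
    """Return a mask that is True for dark (ink) pixels."""
    t = otsu_threshold([v for row in image for v in row])
    return [[v <= t for v in row] for row in image]


# Synthetic "scan": a dark 4x4 stroke (value 40) on a light 8x8 page (value 220).
page = [[220] * 8 for _ in range(8)]
for y in range(2, 6):
    for x in range(2, 6):
        page[y][x] = 40
mask = binarize(page)
```

On a clean synthetic page like this one the two gray levels separate perfectly. On a noisy or unevenly lit scan, a single global threshold may merge or break strokes, which is one way the binarization quality can affect the features built on top of it.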
The experiments confirm that the proposed mathematical
model of the grapheme is effective.
The objectives of further research are:
1. Improvement of top-level and bottom-level features.
2. Solution of the problem of classification refusals.
3. Modification of the iterative part (postprocessing)
of the classification algorithm.
ACKNOWLEDGEMENTS
The work was funded by the Russian Foundation for Basic
Research, grant No. 17-01-00917.
VISAPP 2019 - 14th International Conference on Computer Vision Theory and Applications