context. Currently, thousands of computer fonts
have been developed. The need to identify which font
is used in a text arises for designers, font developers
and copyright-holding companies. The aim of the
experiment is to evaluate the possibilities of using
the proposed method for solving these problems.
The example in Fig. 12a shows width diagrams
for the five letters of the word HORSE set in the
Times New Roman font. It shows that the font
characters have clearly distinguishable individual
portraits.
Figure 12: Width diagrams of different characters of the
same font (a) and the same character in different fonts (b).
Differences between the portraits of the same
letter H set in different fonts (Times New
Roman, Arial, Garamond, Britannic Bold,
Rockwell) are shown in the next example (Fig. 12b).
These diagrams were obtained from high-resolution
images, which are treated as reference samples.
To conduct the experiment under more realistic
conditions, reference images of the 52 characters of
the Latin alphabet (26 uppercase and 26 lowercase
letters) were constructed for 1848 typefaces of the
ParaType digital font collection (Yakupov et al.,
2015). For the reference images, the width
diagrams were obtained by the method described in
this article. To do this, each character was drawn
on a binary raster image at a scale such that the
height of the capital letter H was 1000 pixels. For
these images, continuous skeletons were
constructed and their basis width histograms were
calculated with a radius step of 0.5 pixels.
For the same fonts, images of the characters
were also obtained at lower resolutions, such that
the height of the letter H was 100 and 70 pixels.
Width diagrams were built for these characters as
well, with radius steps of 0.05 and 0.035 pixels,
respectively. These diagrams were normalized so
that they could be compared with the diagrams of
the reference font characters. Normalization was
done by stretching the diagrams 10 times along the
radius axis and 100 times along the value axis for
the resolution of 100 pixels, and 14.29 times along
the radius axis and 204.08 times along the value
axis for the resolution of 70 pixels. As a result, all
the normalized diagrams used the same set of
radius values.
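The stretch factors quoted above follow from the ratio of glyph heights: the radius axis scales linearly with that ratio and the other axis with its square. The sketch below only illustrates this arithmetic. The function names and the sample diagram are hypothetical, and treating the second axis as an area-like quantity is an assumption inferred from the quoted factors, not a statement of the authors' implementation.

```python
REFERENCE_HEIGHT = 1000  # height of capital H in the reference images, pixels


def scale_factors(glyph_height):
    """Return (radius-axis factor, value-axis factor) for a given H height."""
    k = REFERENCE_HEIGHT / glyph_height  # linear scale ratio
    return k, k * k                      # assumption: values scale like an area


def normalize_diagram(radii, values, glyph_height):
    """Stretch a low-resolution diagram onto the reference scale."""
    k_radius, k_value = scale_factors(glyph_height)
    return [r * k_radius for r in radii], [v * k_value for v in values]


# Hypothetical diagram sampled with the 0.05-pixel step used for height 100.
radii, values = normalize_diagram([0.0, 0.05, 0.1], [4.0, 3.0, 1.0], 100)
```

After normalization the radius samples fall on multiples of 0.5 pixels, which is the step used for the reference diagrams.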
Constructing the skeletons and calculating the
width diagrams (for 52 glyphs of 1848 fonts) took
less than 4 minutes in total on a computer with
an Intel® Core™ i5 processor and 6 GB of RAM.
Further, for each font, images of 1000
common English words, a random 30% of which were
converted to upper case, were composed from the
letters at low resolution. These images were used as
the test set. Next, the diagrams of the letters on test
images were compared with the diagrams of
reference images in
metric. As an integral font
similarity metric, we used a linear combination of
the distances between all characters present in the
word. The coefficients of the linear form for each
word were obtained by training on the entire set of
test fonts. In the experiment, we calculated the
distances for 52 letters between all pairs of the 1848
typefaces, which took 18 minutes, and trained the
linear form 1000 times, which took 32 minutes.
This means that the time of one request, that is,
checking a typeface against the base of references,
is 2 seconds, and most of this time is spent on
training the linear form.
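The matching step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the weights and the distance values are hypothetical, the per-character distances are assumed to be precomputed, and the training of the coefficients is not reproduced.

```python
def word_distance(word, char_dist, weights):
    """Linear combination of per-character distances for one word.

    char_dist -- {char: distance between the test glyph and a reference glyph}
    weights   -- {char: coefficient of the linear form}, assumed already trained
    """
    return sum(weights[ch] * char_dist[ch] for ch in sorted(set(word)))


def recognize_font(word, dist_to_fonts, weights):
    """Return the reference font with the smallest combined distance."""
    return min(
        dist_to_fonts,
        key=lambda font: word_distance(word, dist_to_fonts[font], weights),
    )


# Hypothetical distances from the glyphs of a test word to two reference fonts.
weights = {"H": 1.0, "O": 0.5}
dist = {
    "Times New Roman": {"H": 0.10, "O": 0.20},
    "Garamond": {"H": 0.30, "O": 0.10},
}
best = recognize_font("HO", dist, weights)
```

With these made-up numbers the combined distances are 0.20 and 0.35, so the first font is selected.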
The experimental results showed that the font
recognition accuracy from a single word was 91% at
a resolution of 100 and more than 81% at a
resolution of 70. Using an imaginary word
containing all 52 characters, we achieved accuracies
of 97% and 95%, respectively.
Thus, the experiment confirmed the effectiveness
of the proposed method and showed its efficiency
on the practical task of comparing a large number
of images (1848 × 1848 × 52) with a fairly high
recognition quality.
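To give a sense of the scale of this task, the short calculation below combines the number of comparisons with the 18 minutes reported for computing the distances. It assumes that every ordered pair of typefaces is counted, as the product suggests; the variable names are illustrative.

```python
FONTS = 1848
GLYPHS_PER_FONT = 52

# Glyph-to-glyph comparisons, counting every ordered pair of typefaces.
comparisons = FONTS * FONTS * GLYPHS_PER_FONT

# Reported time for computing all distances: 18 minutes.
seconds = 18 * 60
microseconds_per_comparison = seconds / comparisons * 1e6

print(comparisons)                            # 177585408
print(round(microseconds_per_comparison, 2))  # 6.08
```

Under this assumption, one diagram comparison costs on the order of a few microseconds.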
9 CONCLUSION
The proposed approach opens up new possibilities
for the use of highly efficient computational
geometry algorithms in image analysis and shape
recognition. The continuous model of the width of
polygonal figures based on the disk cover
made it possible to decompose the original