However, it provided us with a natural way of generating
fuzzy ground truth in the context of the "anyOCR"
pipeline proposed in (Jenckel et al., 2016).
This paper covered only the feasibility of non-expert
annotations, which was shown using synthetic data.
We therefore plan further evaluation with real
non-expert annotations. While we have shown that
training on fuzzy ground truth can be beneficial for
historical documents, further analysis on different
scripts and document types is needed as well.
In the future, we plan to further explore the possibilities
of using fuzzy ground truth when training LSTM networks,
such as ground truth options with different segmentations,
for example "m" versus "in". Another future focus will be
on generating fuzzy ground truth automatically: while the
proposed method reduces the need for language experts,
annotating the data by hand is still costly.
ACKNOWLEDGEMENTS
This work was partially funded by the BMBF (German
Federal Ministry of Education and Research),
project Kallimachos (01UG1415C).
REFERENCES
Althoff, T., Borth, D., Hees, J., and Dengel, A. (2013).
Analysis and forecasting of trending topics in online
media streams. In Proceedings of the 21st ACM
International Conference on Multimedia, pages 907–916.
ACM.
Graves, A. (2012). Supervised sequence labelling. In
Supervised Sequence Labelling with Recurrent Neural
Networks, pages 5–13. Springer.
Graves, A., Fernández, S., Gomez, F. J., and Schmidhuber,
J. (2006). Connectionist temporal classification:
Labelling Unsegmented Sequence Data with Recurrent
Neural Networks. In ICML’06.
Hochreiter, S. and Schmidhuber, J. (1997). Long short-term
memory. In Neural Computation, pages 1735–1780.
Jenckel, M., Bukhari, S. S., and Dengel, A. (2016). anyOCR:
A sequence learning based OCR system for unlabeled
historical documents. In 23rd International Conference
on Pattern Recognition (ICPR’16), Mexico.
Karayil, T., Ul-Hasan, A., and Breuel, T. M. (2015). A
Segmentation-Free Approach for Printed Devanagari
Script Recognition. In ICDAR, Tunisia.
Reynolds, D. (2015). Gaussian mixture models. In
Encyclopedia of Biometrics, pages 827–832. Springer.
Simistira, F., Ul-Hasan, A., Papavassiliou, V., Gatos, B.,
Katsouros, V., and Liwicki, M. (2015). Recognition of
Historical Greek Polytonic Scripts Using LSTM Networks.
In ICDAR, Tunisia.
Breuel, T. M., Ul-Hasan, A., Al Azawi, M., and Shafait, F.
(2013). High Performance OCR for Printed English and
Fraktur using LSTM Networks. In ICDAR, Washington
D.C., USA.
Ul-Hasan, A., Ahmed, S. B., Rashid, S. F., Shafait, F., and
Breuel, T. M. (2013). Offline Printed Urdu Nastaleeq Script
Recognition with Bidirectional LSTM Networks. In
ICDAR’13, USA.
Wang, H., Kläser, A., Schmid, C., and Liu, C.-L. (2011).
Action recognition by dense trajectories. pages 3169–3176.
IEEE.
Werbos, P. (1990). Backpropagation through time: what
does it do and how to do it. In Proceedings of IEEE,
volume 78.
You, Q., Luo, J., Jin, H., and Yang, J. (2015). Robust
image sentiment analysis using progressively trained and
domain transferred deep networks. In CoRR, volume
abs/1509.06041.
Yousefi, M. R., Soheili, M. R., Breuel, T. M., and Stricker,
D. (2015). A Comparison of 1D and 2D LSTM Architectures
for Recognition of Handwritten Arabic. In
DRR-XXI, USA.
Impact of Training LSTM-RNN with Fuzzy Ground Truth