BLSTM-CTC Combination Strategies for Off-line Handwriting Recognition

Luc Mioulet, G. Bideault, C. Chatelain, T. Paquet, S. Brunessaux

Abstract

In this paper we present several strategies for combining multiple BLSTM-CTC systems. Given several feature sets, our aim is to determine which strategies are most relevant for improving performance on an isolated word recognition task (the WR2 task of the ICDAR 2009 competition) using a BLSTM-CTC architecture. We explore different combination levels: early integration (feature combination), mid-level combination, and late fusion (output combination). Our results show that several combinations outperform single-feature BLSTM-CTCs.
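To make the two ends of that spectrum concrete, the following is a minimal sketch and not the authors' implementation. It assumes early integration means concatenating frame-aligned feature vectors before a single recognizer, and late fusion means averaging word-level scores produced by separate recognizers. The function names, the dictionary-of-scores representation, and the averaging rule are all hypothetical choices made for illustration. Mid-level combination is left out because the abstract does not say where it operates.

```python
from typing import Dict, List, Sequence

Frame = List[float]


def early_integration(feature_sets: Sequence[Sequence[Frame]]) -> List[Frame]:
    """Concatenate frame-aligned feature vectors from several feature sets.

    Assumption: every feature set describes the same image with the same
    number of frames, so frames can be joined position by position.
    """
    if len({len(frames) for frames in feature_sets}) != 1:
        raise ValueError("all feature sets must have the same number of frames")
    return [
        [value for frame in aligned for value in frame]
        for aligned in zip(*feature_sets)
    ]


def late_fusion(word_scores: Sequence[Dict[str, float]]) -> str:
    """Average word-level scores from several recognizers and pick the best word.

    Assumption: each recognizer returns a score per candidate word; only
    words scored by every recognizer are considered.
    """
    shared = set(word_scores[0])
    for scores in word_scores[1:]:
        shared &= set(scores)
    if not shared:
        raise ValueError("recognizers share no candidate word")
    averaged = {
        word: sum(scores[word] for scores in word_scores) / len(word_scores)
        for word in sorted(shared)
    }
    return max(averaged, key=averaged.get)


# Hypothetical toy data: two feature sets over two frames, two recognizers.
combined = early_integration([[[0.1, 0.2], [0.3, 0.4]], [[1.0], [2.0]]])
best = late_fusion([
    {"chat": 0.6, "chaton": 0.4},
    {"chat": 0.3, "chaton": 0.7},
])
```

In this toy example the first recognizer prefers one word and the second prefers the other, and the averaged scores settle the disagreement. Other output-combination rules (voting, weighted sums, rank-based schemes) would fit the same interface.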



Paper Citation


in Harvard Style

Mioulet L., Bideault G., Chatelain C., Paquet T. and Brunessaux S. (2015). BLSTM-CTC Combination Strategies for Off-line Handwriting Recognition. In Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-076-5, pages 173-180. DOI: 10.5220/0005178601730180


in Bibtex Style

@conference{icpram15,
author={Luc Mioulet and G. Bideault and C. Chatelain and T. Paquet and S. Brunessaux},
title={BLSTM-CTC Combination Strategies for Off-line Handwriting Recognition},
booktitle={Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2015},
pages={173-180},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005178601730180},
isbn={978-989-758-076-5},
}


in EndNote Style

TY - CONF
JO - Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - BLSTM-CTC Combination Strategies for Off-line Handwriting Recognition
SN - 978-989-758-076-5
AU - Mioulet L.
AU - Bideault G.
AU - Chatelain C.
AU - Paquet T.
AU - Brunessaux S.
PY - 2015
SP - 173
EP - 180
DO - 10.5220/0005178601730180