Two-way Multimodal Online Matrix Factorization for Multi-label Annotation

Jorge A. Vanegas, Viviana Beltran, Fabio A. González


This paper presents a matrix factorization algorithm for multi-label annotation. The multi-label annotation problem arises in tasks such as object recognition in images, where the goal is to automatically find the objects present in a given image. The solution consists of learning a classification model able to assign one or more labels to a given sample. The method presented in this paper learns a mapping between the features of the input sample and the labels, which is later used to predict labels for unannotated instances. This mapping is found by learning a common semantic representation using matrix factorization. An important characteristic of the proposed algorithm is its online formulation based on stochastic gradient descent, which allows it to scale to large datasets. In an experimental evaluation against state-of-the-art space embedding algorithms, the proposed method shows competitive performance and, in some cases, improves on previously reported results.
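
To make the idea of a shared latent representation trained online more concrete, the following is a minimal sketch, not the algorithm proposed in the paper. It assumes a simplified one-way squared-error objective, ||Y - (X A) B||^2 with L2 regularization, optimized by mini-batch stochastic gradient descent. The names `train_online_mf`, `predict_labels`, `A`, `B`, and all hyperparameter values are hypothetical choices made for illustration only.

```python
import numpy as np


def train_online_mf(X, Y, rank=8, lr=0.05, reg=1e-3, epochs=50, batch=16, seed=0):
    """Mini-batch SGD on ||Y - (X A) B||^2 + reg * (||A||^2 + ||B||^2).

    A (features x rank) projects feature vectors into a latent space;
    B (rank x labels) maps latent codes to label scores.
    Illustrative assumption: this loss is NOT taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    n_labels = Y.shape[1]
    A = rng.normal(scale=0.1, size=(d, rank))
    B = rng.normal(scale=0.1, size=(rank, n_labels))
    for _ in range(epochs):
        order = rng.permutation(n)  # visit samples in a new random order each epoch
        for start in range(0, n, batch):
            idx = order[start:start + batch]
            xb, yb = X[idx], Y[idx]
            H = xb @ A              # latent codes for this mini-batch
            err = H @ B - yb        # residual in label space
            grad_B = H.T @ err / len(idx) + reg * B
            grad_A = xb.T @ (err @ B.T) / len(idx) + reg * A
            A -= lr * grad_A
            B -= lr * grad_B
    return A, B


def predict_labels(X, A, B, threshold=0.5):
    """Score every label for every sample and keep those above the threshold."""
    return ((X @ A) @ B >= threshold).astype(int)
```

Because each update only touches one mini-batch, memory use does not grow with the number of training samples, which is the property the online formulation relies on. In this sketch, appending a constant column of ones to `X` gives the model a bias term, which helps when binary labels are thresholded at 0.5.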



Paper Citation

in Harvard Style

Vanegas J., Beltran V. and González F. A. (2015). Two-way Multimodal Online Matrix Factorization for Multi-label Annotation. In Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-076-5, pages 279-285. DOI: 10.5220/0005209602790285

in Bibtex Style

author={Jorge A. Vanegas and Viviana Beltran and Fabio A. González},
title={Two-way Multimodal Online Matrix Factorization for Multi-label Annotation},
booktitle={Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},

in EndNote Style

JO - Proceedings of the International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - Two-way Multimodal Online Matrix Factorization for Multi-label Annotation
SN - 978-989-758-076-5
AU - Vanegas J.
AU - Beltran V.
AU - González F. A.
PY - 2015
SP - 279
EP - 285
DO - 10.5220/0005209602790285