Learning Semantic Attributes via a Common Latent Space

Ziad Al-Halah, Tobias Gehrig, Rainer Stiefelhagen

2014

Abstract

Semantic attributes are a form of knowledge that transfers easily to other domains where information and training samples are scarce. However, in the classical object recognition setting, where training data is abundant, attribute-based recognition usually performs poorly compared to methods that use image features directly. We introduce a generic framework that considerably boosts the performance of semantic attributes in both traditional classification and knowledge transfer tasks such as zero-shot learning. It combines the discriminative power of the visual features with the semantic meaning of the attributes by learning a common latent space that joins both spaces. We also explicitly account for attribute correlations in the source dataset in order to generalize more efficiently across domains. Our evaluation on standard public datasets shows that the proposed approach is not only simple and computationally efficient but also performs markedly better than the common direct attribute model.

Paper Citation


in Harvard Style

Al-Halah Z., Gehrig T. and Stiefelhagen R. (2014). Learning Semantic Attributes via a Common Latent Space. In Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014) ISBN 978-989-758-004-8, pages 48-55. DOI: 10.5220/0004681500480055


in Bibtex Style

@conference{visapp14,
author={Ziad Al-Halah and Tobias Gehrig and Rainer Stiefelhagen},
title={Learning Semantic Attributes via a Common Latent Space},
booktitle={Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)},
year={2014},
pages={48-55},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0004681500480055},
isbn={978-989-758-004-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 9th International Conference on Computer Vision Theory and Applications - Volume 2: VISAPP, (VISIGRAPP 2014)
TI - Learning Semantic Attributes via a Common Latent Space
SN - 978-989-758-004-8
AU - Al-Halah Z.
AU - Gehrig T.
AU - Stiefelhagen R.
PY - 2014
SP - 48
EP - 55
DO - 10.5220/0004681500480055