SPARSE QUASI-NEWTON OPTIMIZATION FOR SEMI-SUPERVISED SUPPORT VECTOR MACHINES

Fabian Gieseke, Antti Airola, Tapio Pahikkala, Oliver Kramer

2012

Abstract

In real-world scenarios, labeled data is often rare while unlabeled data can be obtained in huge quantities. A current research direction in machine learning is the concept of semi-supervised support vector machines. This type of binary classification approach aims to take the additional information provided by unlabeled patterns into account in order to reveal more about the structure of the data and, hence, to yield models with better classification performance. However, generating these semi-supervised models requires solving difficult optimization tasks. In this work, we present a simple but effective approach to the induced optimization task, which is based on a special instance of the quasi-Newton family of optimization schemes. The resulting framework can be implemented easily using black-box optimization engines and yields excellent classification and runtime results on both artificial and real-world data sets; these results are superior to (or at least competitive with) those obtained by competing state-of-the-art methods.
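
To make the idea of handing a semi-supervised SVM objective to a black-box quasi-Newton engine more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear model without a bias term, a squared hinge loss on the labeled patterns, a smooth surrogate exp(-s f(x)^2) that penalizes unlabeled patterns close to the decision boundary, and SciPy's L-BFGS-B routine as the optimizer. The function names `s3vm_objective` and `fit_s3vm` and the parameters `lam`, `lam_u`, and `s` are hypothetical choices made for illustration; the sparse kernel-based formulation and any class-balancing constraint from the paper are omitted.

```python
import numpy as np
from scipy.optimize import minimize


def s3vm_objective(w, X_lab, y_lab, X_unl, lam, lam_u, s=3.0):
    """Return (value, gradient) of a smooth linear S3VM surrogate (illustrative)."""
    # Labeled part: squared hinge loss max(0, 1 - y * f(x))^2, which is differentiable.
    f_lab = X_lab @ w
    margin = np.maximum(0.0, 1.0 - y_lab * f_lab)
    loss_lab = np.mean(margin ** 2)
    grad_lab = X_lab.T @ (-2.0 * y_lab * margin) / len(y_lab)

    # Unlabeled part: exp(-s * f(x)^2) is large when a prediction is near zero,
    # so minimizing it pushes the boundary away from unlabeled patterns.
    f_unl = X_unl @ w
    e = np.exp(-s * f_unl ** 2)
    loss_unl = np.mean(e)
    grad_unl = X_unl.T @ (-2.0 * s * f_unl * e) / len(f_unl)

    # L2 regularization on the weight vector.
    value = loss_lab + lam_u * loss_unl + lam * (w @ w)
    grad = grad_lab + lam_u * grad_unl + 2.0 * lam * w
    return value, grad


def fit_s3vm(X_lab, y_lab, X_unl, lam=1e-2, lam_u=1.0):
    """Minimize the surrogate with L-BFGS-B, warm-started from the supervised fit."""
    w0 = np.zeros(X_lab.shape[1])
    # Step 1: supervised-only problem (lam_u = 0), which is convex.
    sup = minimize(s3vm_objective, w0, jac=True, method="L-BFGS-B",
                   args=(X_lab, y_lab, X_unl, lam, 0.0))
    # Step 2: full non-convex objective, started from the supervised solution.
    res = minimize(s3vm_objective, sup.x, jac=True, method="L-BFGS-B",
                   args=(X_lab, y_lab, X_unl, lam, lam_u))
    return res.x
```

Under these assumptions, `fit_s3vm(X_lab, y_lab, X_unl)` returns a weight vector `w`, and new patterns would be classified by `np.sign(X @ w)`. Because the objective is non-convex, the result is only a local optimum and depends on the starting point, which is why the sketch warm-starts from the supervised solution.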


Paper Citation


in Harvard Style

Gieseke F., Airola A., Pahikkala T. and Kramer O. (2012). SPARSE QUASI-NEWTON OPTIMIZATION FOR SEMI-SUPERVISED SUPPORT VECTOR MACHINES. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-8425-98-0, pages 45-54. DOI: 10.5220/0003755300450054


in Bibtex Style

@conference{icpram12,
author={Fabian Gieseke and Antti Airola and Tapio Pahikkala and Oliver Kramer},
title={SPARSE QUASI-NEWTON OPTIMIZATION FOR SEMI-SUPERVISED SUPPORT VECTOR MACHINES},
booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2012},
pages={45-54},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0003755300450054},
isbn={978-989-8425-98-0},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - SPARSE QUASI-NEWTON OPTIMIZATION FOR SEMI-SUPERVISED SUPPORT VECTOR MACHINES
SN - 978-989-8425-98-0
AU - Gieseke F.
AU - Airola A.
AU - Pahikkala T.
AU - Kramer O.
PY - 2012
SP - 45
EP - 54
DO - 10.5220/0003755300450054