# ON UNSUPERVISED NEAREST-NEIGHBOR REGRESSION AND ROBUST LOSS FUNCTIONS

### Oliver Kramer

#### Abstract

In many scientific disciplines, structures in high-dimensional data have to be detected, e.g., in stellar spectra, genome data, or face recognition tasks. We present an approach to non-linear dimensionality reduction that fits nearest neighbor regression into the unsupervised regression framework for learning low-dimensional manifolds. The problem of optimizing latent neighborhoods is difficult to solve, but the unsupervised nearest neighbor (UNN) formulation allows an efficient strategy of iteratively embedding latent points into fixed neighborhood topologies. The choice of an appropriate loss function is relevant, in particular for noisy and high-dimensional data spaces. We extend UNN regression by the ε-insensitive loss, which ignores residuals below a threshold defined by ε. In the experimental part of this paper, we test the influence of ε on the final data space reconstruction error and present a visualization of UNN embeddings on test data sets.
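To make the role of ε concrete, the following is a minimal sketch of an ε-insensitive data space reconstruction error under nearest neighbor regression. It is an illustration, not the paper's implementation. The function names, the one-dimensional latent space, the elementwise application of the loss, the inclusion of each point in its own neighborhood, and the normalization by the number of patterns are all assumptions made here.

```python
import numpy as np


def eps_insensitive(residuals, eps):
    """Elementwise eps-insensitive loss: max(0, |r| - eps)."""
    return np.maximum(0.0, np.abs(residuals) - eps)


def knn_reconstruction(latent, data, k):
    """Reconstruct each pattern as the mean of the k patterns whose
    latent points lie closest to its own latent point.

    Assumption: latent is a 1-D array of positions, and a point counts
    as one of its own neighbors.
    """
    # Pairwise distances between latent positions, shape (n, n).
    dist = np.abs(latent[:, None] - latent[None, :])
    # Indices of the k nearest latent points for every pattern.
    neighbors = np.argsort(dist, axis=1, kind="stable")[:, :k]
    # data[neighbors] has shape (n, k, d); average over the k neighbors.
    return data[neighbors].mean(axis=1)


def reconstruction_error(latent, data, k, eps):
    """Sum of eps-insensitive residuals, divided by the number of patterns
    (hypothetical normalization)."""
    residuals = data - knn_reconstruction(latent, data, k)
    return eps_insensitive(residuals, eps).sum() / len(data)
```

With ε = 0 the loss in this sketch reduces to the absolute residual, so increasing ε only removes the contribution of residuals that fall below the threshold.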

#### Paper Citation

#### in Harvard Style

Kramer O. (2012). **ON UNSUPERVISED NEAREST-NEIGHBOR REGRESSION AND ROBUST LOSS FUNCTIONS**. In *Proceedings of the 4th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART*, ISBN 978-989-8425-95-9, pages 164-170. DOI: 10.5220/0003749301640170

#### in Bibtex Style

@conference{icaart12,

author={Oliver Kramer},

title={ON UNSUPERVISED NEAREST-NEIGHBOR REGRESSION AND ROBUST LOSS FUNCTIONS},

booktitle={Proceedings of the 4th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART},

year={2012},

pages={164-170},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0003749301640170},

isbn={978-989-8425-95-9},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the 4th International Conference on Agents and Artificial Intelligence - Volume 1: ICAART

TI - ON UNSUPERVISED NEAREST-NEIGHBOR REGRESSION AND ROBUST LOSS FUNCTIONS

SN - 978-989-8425-95-9

AU - Kramer O.

PY - 2012

SP - 164

EP - 170

DO - 10.5220/0003749301640170