broader variety of the data set is covered. The loss
function does not concentrate on fitting the noisy parts
of the data, but retains the capacity to capture its
important structures.
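For reference, the following is a minimal sketch of the ε-insensitive loss underlying this discussion, in its standard formulation known from support vector regression; the function name and the concrete value of ε are illustrative only.

```python
import numpy as np

def eps_insensitive_loss(residuals, eps=0.1):
    """Epsilon-insensitive loss: residuals with magnitude below eps are
    ignored, larger residuals are penalized linearly beyond the eps tube."""
    return np.maximum(np.abs(residuals) - eps, 0.0)

# Small residuals (noise) produce zero loss, large ones still contribute.
print(eps_insensitive_loss(np.array([0.05, -0.08, 0.5, -1.2]), eps=0.1))
```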
5 CONCLUSIONS
Fast dimensionality reduction methods are required
that can cope with huge and high-dimensional data
sets. With UNN regression, we have fitted a
well-established regression technique into the
unsupervised setting for dimensionality reduction.
The two iterative UNN strategies are efficient methods
to embed high-dimensional data into a fixed one-dimensional
latent space. We have introduced two iterative local
variants that proved to perform well on test problems
in a first experimental analysis. UNN 1 achieves lower
DSREs, while UNN 2 is slightly faster because of the
higher multiplicative runtime constants of UNN 1. We
concentrated on the employment of the ε-insensitive
loss and its influence on the DSRE. Both iterative UNN
regression strategies benefit from the ε-insensitive
loss; in particular, UNN 2 could be improved by
employing a loss with ε > 0, apparently because local
optima are avoided. The experimental results have shown
that this effect can be observed not only for
low-dimensional data with noise, but also for
high-dimensional data, e.g., the digits data set.
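To make the greedy construction behind the iterative strategies concrete, the following sketch illustrates a UNN 1-like insertion scheme under simplifying assumptions: latent positions are the indices of a discrete sequence, every pattern is inserted at the position that currently minimizes the DSRE, and reconstruction uses the K nearest latent neighbors. All names and the value K = 2 are illustrative, not the paper's exact implementation.

```python
import numpy as np

def knn_reconstruct(order, Y, k=2):
    """Reconstruct each embedded pattern as the mean of its k nearest
    neighbors in the one-dimensional (discrete) latent order."""
    n = len(order)
    recon = np.zeros((n, Y.shape[1]))
    for pos in range(n):
        # latent neighbors: the closest sequence positions, excluding pos itself
        neighbors = sorted(range(n), key=lambda q: abs(q - pos))[1:k + 1]
        recon[pos] = Y[[order[q] for q in neighbors]].mean(axis=0)
    return recon

def dsre(order, Y, k=2):
    """Data space reconstruction error of the current (partial) embedding."""
    return np.sum((Y[order] - knn_reconstruct(order, Y, k)) ** 2)

def unn_embed(Y, k=2):
    """Greedy iterative embedding: insert each pattern at the latent
    position that yields the lowest DSRE of the partial embedding."""
    order = [0]
    for i in range(1, len(Y)):
        candidates = [order[:p] + [i] + order[p:] for p in range(len(order) + 1)]
        order = min(candidates, key=lambda o: dsre(o, Y, k))
    return order

# Toy example: embed 30 random 5-dimensional patterns.
rng = np.random.default_rng(0)
print(unn_embed(rng.normal(size=(30, 5))))
```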
Our future work will concentrate on the analysis
of local optima of UNN embeddings, and on possible
extensions that guarantee globally optimal solutions.
This work will include the analysis of stochastic
global search variants. Furthermore, the UNN strategies
will be extended to latent topologies of higher
dimensionality. Another possible extension of UNN is
a continuous backward mapping from latent to data
space, f : x → y, employing a distance-weighted variant
of KNN. Such a backward mapping can be used to generate
high-dimensional data by sampling in latent space.
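A distance-weighted KNN backward mapping of the kind envisioned above could, for instance, look as follows; this is only a sketch under the assumption of a one-dimensional continuous latent space, with inverse-distance weights and illustrative parameter names.

```python
import numpy as np

def backward_map(x, latent, Y, k=3, eps=1e-8):
    """Map a latent coordinate x to data space: average the k patterns with
    the nearest latent positions, weighted by inverse latent distance."""
    d = np.abs(latent - x)              # distances in the 1D latent space
    nn = np.argsort(d)[:k]              # indices of the k nearest latent points
    w = 1.0 / (d[nn] + eps)             # inverse-distance weights
    return (w[:, None] * Y[nn]).sum(axis=0) / w.sum()

# Toy usage: generate a new pattern for an unseen latent coordinate.
latent = np.arange(10, dtype=float)     # latent positions of embedded patterns
Y = np.random.default_rng(1).normal(size=(10, 5))
print(backward_map(4.3, latent, Y))
```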