intensive. An analysis of two real data sets reveals
its classification performance to be comparable to
that of available regularized classification methods
for high-dimensional data.
In general, some regularized statistical methods
for the analysis of high-dimensional data have been
empirically observed to possess reasonable robustness
with respect to outlying measurements in the data.
For example, regularized means have been observed
to yield a certain local robustness against small
departures in the observed data (Tibshirani et al.,
2003). For continuous data, however, regularization
itself cannot ensure robustness against serious outliers
(Filzmoser and Todorov, 2011). In the words of robust
statistics, regularization does not imply robustness in
terms of the breakdown point, and regularized LDA
cannot replace robust classification procedures with a
high breakdown point (Kalina, 2012). It remains an
open problem to investigate systematically the
relationship between regularization and statistical
robustness for continuous data. A further caveat is
that there is no reason to expect the optimal procedure
for the regularized model to perform well away from
that model (Davies, 2014).
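The regularized means mentioned above can be sketched as soft-thresholding of class centroids toward the overall mean, in the spirit of nearest shrunken centroids (Tibshirani et al., 2003). This is only a simplified illustration: the function name and the tuning constant `delta` are hypothetical, and the per-feature standardization of the full method is omitted.

```python
import numpy as np

def shrink_centroids(X, y, delta):
    """Soft-threshold class centroids toward the overall mean.

    A simplified sketch of the regularized (shrunken) means idea;
    `delta` is a hypothetical shrinkage threshold. Small deviations
    of a class mean from the overall mean are set exactly to zero,
    which is the source of the local robustness discussed above.
    """
    overall = X.mean(axis=0)
    shrunken = {}
    for k in np.unique(y):
        diff = X[y == k].mean(axis=0) - overall
        # soft thresholding: deviations smaller than delta vanish
        diff = np.sign(diff) * np.maximum(np.abs(diff) - delta, 0.0)
        shrunken[k] = overall + diff
    return shrunken
```

For a sufficiently large `delta`, every class centroid collapses to the overall mean, so the classifier ignores features whose class differences are small relative to the threshold.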
Alternative approaches could be formulated by
means of regularization enforcing a certain level of
sparsity (Chen et al., 2012). Moreover, L2-LDA can
be derived in an alternative way as a Bayesian
estimator, or as the optimal method in the sense of
robust optimization (Xanthopoulos et al., 2013).
As future research, we plan to investigate suitable
choices of the target matrix T and to extend the
regularized Mahalanobis distance to the context of
cluster analysis. From the theoretical point of view,
the robustness of LDA regularized in the L1-norm
has not yet been inspected, nor have regularized
versions of the highly robust MWCD estimator
(Kalina, 2012). We plan to apply and compare
regularized versions of LDA on pattern recognition
problems in the analysis of 3D neuroimages of
spontaneous brain activity. There, we plan to exploit
the new L2-LDA without the usual sparseness
assumption, choosing T to model the high correlation
of neighboring voxels.
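To illustrate the role of the target matrix T, a shrinkage-based Mahalanobis distance can be sketched as follows. The convex-combination form S* = lam·T + (1 − lam)·S and the parameter name `lam` are illustrative assumptions here, not the paper's exact estimator or choice of shrinkage intensity.

```python
import numpy as np

def regularized_mahalanobis(x, mu, S, T, lam):
    """Mahalanobis distance based on a shrinkage covariance estimate.

    Sketch only: the empirical covariance S is shrunk toward a
    target matrix T as S* = lam * T + (1 - lam) * S, and the
    distance of x from the mean mu is computed with S*. For
    singular S (the high-dimensional case), a positive definite T
    with lam > 0 makes the distance well defined.
    """
    S_star = lam * T + (1.0 - lam) * S
    diff = x - mu
    return float(np.sqrt(diff @ np.linalg.solve(S_star, diff)))
```

With T equal to the identity matrix, the distance interpolates toward the Euclidean distance as lam approaches 1, while a spatially structured T (e.g. with larger entries for neighboring voxels) could encode the correlation structure of neuroimaging data mentioned above.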
ACKNOWLEDGEMENTS
The work was financially supported by the Neuron
Fund for Support of Science. The work of J. Kalina
was supported by the grant GA13-17187S of the
Czech Science Foundation. The work of J. Duintjer
Tebbens was supported by the grant GA13-06684S of
the Czech Science Foundation.
REFERENCES
Barlow, J., Bosner, N., and Drmac, Z. (2005). A new stable
bidiagonal reduction algorithm. Linear Algebra and
its Applications, 397:35–84.
Chen, X., Kim, Y., and Wang, Z. (2012). Efficient mini-
max estimation of a class of high-dimensional sparse
precision matrices. IEEE Transactions on Signal Pro-
cessing, 60:2899–2912.
Davies, P. (2014). Data Analysis and Approximate Mod-
els: Model Choice, Location-Scale, Analysis of Vari-
ance, Nonparametric Regression and Image Analysis.
Chapman & Hall/CRC, Boca Raton.
Duintjer Tebbens, J. and Schlesinger, P. (2007). Improving
implementation of linear discriminant analysis for the
high dimension/small sample size problem. Compu-
tational Statistics & Data Analysis, 52:423–437.
Filzmoser, P. and Todorov, V. (2011). Review of robust mul-
tivariate statistical methods in high dimension. Ana-
lytica Chimica Acta, 705:2–14.
Guo, Y., Hastie, T., and Tibshirani, R. (2007). Regularized
discriminant analysis and its application in microar-
rays. Biostatistics, 8:86–100.
Haff, L. (1980). Empirical Bayes estimation of the multi-
variate normal covariance matrix. Annals of Statistics,
8:586–597.
Hastie, T., Tibshirani, R., and Friedman, J. (2008). The
elements of statistical learning. Springer, New York,
2nd edition.
Kalina, J. (2012). Highly robust statistical methods in med-
ical image analysis. Biocybernetics and Biomedical
Engineering, 32(2):3–16.
Kalina, J. (2014). Classification analysis methods for
high-dimensional genetic data. Biocybernetics and
Biomedical Engineering, 34:10–18.
Kalina, J. and Zvárová, J. (2013). Decision support systems
in the process of improving patient safety. In E-health
Technologies and Improving Patient Safety: Explor-
ing Organizational Factors, pages 71–83. IGI Global,
Hershey.
Kogan, J. (2007). Introduction to clustering large and high-
dimensional data. Cambridge University Press, Cam-
bridge.
Pourahmadi, M. (2013). High-dimensional covariance es-
timation. Wiley, New York.
Schäfer, J. and Strimmer, K. (2005). A shrinkage approach
to large-scale covariance matrix estimation and impli-
cations for functional genomics. Statistical Applica-
tions in Genetics and Molecular Biology, 32:1–30.
Sreekumar, A. et al. (2009). Metabolomic profiles delineate
potential role for sarcosine in prostate cancer progres-
sion. Nature, 457:910–914.
Stein, C. (1956). Inadmissibility of the usual estimator for
the mean of a multivariate normal distribution. Pro-
ceedings of the Third Berkeley Symposium on Mathe-
matical Statistics and Probability, 1:197–206.
Tibshirani, R., Hastie, T., and Narasimhan, B. (2003). Class
prediction by nearest shrunken centroids, with ap-
plications to DNA microarrays. Statistical Science,
18:104–117.
Xanthopoulos, P., Pardalos, P., and Trafalis, T. (2013). Ro-
bust data mining. Springer, New York.