hidden layers. We also propose using the D-AR network
for multiple-output regression tasks. The discriminative
and predictive performance of the features extracted
with the proposed nonlinear D-AR network is compared
to that of the linear D-AR, CCA, and KCCA algorithms.
We use the random forest algorithm as the base
classifier. Experimental results on publicly available
emotion recognition and residential building datasets
show that the features of the nonlinear D-AR network
give significantly higher accuracies and lower errors
than those of KCCA on classification and regression
problems, respectively. Another important finding is
that although KCCA finds highly correlated covariates
on the training set, all versions of the D-AR network
achieve higher correlations on the test set than
CCA and KCCA, which is consistent with the test set
performance obtained on the supervised learning
tasks.
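The gap between training-set and test-set correlation can be illustrated with a minimal sketch. It uses plain linear CCA on synthetic two-view data, not the D-AR network or KCCA evaluated in this paper; the function names, the ridge term `reg`, the data generator, and the sample sizes are all assumptions chosen for illustration only.

```python
import numpy as np

def fit_linear_cca(X, Y, reg=1e-3):
    """Fit the first pair of linear CCA directions (hypothetical helper).

    A small ridge term `reg` keeps the covariance matrices invertible.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Whiten both views with their Cholesky factors, then take an SVD.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    T = np.linalg.solve(Lx, Sxy) @ np.linalg.inv(Ly).T
    U, s, Vt = np.linalg.svd(T)
    # Map the leading singular vectors back to the original coordinates.
    wx = np.linalg.solve(Lx.T, U[:, 0])
    wy = np.linalg.solve(Ly.T, Vt[0])
    return mx, my, wx, wy

def projected_correlation(model, X, Y):
    """Pearson correlation of the two views after projection."""
    mx, my, wx, wy = model
    a = (X - mx) @ wx
    b = (Y - my) @ wy
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic data (assumption): both views share one latent variable z.
rng = np.random.default_rng(0)
d = 20
load_x = np.full((1, d), 0.3)
load_y = np.full((1, d), 0.3)

def make_views(n):
    z = rng.normal(size=(n, 1))
    X = z @ load_x + rng.normal(size=(n, d))
    Y = z @ load_y + rng.normal(size=(n, d))
    return X, Y

X_train, Y_train = make_views(60)    # few samples relative to dimension
X_test, Y_test = make_views(500)

model = fit_linear_cca(X_train, Y_train)
train_corr = projected_correlation(model, X_train, Y_train)
test_corr = projected_correlation(model, X_test, Y_test)
print(f"train correlation: {train_corr:.3f}")
print(f"test correlation:  {test_corr:.3f}")
```

With few training samples relative to the feature dimension, the fitted directions tend to show a noticeably higher correlation on the training set than on held-out data, which is why the test-set correlation is the more informative quantity when comparing methods.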
As a future research direction, advanced regularization
techniques can be applied to both KCCA and the
proposed network to improve their robustness against
outliers. The robustness of KCCA can be improved
using a reduced kernel method, while the proposed
method can be improved using a weight decay
mechanism or another backpropagation algorithm
such as resilient backpropagation with weight
backtracking.
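As a rough illustration of the weight decay mechanism mentioned above, the following sketch applies an L2 penalty inside a plain gradient-descent update. It uses a linear least-squares model on synthetic data rather than the proposed network; the helper names, learning rate, and decay coefficient are assumptions made for this example.

```python
import numpy as np

def gradient_step(w, grad, lr=0.05, weight_decay=0.0):
    """One update with L2 weight decay: w <- w - lr * (grad + weight_decay * w)."""
    return w - lr * (grad + weight_decay * w)

# Synthetic regression data (assumption, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ true_w + 0.1 * rng.normal(size=100)

def train(weight_decay, steps=500, lr=0.05):
    """Full-batch gradient descent on the mean squared error."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = gradient_step(w, grad, lr=lr, weight_decay=weight_decay)
    return w

w_plain = train(weight_decay=0.0)
w_decay = train(weight_decay=0.5)
print("weight norm without decay:", np.linalg.norm(w_plain))
print("weight norm with decay:   ", np.linalg.norm(w_decay))
```

The decay term pulls every weight toward zero at each step, so the penalized solution has a smaller norm. In a network trained by backpropagation the same term would be added to each layer's weight gradient.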
ACKNOWLEDGEMENTS
This research has been supported by the Turkish
Scientific and Technological Research Council
(TUBITAK) under project 215E008.
DATA 2018 - 7th International Conference on Data Science, Technology and Applications