# LINEAR PROJECTION METHODS - An Experimental Study for Regression Problems

### Carlos Pardo-Aguilar, José F. Diez-Pastor, Nicolás García-Pedrajas, Juan J. Rodríguez, César García-Osorio

#### Abstract

Reducing the dimension of a data set is of interest in two contexts. The first arises when the aim is to mitigate the curse of dimensionality because the data set will be used to train a data mining algorithm with a heavy computational load. The second arises when one wishes to identify the attributes most strongly related to the class, in a classification problem, or to the value to be predicted, in a regression problem. Recently, several linear projection methods for regression have been proposed that attempt to preserve the directions showing the highest correlation with the value to be predicted: Localized Sliced Inverse Regression, Weighted Principal Component Analysis and Linear Discriminant Analysis for regression. However, the papers that presented these methods validate them on only a small number of data sets. In this work, a more exhaustive study is conducted using 30 data sets. Moreover, by applying the ideas behind these methods, three new methods are presented and included in the comparative study, one of which is competitive with the recently proposed methods.
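To make the idea of a supervised linear projection for regression concrete, the following is a minimal sketch of classical sliced inverse regression, the technique that Localized Sliced Inverse Regression builds on. It is not the implementation evaluated in this study: the function name `sir_directions`, its parameters, and the synthetic data are illustrative assumptions, and the sketch assumes the sample covariance of the inputs is full rank.

```python
import numpy as np


def sir_directions(X, y, n_slices=5, n_components=1):
    """Estimate projection directions with classical sliced inverse regression.

    Hypothetical helper for illustration; assumes a full-rank covariance.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Whiten the inputs so their sample covariance becomes the identity.
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T
    Z = (X - mean) @ inv_sqrt

    # Slice the instances by sorted target value and average Z in each slice.
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Leading eigenvectors of the weighted covariance of slice means,
    # mapped back to the original input scale and normalised to unit length.
    _, vecs = np.linalg.eigh(M)
    top = vecs[:, ::-1][:, :n_components]
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)


if __name__ == "__main__":
    # Synthetic regression problem: y depends on X only through X @ beta.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 5))
    beta = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2.0)
    y = (X @ beta) ** 3 + 0.1 * rng.normal(size=2000)
    B = sir_directions(X, y)
    print("alignment with true direction:", abs(float(B[:, 0] @ beta)))
```

Projecting the data with `X @ B` then gives a lower-dimensional input for a regressor. The recovered direction is defined only up to sign, which is why the demo reports the absolute value of the alignment.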

#### Paper Citation

#### in Harvard Style

Pardo-Aguilar C., Diez-Pastor J. F., García-Pedrajas N., Rodríguez J. J. and García-Osorio C. (2012). **LINEAR PROJECTION METHODS - An Experimental Study for Regression Problems**. In *Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM*, ISBN 978-989-8425-98-0, pages 198-204. DOI: 10.5220/0003763301980204

#### in Bibtex Style

@conference{icpram12,

author={Carlos Pardo-Aguilar and José F. Diez-Pastor and Nicolás García-Pedrajas and Juan J. Rodríguez and César García-Osorio},

title={LINEAR PROJECTION METHODS - An Experimental Study for Regression Problems},

booktitle={Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},

year={2012},

pages={198-204},

publisher={SciTePress},

organization={INSTICC},

doi={10.5220/0003763301980204},

isbn={978-989-8425-98-0},

}

#### in EndNote Style

TY - CONF

JO - Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM

TI - LINEAR PROJECTION METHODS - An Experimental Study for Regression Problems

SN - 978-989-8425-98-0

AU - Pardo-Aguilar C.

AU - Diez-Pastor J. F.

AU - García-Pedrajas N.

AU - Rodríguez J. J.

AU - García-Osorio C.

PY - 2012

SP - 198

EP - 204

DO - 10.5220/0003763301980204