Interesting Regression- and Model Trees Through Variable Restrictions
Rikard König, Ulf Johansson, Ann Lindqvist, Peter Brattberg
2015
Abstract
The purpose of this paper is to suggest a new technique for creating interesting regression and model trees. Interesting models are here defined as models that fulfill some domain-dependent restriction on how variables can be used in the models. The suggested technique, named ReReM, is an extension of M5 that can enforce variable constraints while creating regression and model trees. To evaluate ReReM, two case studies were conducted: the first concerned modeling of golf player skill, and the second modeling of fuel consumption in trucks. Both case studies had variable constraints, defined by domain experts, that had to be fulfilled for models to be deemed interesting. When used for modeling golf player skill, ReReM created regression trees that were slightly less accurate than M5’s regression trees. However, the models created with ReReM were deemed interesting by a golf teaching professional, while the M5 models were not. In the second case study, ReReM was evaluated against M5’s model trees and a semi-automated approach often used in the automotive industry. Here, experiments showed that ReReM could achieve a predictive performance comparable to M5 and clearly better than the semi-automated approach, while fulfilling the constraints regarding interesting models.
Paper Citation
in Harvard Style
König R., Johansson U., Lindqvist A. and Brattberg P. (2015). Interesting Regression- and Model Trees Through Variable Restrictions. In Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015) ISBN 978-989-758-158-8, pages 281-292. DOI: 10.5220/0005600302810292
in Bibtex Style
@conference{kdir15,
author={Rikard König and Ulf Johansson and Ann Lindqvist and Peter Brattberg},
title={Interesting Regression- and Model Trees Through Variable Restrictions},
booktitle={Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015)},
year={2015},
pages={281-292},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0005600302810292},
isbn={978-989-758-158-8},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, (IC3K 2015)
TI - Interesting Regression- and Model Trees Through Variable Restrictions
SN - 978-989-758-158-8
AU - König R.
AU - Johansson U.
AU - Lindqvist A.
AU - Brattberg P.
PY - 2015
SP - 281
EP - 292
DO - 10.5220/0005600302810292