this description, it would be fairly easy for a teaching
professional to spot the deficiencies of the swing and
suggest drills to improve these areas. The purely
data-driven model had superior predictive performance
compared to ReReM, but was mainly based on variables
related to the ball flight. Consequently, further
analysis would be required to suggest exercises, and
hence the model was deemed less interesting.
The purpose of the second case study was to create
better decision support for the coaching of truck
drivers. Here, ReReM was compared to a manual subset
modelling approach often used in practice. More
specifically, nine subsets were created manually using
domain knowledge and statistics, based on the average
speed and total weight of the trucks. When restricted
to the same constraints as the manual approach, ReReM
could increase the predictive performance slightly by
creating more subsets. An important point is that while
the manual approach is very time-consuming for human
experts (at least one man-day was needed), the
corresponding task could be performed within a few
minutes using ReReM.
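As a rough illustration of the manual subset modelling approach described above, the sketch below partitions observations on average speed and total weight and fits one constant model per subset. It is a minimal sketch under stated assumptions, not the procedure used in the case study: the 3 x 3 grid layout, the cut points, the function names, and the use of a per-subset mean as the model are all hypothetical, since the text only states that nine subsets were built from these two variables.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean

# Hypothetical cut points; the actual thresholds are not given in the text.
SPEED_CUTS = [40.0, 70.0]   # average speed, assumed unit km/h
WEIGHT_CUTS = [20.0, 40.0]  # total weight, assumed unit tonnes


def subset_id(avg_speed, total_weight):
    """Map an observation to one of nine cells in an assumed 3 x 3 grid."""
    speed_bin = bisect_right(SPEED_CUTS, avg_speed)
    weight_bin = bisect_right(WEIGHT_CUTS, total_weight)
    return speed_bin * 3 + weight_bin


def fit_subset_means(rows):
    """Fit one constant (mean) model per non-empty subset.

    rows: iterable of (avg_speed, total_weight, target) tuples.
    """
    grouped = defaultdict(list)
    for avg_speed, total_weight, target in rows:
        grouped[subset_id(avg_speed, total_weight)].append(target)
    return {key: mean(values) for key, values in grouped.items()}


def predict(models, avg_speed, total_weight, fallback):
    """Predict with the subset's mean, or a fallback for unseen subsets."""
    return models.get(subset_id(avg_speed, total_weight), fallback)


def mean_absolute_error(models, rows, fallback):
    """Average absolute deviation between targets and subset predictions."""
    rows = list(rows)
    return mean(abs(t - predict(models, s, w, fallback)) for s, w, t in rows)
```

Choosing the cut points and checking each subset by hand is the part that costs expert time in the manual approach; an automated method searches over such partitions instead.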
The main advantage of ReReM was, however,
demonstrated when restrictions set by engineers were
enforced. Here, the same constraints as for the manual
approach applied, except that more variables were
considered. In this experiment, ReReM created models
with significantly lower RMAE than the manual approach,
while still producing interesting models. In addition,
when compared to the purely data-driven approach, ReReM
actually had slightly higher predictive performance
while, in contrast to the data-driven approach, still
producing interesting models.
Finally, the complexity of the ReReM models was
slightly higher, i.e., the paths in the tree typically in-
cluded one or possibly two more conditions, but in
practice this would most likely be a small price to pay
for a more interesting model with high predictive per-
formance.
ACKNOWLEDGEMENTS
This work was supported by the Knowledge Founda-
tion through the project Big Data Analytics by Online
Ensemble Learning (20120192) and by Region Västra
(VGR) under grant RUN 612-0198-13, University of
Borås and University of Skövde.
KDIR 2015 - 7th International Conference on Knowledge Discovery and Information Retrieval