Boosted Random Forest
Yohei Mishina, Masamitsu Tsuchiya and Hironobu Fujiyoshi
Department of Computer Science, Chubu University, 1200 Matsumoto-cho, Kasugai, Aichi, Japan
Keywords: Boosting, Random Forest, Machine Learning, Pattern Recognition.
Abstract:
Random forests achieve higher generalization ability than other multi-class classifiers owing to the effects of bagging and feature selection. However, because a random forest based on ensemble learning requires many decision trees to obtain high performance, it is not suitable for implementation on small-scale hardware such as embedded systems. In this paper, we propose a boosted random forest in which a boosting algorithm is introduced into the random forest. Experimental results show that the proposed method, which consists of fewer decision trees, has higher generalization ability compared with the conventional method.
1 INTRODUCTION
Random forest (Breiman, 2001) is a multi-class classifier that is robust against noise, has high discrimination performance, and is capable of training and classifying at high speed. It has therefore attracted attention in many fields, including computer vision, pattern recognition, and machine learning (Amit and Geman, 1997; Lepetit and Fua, 2006; Shotton and Cipolla, 2008; Shotton et al., 2011; Gall et al., 2011). Random forest controls the loss of generalization caused by overfitting during training by introducing randomness through bagging and feature selection (Ho, 1998) when constructing an ensemble of decision trees. Each decision tree in a random forest is independent, so high speed can be attained by parallel processing in both tree training and classification. Boosting (Freund and Schapire, 1995), in contrast to the independent decision trees of random forest, is a typical ensemble learning algorithm that constructs classifiers sequentially. It combines weak learners, which individually have low discrimination performance, into a classifier with higher discrimination performance. Boosting generally attains high discrimination performance through sequential training in which the training samples misclassified by the previous classifier are emphasized so that the subsequent classifier learns to classify them correctly; a minimal sketch of this reweighting scheme is given below.
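As a concrete illustration of this reweighting scheme, the following is a minimal sketch of discrete AdaBoost (Freund and Schapire, 1995) with decision stumps as weak learners. It is background on the classical algorithm, not the method proposed in this paper; the function names and the stump weak learner are our own illustrative choices.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def make_stump():
    return DecisionTreeClassifier(max_depth=1)  # weak learner

def adaboost_train(X, y, make_learner=make_stump, n_rounds=10):
    """Discrete AdaBoost; labels y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)               # start with uniform weights
    learners, alphas = [], []
    for _ in range(n_rounds):
        h = make_learner()
        h.fit(X, y, sample_weight=w)      # train on current weights
        pred = h.predict(X)
        err = np.clip(np.sum(w[pred != y]), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * pred)    # up-weight misclassified samples
        w /= w.sum()                      # renormalize
        learners.append(h)
        alphas.append(alpha)
    return learners, alphas

def adaboost_predict(X, learners, alphas):
    return np.sign(sum(a * h.predict(X) for h, a in zip(learners, alphas)))
```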
to overfit the training sample. Random forest, on the
other hand, uses randomness in the construction of the
decision trees, thus avoiding overfitting to the training
sample. For that reason, a large number of decision
trees must be constructed to obtain high generality.
However, increasing the number of decision trees in-
creases the memory requirements, so that approach is
not suited to implementation on small-scale hardware
such as embedded system. We therefore propose a
boosted random forest method, in which a boosting
algorithm is introduced in random forest. The pro-
posed method constructs complementary classifiers
by successive decision tree construction and can yield
classifiers with smaller decision trees while maintain-
ing discrimination performance.
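To make the idea concrete, the sketch below shows one plausible way to combine the two schemes: replace the decision stump in the AdaBoost sketch above with a shallow, randomized tree trained on the current boosting weights, so that trees are constructed sequentially and complement one another. This is an assumption-laden illustration of the general idea, not the authors' exact algorithm, which is specified later in the paper; the depth, max_features, and splitter settings are illustrative.

```python
from sklearn.tree import DecisionTreeClassifier

def make_random_tree():
    # shallow tree with a random feature subset and random thresholds
    # at each split, mimicking a random-forest weak learner (assumption)
    return DecisionTreeClassifier(max_depth=4, max_features="sqrt",
                                  splitter="random")

# X, y: training data with labels in {-1, +1}, reusing the sketch above
learners, alphas = adaboost_train(X, y, make_learner=make_random_tree,
                                  n_rounds=20)
```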
2 RANDOM FOREST
Random forest is an ensemble learning algorithm that constructs multiple decision trees. It suppresses overfitting to the training samples by randomly selecting the training samples used for tree construction, in the same way as bagging (Breiman, 1996; Breiman, 1999), resulting in a classifier that is robust against noise. Also, random selection of the features used at the splitting nodes enables fast training even when the dimensionality of the feature vector is large.
2.1 Training Process
In the training of a random forest, bagging is used to create sample subsets by random sampling from the training samples, and each subset is used to construct one decision tree. At a splitting node n, the sample set S_n is split into sample sets S_l and S_r by comparing the value of a randomly selected feature with a threshold.
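The following is a minimal sketch of this training step under common random-forest conventions: a bootstrap subset is drawn for each tree, and each node tests randomly selected (feature, threshold) pairs to split S_n into S_l and S_r. The helper names and the information-gain criterion are our own illustrative assumptions, not notation from this paper.

```python
import numpy as np

def bootstrap_subset(X, y, rng):
    """Bagging: draw |S| samples with replacement for one tree."""
    idx = rng.integers(0, len(y), size=len(y))
    return X[idx], y[idx]

def entropy(y):
    # y: non-negative integer class labels
    p = np.bincount(y) / len(y)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def information_gain(y, y_l, y_r):
    return entropy(y) - (len(y_l) * entropy(y_l)
                         + len(y_r) * entropy(y_r)) / len(y)

def split_node(X_n, y_n, n_candidates=10, rng=None):
    """Split sample set S_n into S_l and S_r at one node by testing
    randomly selected (feature, threshold) pairs."""
    if rng is None:
        rng = np.random.default_rng()
    best = None
    for _ in range(n_candidates):
        f = rng.integers(X_n.shape[1])      # random feature index
        t = rng.choice(X_n[:, f])           # random threshold
        left = X_n[:, f] < t                # boolean mask selecting S_l
        if left.all() or not left.any():
            continue                        # degenerate split, skip
        gain = information_gain(y_n, y_n[left], y_n[~left])
        if best is None or gain > best[0]:
            best = (gain, f, t, left)
    return best  # (gain, feature, threshold, left mask), or None
```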