Here, we consider using the gaze distribution
when generating a random forest (Breiman, 2001). As
shown by Rokach (2016), a random forest consisting
of many decision trees can achieve high classification
performance in various applications. The process of
generating the decision trees is based on randomness
according to the uniform distribution. In this paper,
we consider tuning the randomness according to the
gaze distribution instead of the uniform distribution.
To do this, we use a weighted random forest
(Amaratunga et al., 2008; Winham et al., 2013; Maudes
et al., 2012). Amaratunga et al. assigned large weights
to the training samples contributing most to the
classification performance. Winham et al. assigned large
weights to the votes of the decision trees that
contributed most to the classification performance.
However, these existing methods are not easily adapted to
include a gaze distribution because they do
not consider the positions of features in the pedestrian
images. Maudes et al. assigned random weights to
features and information gains when generating the
decision trees to increase noise tolerance. The features
and information gains are closely tied to positions
in pedestrian images. However, their method
simply used random weights and did not consider a
gaze distribution.
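The difference between uniform randomness and randomness guided by a gaze distribution can be illustrated with a minimal sketch. It is not the procedure of this paper: the 4x3 `GAZE_MAP`, the function name `sample_candidate_positions`, and the choice of sampling without replacement are assumptions made only for illustration.

```python
import random

def sample_candidate_positions(positions, n_candidates, rng, weights=None):
    """Draw distinct candidate feature positions for one tree node.

    weights=None gives uniform sampling; otherwise each position is drawn
    with probability proportional to its weight (an assumed gaze value).
    """
    if weights is None:
        return rng.sample(positions, n_candidates)
    pool = list(zip(positions, weights))
    chosen = []
    for _ in range(n_candidates):
        # Weighted draw of one index, then remove it so picks stay distinct.
        idx = rng.choices(range(len(pool)), weights=[w for _, w in pool], k=1)[0]
        chosen.append(pool.pop(idx)[0])
    return chosen

# Hypothetical gaze map over a tiny 4x3 pedestrian image (rows x columns).
# Larger values mean observers looked at that position more often.
GAZE_MAP = [
    [0.0, 0.1, 0.0],
    [0.1, 0.5, 0.1],
    [0.0, 0.2, 0.0],
    [0.0, 0.0, 0.0],
]
positions = [(r, c) for r in range(4) for c in range(3)]
gaze_weights = [GAZE_MAP[r][c] for r, c in positions]

rng = random.Random(0)
uniform_pick = sample_candidate_positions(positions, 3, rng)
gaze_pick = sample_candidate_positions(positions, 3, rng, weights=gaze_weights)
print("uniform:", uniform_pick)
print("gaze-weighted:", gaze_pick)
```

With uniform sampling, any position (including background corners) may be drawn. With the assumed gaze weights, positions that observers never looked at are never drawn.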
To this end, we hypothesize that the features and
information gains of the random forest depend on
a gaze distribution that considers the frequent gaze
locations of observers. We propose a method to
correctly classify gender by generating a weighted
random forest using a gaze distribution on training
samples that contain a background bias. To design
this novel method of generating a random forest,
we investigated the following alternatives: assigning
weights for feature selection, assigning weights for
feature values, and assigning weights for the information
gains. We evaluated the accuracy of gender
classification using these alternatives on a publicly
available dataset. We confirmed that our method of
assigning weights to the information gains outperformed
the other methods.
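As a rough illustration of what assigning weights to information gains could look like, the sketch below multiplies the information gain of each candidate feature by a gaze weight before choosing the split. The multiplicative form, the feature names (`background_fence`, `torso_shape`), and the weight values are assumptions for this example and are not taken from the method described later in the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting on values < threshold."""
    left = [y for x, y in zip(values, labels) if x < threshold]
    right = [y for x, y in zip(values, labels) if x >= threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))

def best_split(features, labels, gaze_weight, threshold=0.5):
    """Return the feature with the largest gaze-weighted information gain."""
    scores = {
        name: gaze_weight[name] * information_gain(vals, labels, threshold)
        for name, vals in features.items()
    }
    return max(scores, key=scores.get), scores

# Toy data: the fence feature separates the classes perfectly (a bias),
# while the body feature is informative but noisy.
labels = ["m", "m", "f", "f"]
features = {
    "background_fence": [1, 1, 0, 0],
    "torso_shape": [1, 1, 0, 1],
}
no_weight = {"background_fence": 1.0, "torso_shape": 1.0}
# Assumed gaze weights: observers rarely look at the background.
gaze = {"background_fence": 0.05, "torso_shape": 0.9}

plain_choice, _ = best_split(features, labels, no_weight)
gaze_choice, _ = best_split(features, labels, gaze)
print(plain_choice, gaze_choice)
```

Without weighting, the split falls on the background feature because it has the highest raw gain. With the assumed gaze weights, the split moves to the body feature even though its raw gain is lower.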
2 BACKGROUND BIAS IN TRAINING SAMPLES
Training samples collected for gender classification
may contain specific objects in the background
surrounding the pedestrians, thereby introducing a bias.
Here, we discuss a case in which the training samples
showing males contain a fence in the background
while the training samples showing females do not,
as shown in Figure 1. This case may be prevalent
when many females appear in the vicinity of a certain
camera (e.g., near a cosmetics counter), and many
males appear in the vicinity of another camera (e.g.,
around a menswear section). In our preliminary
experiments, we observed that the accuracy of gender
classification declined when using training samples
containing a background bias (e.g., the presence or
absence of a fence). The random forest gender
classifier treated the background bias as a discriminative
feature rather than learning the true differences in
physical appearance. For example, a test sample of a
female with a fence in front was incorrectly classified
as male. Avoiding this problem generally requires a
large number of training samples containing various
backgrounds for both genders. When the background
is obviously biased, we could modify the pedestrian
image collection strategy. In some cases, however, an
unexpected bias is found only after sample collection
is complete, based on the outputs of a gender
classifier. Because the collection of training samples
is very time-consuming, we may need to use the
collected training samples despite their bias.
Therefore, our method aims to correctly classify gender
using a weighted random forest incorporating a gaze
distribution when the training samples contain a
background bias.

Figure 1: Examples of training samples containing a bias
from the background surrounding the pedestrians.
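The failure case described above can be reproduced on toy data. The sketch below trains a single decision stump (not a full random forest) on biased samples. The binary features `fence` and `long_hair`, the sample values, and the function `train_stump` are hypothetical and serve only to show how a background feature can dominate.

```python
def train_stump(samples, labels):
    """Pick the binary feature that best predicts the training labels.

    Returns the feature name and a rule mapping feature value -> label.
    """
    best = None
    for name in samples[0]:
        rule = {}
        for v in (0, 1):
            # Majority label among training samples with this feature value.
            group = [y for s, y in zip(samples, labels) if s[name] == v]
            rule[v] = max(set(group), key=group.count) if group else None
        correct = sum(rule[s[name]] == y for s, y in zip(samples, labels))
        if best is None or correct > best[2]:
            best = (name, rule, correct)
    return best[0], best[1]

# Biased training set: every male sample has a fence, no female sample does.
train_samples = [
    {"fence": 1, "long_hair": 0},
    {"fence": 1, "long_hair": 0},
    {"fence": 1, "long_hair": 1},
    {"fence": 0, "long_hair": 1},
    {"fence": 0, "long_hair": 1},
    {"fence": 0, "long_hair": 0},
]
train_labels = ["male", "male", "male", "female", "female", "female"]

feature, rule = train_stump(train_samples, train_labels)

# Test sample: a female pedestrian who happens to appear with a fence.
test_sample = {"fence": 1, "long_hair": 1}
prediction = rule[test_sample[feature]]
print(feature, prediction)
```

The stump selects the fence feature because it separates the biased training set perfectly, so the female test sample with a fence is predicted as male.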
3 WEIGHTED RANDOM FOREST USING A GAZE DISTRIBUTION
3.1 Overview of Decision Tree Generation
The existing method for generating a random forest
(Breiman, 2001) is as follows. Subsets of trai-
VISAPP 2019 - 14th International Conference on Computer Vision Theory and Applications