reproduction, and evolutionary science (Sun et al.,
2004). GA continues to prove itself successful in
many fields, including object detection. Other optimization methods also perform well; however, in several reported experiments GA has proven to perform better. This can be attributed to the advantages of GA: it is probabilistic rather than deterministic, it is better at avoiding getting stuck in local maxima, and it is easily parallelizable. Ferri et al. (1994) compared GA against
sequential search and their results clearly show that
GA performs better. Their work highlighted a key strength of GA: its ability to focus the search in a near-optimal region thanks to the randomization inherent in the search. Tabassum and
Mathew (2014) stated that genetic algorithms have proved to be the most powerful unbiased optimiza-
tion techniques for sampling a large solution space.
After applying GA to the Knapsack problem and to image optimization, they concluded in their paper that GA is well suited to solving various common problems, and in particular high-complexity problems such as combinatorial optimization. Sun et al. (2004) provided further evidence of the
strength of GA, when they used it to select the best
eigenvectors. In their work GA was used to solve the
problem of selecting the best feature set. They com-
pared their results with other techniques and achieved better accuracy with fewer features. Lillywhite et al. (2013) used genetic algorithms to construct features, which were then used by Adaboost to train a classifier. They tested their ap-
proach against previously published papers and used
the same dataset for comparison. Their technique
proved to be significantly more accurate than most of
the previous work they compared against. Some re-
searchers used GA in feature selection. Feature selec-
tion methods can be divided into three main categories: wrappers, filters, and embedded methods. Filters pre-select features by scoring each feature on its own, without reference to the chosen predictor. Wrappers score the predictive power of a subset of features by using a machine learning technique as a black box, while embedded methods integrate feature selection and learning into a single process (Chouaib et al., 2008; Xue et al., 2015). In their
work, Chouaib et al. (2008) aimed to find the set of
the most representative features using GAs, in order
to decrease the detection time. Their results showed that for the majority of descriptors the feature set was significantly reduced, by up to 75% of the original set, in two-class problems. Dezhen and Kai (2008) provided a post-optimization technique to avoid the
redundancy of classifiers. By doing so, they managed
to increase the speed of classification by 110% due to
reducing the number of features to 55% of the original
set. Since this is a post-optimization process, it can be considered an addition to the training process, and therefore an overhead on the training time. Xue
et al. (2015) provided a survey on the use of evolu-
tionary computing in feature selection. In their work
they surveyed more than 40 papers which use GA in
feature selection.
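To make the wrapper category concrete, the following is a minimal sketch of GA-based wrapper feature selection: each chromosome is a bit mask over the features, and its fitness is the cross-validated accuracy of a black-box classifier trained on the selected subset. The function names (fitness, evolve), the choice of k-nearest neighbours as the black box, and the GA parameters are illustrative assumptions, not taken from the surveyed papers.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

def fitness(mask, X, y):
    # Wrapper fitness: accuracy of a black-box learner on the selected features only.
    if not mask.any():
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=3)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

def evolve(X, y, pop_size=20, generations=30, p_mut=0.05):
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5           # random bit-mask chromosomes
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[:pop_size // 2]]        # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)                # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n) < p_mut          # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents] + children)
    scores = np.array([fitness(ind, X, y) for ind in pop])
    return pop[scores.argmax()]                     # best feature mask found

A call such as best_mask = evolve(X, y) returns a boolean mask that can be used to index the feature columns of the dataset.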
Object detection is a major area of research in computer vision. It belongs to the class of problems that suffer from a time-consuming training process, due to the huge search space involved. Viola and Jones
(2001) devised a new face detector using Haar fea-
tures, since features provide a set of comprehensive
information that can be learned by machine learning
algorithms. Features also reduce the within-class variability compared to that of the raw pixels (Lillywhite et al., 2013; Viola and Jones, 2001). Haar features are rectangles divided into black and white regions; the value of a feature is calculated by subtracting the sum of the pixels in the white region from the sum of those in the black region (Viola and Jones,
2001). For each image, variations of each of the four
Haar feature types are computed in all possible sizes
and all possible locations, which provides a huge set
of features.
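The value of a single Haar feature can be computed efficiently with an integral image, as in Viola and Jones (2001), so that any rectangle sum costs at most four array look-ups. The sketch below is a minimal illustration of one horizontal two-rectangle feature; the helper names and the example coordinates are illustrative assumptions rather than code from the original detector.

import numpy as np

def integral_image(img):
    # ii[r, c] = sum of all pixels above and to the left of (r, c), inclusive.
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r, c, h, w):
    # Sum of the pixels in the rectangle with top-left corner (r, c), height h and
    # width w, using at most four look-ups into the integral image.
    total = ii[r + h - 1, c + w - 1]
    if r > 0:
        total -= ii[r - 1, c + w - 1]
    if c > 0:
        total -= ii[r + h - 1, c - 1]
    if r > 0 and c > 0:
        total += ii[r - 1, c - 1]
    return total

def two_rect_feature(ii, r, c, h, w):
    # Horizontal two-rectangle feature: black (right) half minus white (left) half,
    # matching the description above.
    half = w // 2
    white = rect_sum(ii, r, c, h, half)
    black = rect_sum(ii, r, c + half, h, half)
    return black - white

window = np.random.rand(24, 24)   # a 24x24 detection window, as used by Viola and Jones
ii = integral_image(window)
value = two_rect_feature(ii, r=4, c=6, h=8, w=10)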
The authors chose Adaboost as a method to obtain
their strong classifier. Adaboost was proposed by Fre-
und and Schapire (1995); it has the ability to search through the features, select those that perform well, and combine them to create a strong classifier.
The general idea of the algorithm works as follows (a simplified code sketch is given after this list).
For a number of iterations T:
• Pass through the set of all possible features and calculate the error of each one on the given training images.
• Choose the best feature (the one with the lowest error) as the weak classifier for this iteration.
• Update the weights of the sample images, putting more weight on the wrongly classified images.
• Proceed to the next iteration, until the set of best features to be used in classification has been found.
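The loop below is a minimal sketch of this selection procedure, assuming each feature is turned into a simple threshold-based weak classifier (a decision stump) over precomputed feature values. The names adaboost_select and strong_classify, the crude mean threshold, and the omission of the optimal per-feature threshold search are simplifications, not the exact procedure of Viola and Jones (2001).

import numpy as np

def adaboost_select(F, y, T):
    # F: (n_samples, n_features) matrix of precomputed feature values,
    # y: labels in {-1, +1}, T: number of boosting rounds.
    n, m = F.shape
    w = np.full(n, 1.0 / n)                      # start with uniform sample weights
    chosen = []                                  # (feature index, threshold, polarity, alpha)
    for _ in range(T):
        best = None
        for j in range(m):                       # pass through every candidate feature
            thr = F[:, j].mean()                 # crude threshold choice, for the sketch only
            for polarity in (1, -1):
                pred = np.where(polarity * (F[:, j] - thr) > 0, 1, -1)
                err = w[pred != y].sum()         # weighted error of this weak classifier
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity, pred)
        err, j, thr, polarity, pred = best
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # guard against degenerate errors
        alpha = 0.5 * np.log((1.0 - err) / err)  # weight of the selected weak classifier
        w *= np.exp(-alpha * y * pred)           # increase weights of misclassified images
        w /= w.sum()
        chosen.append((j, thr, polarity, alpha))
    return chosen

def strong_classify(chosen, F):
    # Strong classifier: sign of the weighted vote of the selected weak classifiers.
    score = np.zeros(F.shape[0])
    for j, thr, polarity, alpha in chosen:
        score += alpha * np.where(polarity * (F[:, j] - thr) > 0, 1, -1)
    return np.sign(score)

The exhaustive pass over all candidate features in every round is what makes this training so time consuming, and it is this selection step that the present work seeks to accelerate with a genetic algorithm.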
One of the important contributions of Viola and Jones (2001) is the cascade classifier, which increased the accuracy while radically reducing the time consumed in detection. The cascade classifier is a multi-stage classifier in which the thresholds vary across stages. The first stages have a low threshold, thus detecting all the true positives while eliminating the obvious negatives, before more complex classifiers are used to achieve less