gio developed a distribution-based approach for FD, which was the first accurate
appearance-based method [4]. Training examples are gathered by creating virtual faces
and by bootstrapping. Each face and non-face pattern is normalized using masking,
illumination gradient correction and histogram equalization. All training patterns are
grouped into six face and six non-face clusters. Euclidean and normalized Mahalanobis
distances are computed between an input image pattern and the prototype clusters. A
multilayer perceptron network is then applied to separate face window patterns from
non-face patterns using the distances to each face and non-face cluster.
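The distance-based feature vector described above can be illustrated with a minimal sketch. It rests on assumptions that are not taken from [4]: each cluster is summarized by a mean vector and a diagonal covariance (a simplification chosen so that the example needs no matrix inversion), the normalized Mahalanobis distance is taken in its usual Gaussian log-likelihood form, and the function names are hypothetical. With six face and six non-face clusters and two distances per cluster, the classifier input has 24 values.

```python
import math

def euclidean(x, mu):
    """Plain Euclidean distance between a pattern and a cluster centre."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, mu)))

def normalized_mahalanobis_diag(x, mu, var):
    """Normalized Mahalanobis distance, assuming a DIAGONAL covariance.

    Form used here (an assumption of this sketch):
        0.5 * (d * ln(2*pi) + ln|Sigma| + (x - mu)^T Sigma^-1 (x - mu))
    where var holds the diagonal entries of Sigma.
    """
    d = len(x)
    log_det = sum(math.log(v) for v in var)
    quad = sum((a - b) ** 2 / v for a, b, v in zip(x, mu, var))
    return 0.5 * (d * math.log(2 * math.pi) + log_det + quad)

def distance_features(x, clusters):
    """Build the classifier input: two distances per prototype cluster.

    clusters is a list of (mean, variances) pairs, face clusters first.
    """
    feats = []
    for mu, var in clusters:
        feats.append(normalized_mahalanobis_diag(x, mu, var))
        feats.append(euclidean(x, mu))
    return feats

# Toy usage: 12 identical 2-D clusters stand in for 6 face + 6 non-face prototypes.
toy_clusters = [([0.0, 0.0], [1.0, 1.0])] * 12
features = distance_features([3.0, 4.0], toy_clusters)
```

The resulting `features` list is what a multilayer perceptron would receive as input in place of the raw window pixels.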
The first advanced neural network-based approach that reported results on a large and
difficult dataset was by Rowley et al. [5]. It became the de facto standard for
comparison with other upright frontal FD approaches. Their system incorporates face
knowledge in a retinally connected neural network that looks at windows of 20x20
pixels. In their single neural network implementation, there are two copies of a hidden
layer with 26 units, where 4 units look at 10x10 pixel sub-regions, 16 look at 5x5
sub-regions, and 6 look at overlapping 20x5 pixel horizontal stripes. The input window
is pre-processed as in Sung and Poggio’s system [4]. The image is scanned with a moving
20x20 window at every possible position and scale, with a subsampling factor of 1.2. To
reduce the number of false alarms, they combine multiple neural networks with an
arbitration strategy. The fast version of the FD system uses an extra neural network
that scans the image for face candidates with a 30x30 pixel window and a 10 pixel step;
the candidates are then passed to the verification neural network.
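To give a feel for the cost of exhaustive scanning, the following sketch enumerates the pyramid levels obtained by repeatedly subsampling an image by a factor of 1.2 and counts the 20x20 windows at each level. The function names, the rounding of level sizes and the `step` parameter are assumptions made for illustration and are not taken from [5].

```python
def pyramid_sizes(width, height, window=20, factor=1.2):
    """Return the (width, height) of every pyramid level that still fits a window."""
    sizes = []
    level = 0
    while True:
        w = int(round(width / factor ** level))
        h = int(round(height / factor ** level))
        if w < window or h < window:
            break
        sizes.append((w, h))
        level += 1
    return sizes

def count_windows(width, height, window=20, factor=1.2, step=1):
    """Count the windows visited when scanning every pyramid level.

    step=1 corresponds to scanning at every possible position; a larger
    step (as in a coarse candidate search) visits fewer positions.
    """
    total = 0
    for w, h in pyramid_sizes(width, height, window, factor):
        nx = (w - window) // step + 1
        ny = (h - window) // step + 1
        total += nx * ny
    return total

# Compare a dense scan with a coarse one on a 320x240 frame.
dense = count_windows(320, 240)
coarse = count_windows(320, 240, window=30, step=10)
```

Comparing `dense` with `coarse` shows why a coarse candidate stage followed by verification reduces the number of network evaluations.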
An extremely fast FD algorithm was presented by Viola and Jones [3]; it uses AdaBoost
to select essential Haar-like features and an attentional cascade of classifiers.
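A Haar-like feature is a difference of pixel sums over adjacent rectangles, and an integral image makes each rectangle sum a constant-time lookup. The sketch below shows one two-rectangle feature and a strongly simplified cascade. It is an illustration under stated assumptions, not the detector of [3]: the function names are hypothetical, and each cascade stage here is a single thresholded feature, whereas a real stage is a boosted combination of many weak classifiers.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y) and size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def haar_two_rect_horizontal(ii, x, y, w, h):
    """Right half minus left half of a w x h region (w assumed even)."""
    half = w // 2
    left = rect_sum(ii, x, y, half, h)
    right = rect_sum(ii, x + half, y, half, h)
    return right - left

def passes_cascade(ii, stages):
    """Simplified cascade: reject as soon as one stage falls below its threshold.

    stages is a list of (feature_function, threshold) pairs (hypothetical format).
    """
    for feature, threshold in stages:
        if feature(ii) < threshold:
            return False  # early rejection: later stages are never evaluated
    return True

# Toy 2x4 image: dark left half, bright right half.
image = [[1, 1, 5, 5],
         [1, 1, 5, 5]]
ii = integral_image(image)
edge_response = haar_two_rect_horizontal(ii, 0, 0, 4, 2)
accepted = passes_cascade(ii, [(lambda t: haar_two_rect_horizontal(t, 0, 0, 4, 2), 10)])
```

Because most windows fail an early stage, the average cost per window stays low, which is the source of the cascade's speed.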
The state-of-the-art methods [3, 5] still have some disadvantages. For example, an FD
system based on [3] misses partially occluded or heavily shadowed faces and gives more
false positives than [5], whereas the FD approach described in [5] is too slow for
real-time video processing. In this paper we propose to combine the above-mentioned
approaches to overcome these disadvantages by using some Haar-like features from [3]
for face candidate selection together with an improved neural network-based FD method
adapted from [5]. To accelerate the FD process, we also use a color segmentation
preprocessing stage with image color balance enhancement, skin detection in several
color spaces and morphological operations. After the preprocessing stages, the final FD
is performed using an improved face search strategy across scale and position with the
following key elements: an inverse image scale pyramid, an adaptive window scanning step
and window acceptance. These improvements in the search strategy reduce the number of
windows to be processed, especially when large faces are present. The training set for
the neural network is formed in a bootstrap manner not only for non-faces but also for
faces, which allows the two classes to be distinguished more precisely.
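The skin-based preprocessing can be pictured with the following minimal sketch: a per-pixel skin test followed by a morphological opening that removes isolated false detections. The thresholds are a commonly quoted explicit RGB rule for daylight illumination and are an assumption of this sketch, not the rule used in this paper, which works in several color spaces. All function names are hypothetical.

```python
def is_skin_rgb(r, g, b):
    """Explicit RGB skin rule (assumed thresholds, for illustration only)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)

def skin_mask(image):
    """image is a list of rows of (r, g, b) tuples; returns a 0/1 mask."""
    return [[1 if is_skin_rgb(*px) else 0 for px in row] for row in image]

def _morph(mask, keep):
    """Apply a 3x3 neighbourhood test; neighbourhoods are clipped at the border."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            neigh = [mask[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = 1 if keep(neigh) else 0
    return out

def erode(mask):
    """A pixel survives only if every pixel in its neighbourhood is set."""
    return _morph(mask, all)

def dilate(mask):
    """A pixel is set if any pixel in its neighbourhood is set."""
    return _morph(mask, any)

def opening(mask):
    """Erosion followed by dilation: removes specks, keeps larger blobs."""
    return dilate(erode(mask))
```

Only windows that overlap the cleaned mask would then need to be passed to the later, more expensive stages.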
The rest of this paper is organized as follows: first, we describe the face candidate
selection algorithms, which are based on skin color segmentation and analysis of
Haar-like features; in section 3 the improved neural network-based method is described
in detail; and in the last section the conclusions and future directions of our
research are given.