
gio developed a distribution-based approach to FD, which was the first accurate
appearance-based method [4]. Training examples are gathered by creating virtual
faces and by bootstrapping. Each face and non-face pattern is normalized using
masking, illumination gradient correction and histogram equalization. All training
patterns are grouped into six face and six non-face clusters. Euclidean and
normalized Mahalanobis distances are computed between an input image pattern and
the prototype clusters. A multilayer perceptron network then separates face window
patterns from non-face patterns using the distances to each face and non-face cluster.
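The distance computation described above can be sketched as follows. This is only a minimal illustration, not the implementation from [4]: it assumes each cluster is summarized by a centroid and a diagonal covariance (a simplification made here to avoid matrix inversion), and the function names are hypothetical.

```python
import math

def euclidean_distance(x, centroid):
    """Plain Euclidean distance between a pattern and a cluster centroid."""
    return math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, centroid)))

def normalized_mahalanobis_diag(x, centroid, variances):
    """Normalized Mahalanobis distance,
    0.5 * (d*ln(2*pi) + ln|Sigma| + (x-mu)^T Sigma^-1 (x-mu)),
    under the ASSUMPTION of a diagonal covariance given by `variances`."""
    d = len(x)
    log_det = sum(math.log(v) for v in variances)
    quad = sum((xi - ci) ** 2 / v for xi, ci, v in zip(x, centroid, variances))
    return 0.5 * (d * math.log(2 * math.pi) + log_det + quad)

def distance_features(x, clusters):
    """Build the classifier input: two distances per prototype cluster.
    `clusters` is a list of (centroid, variances) pairs."""
    features = []
    for centroid, variances in clusters:
        features.append(normalized_mahalanobis_diag(x, centroid, variances))
        features.append(euclidean_distance(x, centroid))
    return features
```

With six face and six non-face clusters, `distance_features` would return 24 values per window, which would then be fed to the multilayer perceptron.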
The first advanced neural network-based approach that reported results on a large
and difficult dataset was by Rowley et al. [5]. It became the de facto standard for
comparison with other upright frontal FD approaches. Their system incorporates face
knowledge in a retinally connected neural network that looks at windows of 20x20
pixels. In their single neural network implementation, there are two copies of a
hidden layer with 26 units, where 4 units look at 10x10 pixel sub-regions, 16 look
at 5x5 sub-regions, and 6 look at overlapping horizontal stripes of 20x5 pixels. The
input window is pre-processed as in Sung and Poggio’s system [4]. The image is
scanned with a moving 20x20 window at every possible position and scale with a
subsampling factor of 1.2. To reduce the number of false alarms, they combine
multiple neural networks with an arbitration strategy. The fast version of the FD
system uses an extra neural network that scans the image with a 30x30 pixel window
and a 10 pixel step for face candidates, which are then passed to the verification
neural network.
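To give a feeling for the cost of exhaustive scanning, the sketch below enumerates the levels of an image pyramid with a subsampling factor of 1.2 and counts the windows visited. It is an illustration only: the function names and the rounding of level sizes (truncation to integers) are assumptions, not details taken from [5].

```python
def pyramid_levels(width, height, window=20, scale_factor=1.2):
    """Return the (width, height) of each pyramid level, shrinking the image
    by `scale_factor` until it no longer fits a single window."""
    levels = []
    w, h = width, height
    while w >= window and h >= window:
        levels.append((w, h))
        w = int(w / scale_factor)  # assumed rounding: truncate
        h = int(h / scale_factor)
    return levels

def count_windows(width, height, window=20, step=1, scale_factor=1.2):
    """Count window positions over all pyramid levels for a given step."""
    total = 0
    for w, h in pyramid_levels(width, height, window, scale_factor):
        nx = (w - window) // step + 1
        ny = (h - window) // step + 1
        total += nx * ny
    return total
```

Comparing `count_windows(w, h, window=20, step=1)` with `count_windows(w, h, window=30, step=10)` for the same image size shows why a coarse candidate scan followed by verification is much cheaper than classifying every 20x20 window.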
Viola and Jones [3] presented a new, extremely fast FD algorithm that uses AdaBoost
to select essential Haar-like features and an attentional cascade of classifiers.
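The speed of Haar-like features comes from the integral image, which makes any rectangle sum a four-lookup operation. The following is a minimal sketch of that idea with a single two-rectangle feature and a strongly simplified cascade: each stage here is one thresholded feature, whereas the stages in [3] are boosted combinations of many weak classifiers. All names are hypothetical.

```python
def integral_image(img):
    """Summed-area table with a zero-padded first row and column.
    `img` is a list of equal-length rows of numbers."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y) and size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_horizontal(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: left half minus right half (w even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

def cascade_accepts(stages, ii, x, y):
    """Simplified attentional cascade: reject at the first failed stage.
    `stages` is a list of (feature_fn, threshold) pairs (an assumption)."""
    for feature_fn, threshold in stages:
        if feature_fn(ii, x, y) < threshold:
            return False  # early rejection keeps the average cost low
    return True
```

Because most windows are rejected by the first stages, only a small fraction of them ever reach the later, more expensive ones.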
The state-of-the-art methods [3, 5] still have some disadvantages. For example, an
FD system based on [3] misses partially occluded or heavily shadowed faces and
gives more false positives than [5], whereas the FD approach described in [5] is
too slow for real-time video stream processing. In this paper we propose to combine
the above-mentioned approaches to overcome these disadvantages by using some
Haar-like features from [3] for face candidate selection and an improved neural
network-based FD method adapted from [5]. We also use a color segmentation
preprocessing stage with image color balance enhancement, skin detection in several
color spaces and morphological operations to accelerate the FD process. After the
preprocessing stages, the final FD is performed using an improved face search
strategy across scale and position with the following key elements: an inverse image
scale pyramid, an adaptive window scanning step and window acceptance. These
improvements in the search strategy reduce the number of windows processed,
especially when large faces are present. The training set for the neural network is
formed in a bootstrap manner not only for non-faces but also for faces. This makes
it possible to draw a more precise distinction between the two classes.
The rest of this paper is organized as follows: first, we describe the face
candidate selection algorithms, which are based on skin color segmentation and
analysis of Haar-like features; in section 3 the improved neural network-based
method is described in detail; and in the last section the conclusions and future
directions of our research are given.