FACE DETECTION ALGORITHM USING EMD
Application to Biometric Recognition System
Jordi Solé-Casals
Digital Technologies Group, University of Vic, Vic, Spain
Carlos M. Travieso-González, Marcos del Pozo and Jesús B. Alonso
Signals and Communications Department, Institute for Technological Development and Innovation
on Communications (IDETIC), University of Las Palmas de Gran Canaria, Las Palmas de GC., Spain
Keywords: Face detection, Empirical Mode Decomposition (EMD), Biometrics.
Abstract: In this work we present a new face detection algorithm for biometric recognition systems. It is
based on Empirical Mode Decomposition (EMD), which helps to detect the central skin part of the face. After a
binarization and dilatation process, we get an approximation of the face location, which is then refined
by fitting an ellipse. Finally, a mask is created using the perimeter of this ellipse and applied to the
original image, giving the detected face as a result. Experiments show the performance of the method, with
a recognition rate of 97.60%.
1 INTRODUCTION
Face detection is an important step towards biometric
face recognition. If we have a person in an image
and we want to recognise them, the first step is to detect
the face so that a biometric system can then be applied.
Of course, the most interesting possibility is to do this
automatically, without any manual step. This can be
a requirement in real-time systems, or when there is
a large number of pictures to analyze.
Different work has been done in this area. Some
references are reviewed in the following paragraphs.
In (Yongqiu et al., 2010) a combination of three
classifiers is used: a skin color detector, an AdaBoost
detector based on Haar-like features, and an eye-mouth
detector. A semi-serial architecture is designed to
combine the three detectors. They selected 103
color images including faces, and 100
color images of background from their
own database. The success rate was 85.5%.
The paper by (Qiang-rong and Hua-lan, 2010)
proposes a novel face detection algorithm that
combines a skin color model, edge information and
features of the human eyes in color images. The size of
the database was 544 samples, reaching a
recognition rate of up to 94.9%.
In (Venkatesh and Marcel, 2010) the authors use an
additional classifier that predicts the bounding box
of a face within a local search area. Then a face/non-face
classifier is used to verify the presence or
absence of a face. The database is a mix of the
CMU+MIT and Fleuret face databases, with a total
of 375 images. The authors report an improvement of
between 15% and 30% in detection rate or speed
compared to the standard scanning technique.
On the other hand, in (You-jia and Jian-wei,
2010) the authors propose a rotation invariant multi-view
color face detection method combining skin
color segmentation and a multi-view AdaBoost
algorithm. The database is composed of 153 internet
images. The authors reached a recognition rate of
98.04% with a simple background, and a lower value of
82.35% with a complex background.
In (Zhao et al., 2010) the authors present a
combination of Contrast-limited Adaptive
Histogram Equalization (CLAHE) and multi-step
integral projection. The JAFFE database was used
and they obtained a recognition rate of 95.318%.
(Yihu et al., 2010) have designed a novel scheme
integrating skin color segmentation and facial
component localization to detect faces. They
used 214 PC camera images, reaching a detection rate
of 95.8%, and 218 Web images, reaching a
detection rate of 93.5%.
Solé-Casals J., Travieso-González C., del Pozo M. and Alonso J.
FACE DETECTION ALGORITHM USING EMD - Application to Biometric Recognition System.
DOI: 10.5220/0003297305650569
In Proceedings of the International Conference on Bio-inspired Systems and Signal Processing (MPBS-2011), pages 565-569
ISBN: 978-989-8425-35-5
Copyright © 2011 SCITEPRESS (Science and Technology Publications, Lda.)
In (Zhang and Shi, 2008) the authors introduce a
new nonlinear transformation of the YCbCr
color space, which is used for the color segmentation of
the human face for regional analysis and extraction.
They implemented image rotation and template
matching. The database is composed of 400 images.
The success rate for Face Average Brightness was
92%.
A face detection method for color images with
complex backgrounds is presented in (Xue-wu et al.,
2009), in which a mixed skin-color segmentation
model is constructed in both the YCbCr and HIS color
spaces. The authors used 95 pictures with different
brightness for testing. The results show a successful
detection rate of over 91.7%.
In our work, the proposed algorithm can be
summarized by the following block diagram (see
Figure 1):
On each image $x(t)$ we apply the Empirical Mode
Decomposition (EMD) algorithm in order to
decompose the image into its Intrinsic Mode Functions
(IMFs). The first 50% of the modes are combined
in order to generate a new image $y(t)$ that
basically contains the different regions present in
the original image.
Afterwards, we apply a binarization process
followed by a dilatation, in order to emphasize the
central part of the skin of the face against all the
other parts of the image. The result is the image $z(t)$.
Finally, we detect the region of interest (ROI)
and fit an ellipse to it in order to obtain the borders
of the face (image $w(t)$) and generate a mask that
indicates which pixels belong to the face (value 1)
and which do not (value 0).
The output image $s(t)$ is then generated by
applying this mask to the original image.
This paper is organized as follows: after this
introduction, the EMD technique is presented in Section
2, with details on how it is used in our case; Section 3
is devoted to the image processing step, whereas the
ROI procedure and boundary calculations are
explained in Section 4. Experiments and results are
shown in Section 5, and conclusions are
presented in Section 6.
2 EMPIRICAL MODE DECOMPOSITION
The first block of the proposed method (see Figure
1) consists of applying the Empirical Mode
Decomposition (EMD) technique to the input
image $x(t)$.
EMD is a relatively new data analysis method
that decomposes the signal into waveforms
modulated in both amplitude and frequency by
extracting all of the oscillatory modes embedded in
the signal.
Figure 1: Block diagram of the proposed method (blocks: BBDD (database), EMD, Image Processing, ROI detection; signals: $x(t)$, $y(t)$, $z(t)$, $w(t)$, $s(t)$).
2.1 Obtaining Intrinsic Mode Functions
As explained in (Huang et al., 1998), the
decomposition can be viewed as an expansion of the
data in terms of the so-called Intrinsic Mode
Functions (IMFs), which are the waveforms extracted
by EMD. Each IMF is symmetric and is assumed
to yield a meaningful local frequency.
Different IMFs do not exhibit the same frequency at
the same time. These IMFs, based on and
derived from the data, can then serve as the basis of that
expansion, which can be linear or nonlinear as
dictated by the data, and which is complete and almost
orthogonal.
The decomposition is intuitive, adaptive and
signal-dependent, and does not
require any conditions on the stationarity or
linearity of the signal.
BIOSIGNALS 2011 - International Conference on Bio-inspired Systems and Signal Processing
Each cycle of an IMF is defined by its zero
crossings. Every IMF involves only one mode of
oscillation, so no complex riding waves are
allowed. Notice that an IMF is not limited to being a
narrow-band signal, as it would be in a traditional
Fourier or wavelet decomposition. In fact, it can be
both amplitude and frequency modulated at once,
and also non-stationary or non-linear.
The process of IMF extraction from a signal
$x(t)$, known as the sifting process (Huang et al.,
1998), is based on the following steps:
1. Determine the local maxima and minima of the
analyzed signal $x(t)$.
2. Generate the upper and lower signal envelopes
by connecting those local maxima and minima
respectively by the chosen interpolation method
(e.g., linear, spline, cubic spline, piece-wise
spline).
3. Determine the local mean $m_1(t)$ by averaging
the upper and lower signal envelopes.
4. Subtract the local mean from the data:
$h_1(t) = x(t) - m_1(t)$.
5. Test if $h_1(t)$ is an IMF.
If yes, stop the procedure. The first IMF,
labelled $c_1(t)$, has been obtained.
If not, replace $x(t)$ with $h_1(t)$ and repeat
the procedure from step 1.
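The sifting steps above can be sketched in a few lines of Python for a one-dimensional signal. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses linear interpolation for the envelopes (one of the options listed in step 2), holds each envelope constant beyond its first and last extremum, and replaces the IMF test of step 5 with a simple energy criterion. The names `local_extrema`, `envelope`, `sift`, `max_iter` and `tol` are hypothetical, and the text does not state how the procedure is extended to two-dimensional images.

```python
def local_extrema(x):
    """Step 1: indices of strict interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Step 2: linear interpolation through the points (i, x[i]), i in idx.

    Assumption: outside the first and last extremum the envelope is held
    constant, a simple way of handling the ends of the signal.
    """
    env, k = [], 0
    for t in range(len(x)):
        if t <= idx[0]:
            env.append(x[idx[0]])
        elif t >= idx[-1]:
            env.append(x[idx[-1]])
        else:
            while idx[k + 1] < t:      # advance to the segment containing t
                k += 1
            i0, i1 = idx[k], idx[k + 1]
            w = (t - i0) / (i1 - i0)
            env.append((1 - w) * x[i0] + w * x[i1])
    return env


def sift(x, max_iter=50, tol=1e-3):
    """Extract one IMF candidate from x by repeatedly removing the local mean."""
    h = list(x)
    for _ in range(max_iter):
        maxima, minima = local_extrema(h)
        if len(maxima) < 2 or len(minima) < 2:
            break                                  # too few extrema for envelopes
        upper = envelope(h, maxima)
        lower = envelope(h, minima)
        m = [(u + l) / 2 for u, l in zip(upper, lower)]    # step 3
        h_new = [hi - mi for hi, mi in zip(h, m)]          # step 4
        # Step 5 (assumed criterion): stop once the local mean carries
        # negligible energy compared with the current candidate.
        mean_energy = sum(mi * mi for mi in m)
        energy = sum(hi * hi for hi in h) or 1.0
        h = h_new
        if mean_energy / energy < tol:
            break
    return h
```

For example, a triangle wave riding on a constant offset, `[5, 6, 5, 4] * 4`, is reduced to the zero-mean oscillation `[0, 1, 0, -1] * 4`, because the mean of the two envelopes is exactly the offset.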
In order to obtain the second IMF, one applies
the sifting process to the residue
$r_1(t) = x(t) - c_1(t)$,
obtained by subtracting the first IMF from $x(t)$; the
third IMF is in turn extracted from the residue
$r_2(t) = r_1(t) - c_2(t)$, and so on.
One stops extracting IMFs when two
consecutive sifting results are close to identical.
Then, the empirical mode decomposition of the
signal $x(t)$ may be written as:
$$x(t) = \sum_{i=1}^{n} c_i + r_n \qquad (1)$$
Thus, we obtain a decomposition of the data into
n empirical modes and a residue, $r_n$, which can be
either the mean trend or a constant.
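The form of equation 1 follows directly from the way the residues are defined. Writing the time argument explicitly and substituting each residue in turn (with $r_k(t) = r_{k-1}(t) - c_k(t)$ as the general step, a notation introduced here for illustration) gives a telescoping chain:

```latex
\begin{aligned}
x(t) &= c_1(t) + r_1(t) \\
     &= c_1(t) + c_2(t) + r_2(t) \\
     &\;\;\vdots \\
     &= \sum_{i=1}^{n} c_i(t) + r_n(t)
\end{aligned}
```

This is why the decomposition is complete: no part of the signal is discarded, since whatever is not captured by the modes remains in the final residue.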
2.2 Applying EMD Procedure
Decomposing the input image $x(t)$ with the EMD
procedure explained above results in a
set of IMFs plus a
residue, as shown in equation 1.
As the first modes capture the high frequencies of
the signal, we generate a new signal $y(t)$ (the
pre-processed image) by taking into account only the
first 50% of the modes. This pre-processed image is
obtained by adding the considered modes:
$$y(t) = \sum_{i=1}^{n/2} c_i \qquad (2)$$
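Equation 2 amounts to a sample-by-sample sum over the first half of the modes. The following Python fragment is a small sketch of that sum; the function name `combine_first_half` is hypothetical, and rounding n/2 down for an odd number of modes is an assumption, since the text only specifies "the first 50%".

```python
def combine_first_half(imfs):
    """Sum the first n/2 modes sample by sample, as in equation (2).

    `imfs` is a list of n equally long sequences c_1 ... c_n.
    Assumption: for odd n, n/2 is rounded down.
    """
    if not imfs:
        raise ValueError("need at least one mode")
    half = len(imfs) // 2
    length = len(imfs[0])
    return [sum(c[t] for c in imfs[:half]) for t in range(length)]
```

With four modes, only the first two contribute, so the lower-frequency modes and the residue are left out of the pre-processed signal.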
3 IMAGE PROCESSING
The image processing block uses classical image
processing techniques that
facilitate the detection of the Region Of Interest
(ROI), which is detailed in the next section.
Using the image $y(t)$ obtained from the EMD step,
we apply a binarization procedure so that the pixels
take only two different values.
The homogeneous region of the face will mostly be
white, while most of the rest of the image will
be black. Edges, for example those due to the eyes, nose,
mouth and hair, are easily detected after
binarization.
In order to group the possibly unconnected
regions of the face, and taking into account that our
goal is to detect the position of the face in the
image, we apply a dilatation process with a rectangle
of size 1x2 as structuring element. This step is very
useful because the eyes, nose and mouth very often break the
face into different unconnected regions. The output
of this step is the black and white image
$z(t)$ (see Figure 1).
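The two operations of this block can be illustrated on an image stored as a list of rows. This is only a sketch under stated assumptions: the threshold is a free parameter (the text tunes it on a few images but gives no value), the 1x2 rectangle is taken as one row by two columns, and it is anchored so that each white pixel also whitens its right-hand neighbour. The function names are hypothetical.

```python
def binarize(image, threshold):
    """Map pixels above `threshold` to 1 (white) and all others to 0 (black)."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


def dilate_1x2(binary):
    """Dilate a 0/1 image with a 1-row by 2-column rectangle.

    Assumption: the rectangle is anchored at its left cell, so an output
    pixel is white if the pixel itself or its left-hand neighbour is white.
    """
    return [[1 if row[c] or (c > 0 and row[c - 1]) else 0
             for c in range(len(row))]
            for row in binary]
```

Applied to the row `[0, 1, 0]`, the dilation gives `[0, 1, 1]`: white areas grow by one pixel horizontally, which can close one-pixel gaps between neighbouring face regions.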
4 ROI DETECTION
The goal of this last block is to determine which part
of the image is the face and which part is not.
In order to implement the ROI detector, we label
all the connected regions of the image $z(t)$ and
analyze the regions obtained:
The biggest region corresponds to the background
of the image, since in all the images the face
covers fewer pixels than the rest of the image.
The second biggest region is usually the
region that we are looking for, as the face (especially
after the dilatation process) is an important part of
the image. Sometimes, however, the
frontal region (upper part), the neck (lower part), or
other parts of the image are larger than the region of
the face itself (central region).
In order to avoid a mistake when selecting the ROI, we
propose the following procedure:
1. Sort all the regions by the total number of
pixels they contain;
2. Select the second biggest region (the first one
is omitted as it corresponds to the background);
3. Calculate the mass centre of the region;
4. If the mass centre is located near the centre
of the image, take this region as the ROI and
stop the procedure;
5. If not, select the next biggest region and
go to step 3 until a ROI is detected.
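The selection procedure above can be sketched as follows. Several details are assumptions made for illustration, since the text does not fix them: regions are groups of 4-connected pixels of equal value (so the black background is a region too), and "near the centre" means that the mass centre lies within a fraction `tolerance` of the image height and width from the image centre. The names `label_regions`, `select_roi` and `tolerance` are hypothetical.

```python
from collections import deque


def label_regions(binary):
    """Group 4-connected pixels of equal value; return a list of pixel lists."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0]:
                continue
            value = binary[r0][c0]
            seen[r0][c0] = True
            queue, pixels = deque([(r0, c0)]), []
            while queue:                      # breadth-first flood fill
                r, c = queue.popleft()
                pixels.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and binary[nr][nc] == value):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append(pixels)
    return regions


def select_roi(binary, tolerance=0.25):
    """Return the pixels of the largest non-background region near the centre."""
    rows, cols = len(binary), len(binary[0])
    regions = sorted(label_regions(binary), key=len, reverse=True)   # step 1
    for pixels in regions[1:]:                # step 2: skip the background
        cr = sum(r for r, _ in pixels) / len(pixels)                 # step 3
        cc = sum(c for _, c in pixels) / len(pixels)
        near = (abs(cr - (rows - 1) / 2) <= tolerance * rows
                and abs(cc - (cols - 1) / 2) <= tolerance * cols)
        if near:                                                     # step 4
            return pixels
    return None                               # step 5 exhausted: no ROI found
```

In a 9x9 test image with a larger blob in the top-left corner and a smaller blob in the middle, the corner blob is rejected because its mass centre is far from the image centre, and the central blob is returned as the ROI.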
Once we have a ROI, the next step is to fit an
ellipse that separates the portion of the image
considered as face from the rest of the image. We
obtain the minimum and maximum row and column
coordinates of the ROI and use them
to calculate the centre and the
axes of the ellipse. With these we get the border of
the ellipse that fits the ROI (see $w(t)$ in Figure 1),
and we create a mask in order to extract the
face from the image by a pixel-by-pixel AND operation
between the original image $x(t)$ and the
mask. The resulting image is the output of the system,
labelled $s(t)$ in Figure 1.
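A possible reading of this step is an axis-aligned ellipse inscribed in the bounding box of the ROI. The sketch below follows that reading; it is an assumption, as the text only says that the extreme row and column coordinates determine the centre and the axes. The names `ellipse_mask` and `apply_mask` are hypothetical, and the masking is written for a grey-level image stored as a list of rows.

```python
def ellipse_mask(roi_pixels, rows, cols):
    """Return a 0/1 mask of the ellipse inscribed in the ROI bounding box."""
    rs = [r for r, _ in roi_pixels]
    cs = [c for _, c in roi_pixels]
    r_min, r_max, c_min, c_max = min(rs), max(rs), min(cs), max(cs)
    centre_r, centre_c = (r_min + r_max) / 2, (c_min + c_max) / 2
    # Semi-axes; the lower bound of 0.5 avoids a zero axis for thin ROIs.
    a = max((r_max - r_min) / 2, 0.5)
    b = max((c_max - c_min) / 2, 0.5)
    return [[1 if ((r - centre_r) / a) ** 2 + ((c - centre_c) / b) ** 2 <= 1 else 0
             for c in range(cols)]
            for r in range(rows)]


def apply_mask(image, mask):
    """Keep the pixels where the mask is 1 and set all others to 0."""
    return [[p if m else 0 for p, m in zip(image_row, mask_row)]
            for image_row, mask_row in zip(image, mask)]
```

For a ROI whose bounding box spans rows 1 to 3 and columns 1 to 3 of a 5x5 image, the mask is a small cross-shaped disc centred on pixel (2, 2), and only those pixels of the original image survive the masking.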
5 EXPERIMENTS
In order to test our algorithm, we apply the proposed
method to the entire JAFFE database, which contains
213 different images.
As our method is not based on any learning
paradigm, we simply take 5 images at random to
adjust the few parameters of the method
(the number of considered modes and the threshold for
the binarization process), keeping the remaining
208 images for testing.
Figure 2: Top: original faces from the JAFFE database. Bottom:
correct face detections obtained using the proposed
method.
Figure 3: Top: original faces from the JAFFE database. Bottom:
incorrect face detections obtained using the proposed method.
We consider that the system works correctly if the
eyes, nose and mouth fall inside the detected face,
and incorrectly if some (or all) of these
elements fall outside the ROI. Under this criterion we
get 203 good detections and 5 bad detections. Calculating the
quotient between the good detections and the total number
of available images (excluding the images used for
parameter adjustment), we obtain a recognition rate of
97.60%.
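The reported figure can be checked directly from the counts given above: 213 images minus the 5 used for parameter adjustment leaves 208 test images, of which 203 are detected correctly.

```latex
\frac{203}{213 - 5} = \frac{203}{208} \approx 0.9760 = 97.60\%
```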
To illustrate the results, we
present some of the correctly detected faces in Figure
2, and some incorrectly detected faces in Figure 3.
As we can see in these figures, the method sometimes
fails and takes a wrong region as the face. The
main problem is due to the disconnected regions
that can appear after EMD. This point
must therefore be developed in more detail, as it is the key factor
in obtaining good results.
On the other hand, different illumination or
background conditions may affect the results. More
research is needed on these cases in order to obtain
a general face detection system.
We can compare our method with the one
proposed in (Zhao et al., 2010) because both
use the same database (JAFFE). Our
proposal improves the face detection rate by more than
2 percentage points (97.60% versus 95.318%).
On this database, our method therefore performs better
than the proposal of (Zhao et al., 2010).
6 CONCLUSIONS
The face detection method presented in
this work is based on the EMD technique and uses very
popular and simple image processing tools. Using EMD, the
central part of the skin of the face is easily
identified, and then a binarization and a dilatation
process are applied in order to localize the face in
the image. Finally, an ellipse is fitted and a mask is
created and applied to the original image to
extract the face.
Some aspects of the algorithm can be
explored in more detail. For example, the number of
considered modes (IMFs) is an important and critical
parameter. More experiments, with other databases,
must be done in order to determine the best number
of modes, in such a way that this number is independent
of the image (database).
Also, in this preliminary work we have not tested
other possible structuring elements for the dilatation
step. This is an interesting point to investigate in
future work, as the success of the method depends
on the capability of this step. If we can merge the
different parts of the face into a single region, the
procedure works properly; but if the face is split into
many different parts, the system fails and the face is
not correctly detected.
On the other hand, we have not yet explored the
possibility of having more than one face in the
image, which is another interesting point to
investigate in detail.
Finally, future work must test
the procedure in other situations, for example under
bad illumination conditions, in dark or shadowed places,
in very bright places, etc. Of course, these scenarios,
where illumination can vary greatly, are not
easy, but they are realistic and must be taken into
consideration for real-world applications.
We are already conducting investigations
along these lines, and preliminary results for
some cases are very promising.
ACKNOWLEDGEMENTS
This work has been partially supported by the
University of Vic under a mobility grant for Dr.
Jordi Solé-Casals, by “Cátedra Telefónica ULPGC
2009-10”, and by the Spanish Government under
funds from MCINN TEC2009-14123-C04-01.
REFERENCES
Huang, N., Shen, Z., Long, S., Wu, M., Shih, H., Zheng, Q., Yen, N. C., Tung, C., Liu, H., 1998. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. of the Royal Society A, vol. 454, no. 1971, pp. 903-995.
Qiang-rong, J., Hua-lan, L., 2010. Robust human face detection in complicated color images. The 2nd IEEE International Conference on Information Management and Engineering, pp. 218-221.
Venkatesh, B. S., Marcel, S., 2010. An alternative scanning strategy to detect faces. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2122-2125.
Xue-wu, Z., Ling-yan, L., Dun-qin, D., Wei-liang, X., 2009. A novel method of face detection based on fusing YCbCr and HIS color space. IEEE International Conference on Automation and Logistics, pp. 831-835.
Yihu, Y., Daokui, Q., Fang, X., 2010. Face detection method based on skin color segmentation and facial component localization. 2nd International Asia Conference on Informatics in Control, Automation and Robotics, vol. 1, pp. 64-67.
Yongqiu, T., Faling, Y., Guohua, C., Shizhong, J., Zhanpeng, H., 2010. Fast rotation invariant face detection in color image using multi-classifier combination method. International Conference on E-Health Networking, Digital Ecosystems and Technologies, pp. 211-218.
You-jia, F., Jian-wei, L., 2010. Rotation Invariant Multi-View Color Face Detection Based on Skin Color and Adaboost Algorithm. International Conference on Biomedical Engineering and Computer Science, pp. 1-5.
Zhang, Z., Shi, Y., 2008. Face Detection Method Based on a New Nonlinear Transformation of Color Spaces. Fifth International Conference on Fuzzy Systems and Knowledge Discovery, vol. 4, pp. 34-38.
Zhao, Y., Georganas, N. D., Petriu, E. M., 2010. Applying Contrast-limited Adaptive Histogram Equalization and integral projection for facial feature enhancement and detection. Instrumentation and Measurement Technology Conference, pp. 861-866.