SELF ORGANIZING NEURAL NETWORK APPLICATION FOR
SKIN COLOR SEGMENTATION
D. González-Ortega, F. J. Díaz-Pernas, M. Antón-Rodríguez, M. Martínez-Zarzuela,
I. de la Torre-Díez, J. F. Díez-Higuera and D. Boto-Giralda
Department of Signal Theory, Communications and Telematics Engineering
Telecommunications Engineering School, University of Valladolid, Valladolid, Spain
Keywords: Fuzzy ART Neural Network, Skin Color Filter, TSL Color Space.
Abstract: In this paper, we present a Fuzzy ART (Adaptive Resonance Theory) neural network application for skin
color segmentation using the chromaticity components of the TSL color space. The Fuzzy ART networks
deal with the stability-plasticity dilemma and they can be applied to color image segmentation, particularly
to skin color segmentation. The developed application has three modes: parameter setting, skin color filter
creation, and skin color filter performance. Many parameters can be tuned to create proper skin color filters
from manually selected skin regions in an image. A skin color filter is a LUT (Look-Up Table) that assigns
each color in the RGB color space one of two outputs: skin or non-skin color. The performance of
different skin color filters can be compared with the application. A skin color filter can be used to perform
robust real-time skin color segmentation in video sequences captured by a webcam.
1 INTRODUCTION
Many computer vision applications such as
surveillance and access control systems, person
identification and Human-Computer Interfaces
(HCI) require the detection, localization, and
tracking of human body parts. Within a complex
computer vision application, these tasks must have a
low processing time, since real-time operation is
usually a constraint on the whole application.
Moreover, reliable computer vision
applications need robustness to changes in
illumination, background, and objects of interest.
In this context, color is widely used for the
detection and tracking of human body parts as it is
robust to geometric changes and fast to calculate.
Phung et al. (2005) presented an analysis of three
important issues for skin color segmentation:
color representation, color quantization, and
classification algorithm. Schmugge et al. (2007)
studied the performance of five color space
transformations with and without the luminosity
component for skin segmentation.
In this paper, we present a novel application to
create skin color filters from the chromaticity
components of the TSL color space of the pixels
within manually selected skin regions in an image. A
skin color filter is a LUT in which every color in the
RGB color space is tagged as skin or non-skin color.
The application to create the LUT is based on a
Fuzzy ART neural network (Carpenter et al., 1991).
This is a self-organizing neural network
characterized by being both stable and plastic, which
makes it suitable for color image segmentation. The
created skin color filter can be used to perform skin
color segmentation in video sequences captured by an
inexpensive USB (Universal Serial Bus) camera, robustly
to changes in illumination and motion of the human
body parts in an unrestricted environment.
The rest of the paper is organized as follows. In
Section 2, we present the literature regarding skin
color filtering and the TSL color space, whose
chromaticity components are used to characterize
skin regions in images. Section 3 deals with the
Fuzzy ART neural network and the reasons why we
use it to classify skin colors. Next, the skin color
filter application is presented in Section 4 along with
its three working modes: parameter setting, skin
color filter creation, and skin color filter
performance. Finally, Section 5 draws the
conclusions about the developed skin color
application.
González-Ortega D., J. Díaz-Pernas F., Antón-Rodríguez M., Martínez-Zarzuela M., de la Torre-Díez I., F. Díez-Higuera J. and Boto-Giralda D..
SELF ORGANIZING NEURAL NETWORK APPLICATION FOR SKIN COLOR SEGMENTATION.
DOI: 10.5220/0003083304250428
In Proceedings of the International Conference on Fuzzy Computation and 2nd International Conference on Neural Computation (ICNC-2010), pages 425-428
ISBN: 978-989-8425-32-4
Copyright © 2010 SCITEPRESS (Science and Technology Publications, Lda.)
2 SKIN COLOR SEGMENTATION
Color filtering is a powerful tool in computer vision
tasks. Color is a low-level feature that is highly
discriminative, computationally fast, and robust to
geometric changes, and it can be applied to delineate
human body parts. Many studies evaluating color
spaces for skin detection have been carried out
(Kakumanu et al., 2007; Phung et al., 2005). Color
can be decomposed into three different components,
one luminosity and two chromaticity components.
Although skin color can change notably from one
person to another, or even in the same person
due to factors such as a suntan or a blush,
several studies have shown that skin colors have
a certain invariance in their chromaticity
components, even among people
of different ethnic groups (Fu Jie Huang and Tsuhan
Chen, 2000). Other factors such as lighting or skin
tone affect mainly the luminosity component.
Different color spaces separating luminosity and
chromaticity components have been used for skin
color segmentation (YIQ, CIE-LAB, CIE-LUV,
HSV, HSI, and TSL) (Kakumanu et al., 2007; Phung
et al., 2005). TSL has been selected as the best color
space to extract skin color from complex
backgrounds (Chen and Liu, 2003) because it has the
advantage of extracting a given color robustly while
minimizing illumination influence.
In our application, the chromaticity components
of the TSL color space are used to create skin color
filters.
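The paper does not state the TSL conversion formulas, so the following is only a minimal sketch based on one commonly cited definition of the TSL space (tint T, saturation S, luminosity L). The function name `rgb_to_tsl`, the handling of black pixels, and the default luminosity weights are assumptions made for illustration; the weights play the role of the contributions of each component to the luminosity.

```python
import math

def rgb_to_tsl(R, G, B, weights=(0.299, 0.587, 0.114)):
    """Convert 8-bit RGB to (T, S, L), with T and S in [0, 1].

    The default luminosity weights are the usual luma coefficients;
    this is an assumption, not a value taken from the paper.
    """
    total = R + G + B
    if total == 0:
        # Black: chromaticity is undefined, so 0 is returned by convention.
        return 0.0, 0.0, 0.0
    # Normalized chromaticities, shifted so that gray maps to the origin.
    r = R / total - 1.0 / 3.0
    g = G / total - 1.0 / 3.0
    if g > 0:
        T = math.atan(r / g) / (2.0 * math.pi) + 0.25
    elif g < 0:
        T = math.atan(r / g) / (2.0 * math.pi) + 0.75
    else:
        T = 0.0
    S = math.sqrt(9.0 / 5.0 * (r * r + g * g))
    L = weights[0] * R + weights[1] * G + weights[2] * B
    return T, S, L
```

Only T and S would be passed on to the classifier; L is computed here for completeness, since changes in lighting mainly affect that component.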
3 FUZZY ART NEURAL
NETWORKS
The ART neural network architecture is a self-organizing
network that can switch between a
learning or plastic state (in which the network
parameters may be modified) and a stable or fixed
state for operation (Carpenter and Grossberg, 1987).
Fuzzy ART is an extension of the ART1 system
(Carpenter et al., 1991) that allows analog inputs
with values between 0 and 1. The most important
point in the application of an ART network to color
image segmentation, and especially skin color
segmentation, is the stability-plasticity dilemma. The
ART network has plasticity in order to learn new
information, while it has stability in order not to
forget the already learned information. ART
networks are able to obtain stability without
sacrificing plasticity (Carpenter and Grossberg,
1987).
In our application, a Fuzzy ART neural network
is used to categorize each color as skin or non-skin.
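As an illustration of how such a network categorizes analog inputs, the sketch below follows the standard Fuzzy ART operations described by Carpenter et al. (1991): complement coding, a choice function, a vigilance test, and weight learning. It is not the implementation used in the application. The class name `FuzzyART`, the choice parameter `alpha`, and the learning rate `lr` (set to 1, i.e. fast learning) are assumptions, since the paper only specifies the vigilance values.

```python
class FuzzyART:
    """Minimal Fuzzy ART sketch for inputs with components in [0, 1]."""

    def __init__(self, rho=0.9, alpha=0.001, lr=1.0):
        self.rho = rho        # vigilance used while training
        self.alpha = alpha    # choice parameter (assumed value)
        self.lr = lr          # learning rate; 1.0 means fast learning (assumed)
        self.weights = []     # one weight vector per committed category
        self.counts = []      # number of training patterns per category

    @staticmethod
    def _code(x):
        # Complement coding: (x, 1 - x) keeps the input norm constant.
        return list(x) + [1.0 - v for v in x]

    def _ranked(self, i):
        # Categories sorted by decreasing choice value |i ^ w| / (alpha + |w|).
        scored = []
        for j, w in enumerate(self.weights):
            overlap = sum(min(a, b) for a, b in zip(i, w))
            scored.append((overlap / (self.alpha + sum(w)), j, overlap))
        scored.sort(key=lambda s: (-s[0], s[1]))
        return scored

    def classify(self, x, rho=None):
        """Return the index of the resonating category, or None."""
        rho = self.rho if rho is None else rho
        i = self._code(x)
        norm = sum(i)
        for _, j, overlap in self._ranked(i):
            if overlap / norm >= rho:      # vigilance test
                return j
        return None

    def train(self, x):
        """Present one pattern; update or commit a category."""
        i = self._code(x)
        j = self.classify(x)
        if j is None:                      # no resonance: commit a new category
            self.weights.append(i)
            self.counts.append(1)
            return len(self.weights) - 1
        w = self.weights[j]
        self.weights[j] = [self.lr * min(a, b) + (1.0 - self.lr) * b
                           for a, b in zip(i, w)]
        self.counts[j] += 1
        return j
```

A higher vigilance makes the vigilance test harder to pass, so more categories are committed. The optional `rho` argument of `classify` lets the testing stage use a vigilance different from the one used in training.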
4 SKIN COLOR FILTER
APPLICATION
We have developed a skin color filter application in
which a filter that classifies every color as
skin or non-skin is created from skin regions of an
image manually selected by the user. A Fuzzy ART
neural network is used to create the skin color filter.
The inputs to the neural network are the
chromaticity components (T and S) of the TSL color
space of all the pixels inside the skin region of an
image. The TSL color space was selected because it
achieved the best skin color filtering performance in
an experimental study including the normalized rgb,
YIQ, HSI, CIE-LUV, and TSL color spaces
(González-Ortega et al., 2010).
The application has been developed using the
Intel Open Source Computer Vision Library (Open
Computer Vision Library, 2010) and the FLTK GUI
Library (FLTK Library, 2010). The application has
three working modes: 1) parameter setting, 2) skin
color filter creation, 3) skin color filter performance.
4.1 Parameter Setting
The application allows the user to select a series of
parameters that influence skin color filter creation.
The parameters are grouped in four sets:
preprocessing, histogram setting, Fuzzy ART neural
network setting, and skin color filter setting. Default
values of the parameters were fixed after extensive
experimentation. These values lead to good
performance with images taken under different
conditions regarding the webcam, the illumination,
and the user.
The main preprocessing parameters are cr, cg,
and cb. They fix the contribution of each component
in the luminosity calculation of each pixel.
The histogram setting allows the user to specify
whether a selected skin region is too bright, bright,
normal, dark, or too dark.
The main parameters of the Fuzzy ART neural
network setting are:
Num_rounds: number of times that the selected
region is presented to the neural network.
Default value is 2.
Rho1: vigilance parameter of the network in the
training stage. Values close to 0 imply fewer
categories because the network groups patterns under a
loose similarity criterion. Values close to 1
imply more categories, each one with very
similar patterns. Default value is 0.9.
Rho2: vigilance parameter of the network in the
testing stage. Values close to 0 imply a loose
similarity criterion. Values close to 1 require
patterns to be very similar to a created
category to be included within it. Default
value is 0.9.
The main parameters in the skin color filter creation
are:
beta: fraction of the number of patterns in the
category with the largest number of colors that a
category has to exceed so that it can be
considered a skin color category. Default value
is 0.02.
delta: fraction of the patterns that the category
with the largest number of colors classified by
the Fuzzy ART network has to exceed. Large
values of delta imply that the regions have to be
very homogeneous regarding color. Default
value is 0.05.
epsilon: upper limit of the Euclidean distance
between a category and the biggest category so
that the category can be considered in the
classification process of the colors. Large values
of epsilon imply that the category can be more
separated so that more different categories are
taken into account. Default value is 0.2.
mu: contribution of the saturation component of
the TSL color space in the Euclidean distance
calculation among categories. Default value is
1.
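The paper describes these four parameters only in words, so the sketch below is just one possible reading of them, not the rule implemented in the application. It assumes that each committed category is summarized by its pattern count and by a representative (T, S) point, that delta is measured against the total number of training patterns, and that mu scales the S difference inside the Euclidean distance. The function name `select_skin_categories` and the tuple layout are hypothetical.

```python
import math

def select_skin_categories(cats, beta=0.02, delta=0.05, epsilon=0.2, mu=1.0):
    """Return the indices of categories accepted as skin color.

    cats is a list of (count, t, s) tuples: the number of training
    patterns in a category and an assumed representative (T, S) point.
    """
    if not cats:
        return []
    total = sum(count for count, _, _ in cats)
    biggest = max(range(len(cats)), key=lambda j: cats[j][0])
    n_big, t_big, s_big = cats[biggest]
    # delta: the biggest category must dominate the region enough.
    if n_big <= delta * total:
        return []
    selected = []
    for j, (count, t, s) in enumerate(cats):
        # epsilon and mu: weighted distance to the biggest category.
        dist = math.hypot(t - t_big, mu * (s - s_big))
        # beta: minimum size relative to the biggest category.
        if count > beta * n_big and dist <= epsilon:
            selected.append(j)
    return selected
```

Under this reading, raising beta or lowering epsilon discards more categories, which makes the resulting filter more restrictive.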
After selecting the values of all the parameters, a file
can be saved with this information so that these
values can then be used in the filter creation.
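The format of the parameter file is not described in the paper. As a minimal sketch, the default values listed above could be kept in a small record and serialized as JSON. Both the JSON format and the names `FilterParams`, `dumps_params`, and `loads_params` are assumptions; the preprocessing parameters cr, cg, and cb are omitted because their default values are not given.

```python
import json
from dataclasses import asdict, dataclass

@dataclass
class FilterParams:
    """Defaults as stated in the text; field names are illustrative."""
    num_rounds: int = 2
    rho1: float = 0.9      # vigilance in the training stage
    rho2: float = 0.9      # vigilance in the testing stage
    beta: float = 0.02
    delta: float = 0.05
    epsilon: float = 0.2
    mu: float = 1.0

def dumps_params(params):
    """Serialize the parameters to a JSON string."""
    return json.dumps(asdict(params), indent=2)

def loads_params(text):
    """Rebuild the parameters from a JSON string."""
    return FilterParams(**json.loads(text))
```

Writing the string returned by `dumps_params` to disk would give a configuration file that can be reloaded before creating a filter.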
4.2 Skin Color Filter Creation
In this mode, a frame of a video sequence captured
by a webcam can be selected. In this frame, the skin
regions have to be manually selected. With the
computer mouse, the user selects the vertices of the
polygon that delimits each skin region in
the image. If a region is selected within another
previously selected region, the interior region is
excluded from the exterior region. This is
useful if the user wants to select the face
excluding the eyes, lips, or beard. To begin the
process of creating a skin filter, a parameter
configuration file has to be selected. With the values
that appear in this file and the selected region, the
Fuzzy ART neural network is trained.
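The exclusion of interior regions can be sketched with a standard even-odd (ray casting) point-in-polygon test. How the application actually rasterizes the selected polygons is not described, so the function names and the use of pixel centers below are assumptions.

```python
def point_in_polygon(x, y, poly):
    """Even-odd ray casting test; poly is a list of (x, y) vertices."""
    inside = False
    n = len(poly)
    for k in range(n):
        x1, y1 = poly[k]
        x2, y2 = poly[(k + 1) % n]
        # Does this edge cross the horizontal line through (x, y)?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def region_pixels(outer, holes, width, height):
    """Pixels inside the outer polygon and outside every interior one."""
    pixels = []
    for py in range(height):
        for px in range(width):
            cx, cy = px + 0.5, py + 0.5   # test the pixel center
            if not point_in_polygon(cx, cy, outer):
                continue
            if any(point_in_polygon(cx, cy, hole) for hole in holes):
                continue
            pixels.append((px, py))
    return pixels
```

For example, with a face outline as `outer` and the eyes and lips as `holes`, only the remaining pixels would contribute training patterns.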
The patterns used to train the network are the
chromaticity components (T and S) of the pixels in
the selected region because they can characterize
skin color optimally as explained in Section 2.
In the training process, categories are
committed as the chromaticity components of the
selected region are presented to the Fuzzy ART
neural network a number of times specified by
num_rounds. After the training stage, a series of
committed categories that fulfill the rules fixed by
the parameters are associated with the skin color
characterized by the selected region. In the testing
stage, all the possible values of the chromaticity
components are introduced to the trained network.
In the RGB color space, 8 bits are used to
represent each color coordinate, so there are
$2^{3 \times 8} = 16{,}777{,}216$ colors. Each color in the RGB
space is converted to the TSL space. Each pair of T
and S values is presented to the trained network and
is considered skin color (for all the possible
values of L) if it is assigned to a committed skin color
category. From the testing of all the 16,777,216
colors in the RGB color space, a LUT is created to
give each color one of two different outputs, skin or
non-skin color. Thus, the created LUT can be used
to make real-time skin segmentation in videos
captured by a webcam. For each LUT (skin color
filter), the application saves in a log file the
information regarding the skin filter creation
including configuration parameters, the number of
pixels used to train the network, the number of
created categories, and the number of skin colors.
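The construction of the LUT can be sketched as follows. This is an illustration under several assumptions: the TSL formulas are a commonly cited definition rather than ones given in the paper, the hypothetical predicate `toy_is_skin` stands in for the trained Fuzzy ART network, and the `bits` argument is added only so that the example runs quickly. With `bits=8` the table has the 16,777,216 entries described above.

```python
import math

def rgb_to_ts(R, G, B):
    """Chromaticity (T, S) of an RGB color under the assumed TSL formulas."""
    total = R + G + B
    if total == 0:
        return 0.0, 0.0
    r = R / total - 1.0 / 3.0
    g = G / total - 1.0 / 3.0
    if g == 0:
        T = 0.0
    else:
        T = math.atan(r / g) / (2.0 * math.pi) + (0.25 if g > 0 else 0.75)
    return T, math.sqrt(9.0 / 5.0 * (r * r + g * g))

def build_lut(is_skin_ts, bits=8):
    """Tag every quantized RGB color as skin (1) or non-skin (0)."""
    levels = 1 << bits
    scale = 255.0 / (levels - 1)          # map a level back to 0..255
    lut = bytearray(levels ** 3)
    for r in range(levels):
        for g in range(levels):
            for b in range(levels):
                t, s = rgb_to_ts(r * scale, g * scale, b * scale)
                if is_skin_ts(t, s):
                    lut[(r << (2 * bits)) | (g << bits) | b] = 1
    return lut

def lookup(lut, R, G, B, bits=8):
    """Constant-time skin test for an 8-bit RGB pixel."""
    shift = 8 - bits
    index = ((R >> shift) << (2 * bits)) | ((G >> shift) << bits) | (B >> shift)
    return lut[index] == 1

def toy_is_skin(t, s):
    # Hypothetical thresholds replacing the trained network's decision.
    return 0.5 < t < 0.6 and 0.1 < s < 0.5

lut = build_lut(toy_is_skin, bits=4)      # 4096 entries, fast to build
```

Because the decision depends only on T and S, colors that differ only in luminosity receive the same tag. Once the table is built, classifying a pixel costs a single array access, which is what makes real-time segmentation feasible.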
4.3 Skin Color Filter Performance
The application allows the user to compare the
performance of two skin color filters applied to the
same video. Fig. 1 shows the result of two different
filters. The pixels that keep their original color are
the ones that the corresponding skin filter categorized
as skin; the remaining pixels are painted yellow in the
left image and green in the right image. The
left image is the result of using a filter created with
rho1=0.95 and rho2=0.9. The right image is the
result of using a filter created with rho1=0.9 and
rho2=0.9. The remaining parameters for creating
both filters were the default values. The bigger rho1
is, the more categories Fuzzy ART generates. This
way, with rho1=0.9, 12 categories were generated
and with rho1=0.95, 26 categories were generated. It
can be observed in Fig. 1 that rho1=0.9 gives rise to
a better skin segmentation.
Although skin color detection is good in both
images, with rho1=0.95 the larger number of created
categories causes a significant part of the image
that does not correspond to skin to be incorrectly
segmented as skin, giving rise to an incorrect image
segmentation.
Fig. 2 shows the results of using two filters that
differ in the beta parameter used in their creation
(left image with beta=0.1 and right image with
beta=0.02). The remaining parameters for creating
both filters were the default values. In the video, the
background has some colors similar to skin color, so
the more restrictive value of beta (0.1) gives rise
to a better segmentation because the incorrect
categorization of background colors as skin color is
greatly reduced. In contrast, a small portion
of the face is not categorized as skin due to the
brightness caused by the illumination. This bright facial
region is better categorized by the filter with the less
restrictive value of beta (0.02).
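The comparison in the application is visual. As a rough complement, two filters applied to the same frame could also be compared numerically. The measures below (fraction of pixels tagged as skin and fraction of pixels on which the filters disagree) are assumptions and not part of the described application, and the two rule-based filters are hypothetical stand-ins for LUTs created with different parameters.

```python
def segment(frame, is_skin):
    """frame: rows of (R, G, B) pixels; returns rows of booleans."""
    return [[is_skin(*px) for px in row] for row in frame]

def skin_fraction(mask):
    """Fraction of pixels tagged as skin."""
    flat = [v for row in mask for v in row]
    return sum(flat) / len(flat)

def disagreement(mask_a, mask_b):
    """Fraction of pixels on which two masks differ."""
    pairs = [(a, b) for ra, rb in zip(mask_a, mask_b) for a, b in zip(ra, rb)]
    return sum(a != b for a, b in pairs) / len(pairs)

# Hypothetical stand-ins for a restrictive and a permissive filter.
def strict_filter(R, G, B):
    return R > G > B and R - B > 60

def loose_filter(R, G, B):
    return R > G > B

frame = [[(200, 150, 120), (130, 120, 110)],
         [(20, 30, 200), (128, 128, 128)]]
mask_strict = segment(frame, strict_filter)
mask_loose = segment(frame, loose_filter)
```

In this toy frame the permissive filter accepts a dull background-like pixel that the restrictive one rejects, which mirrors the trade-off between the two beta values discussed above.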
Figure 1: Skin color filter comparison.
Figure 2: Skin color filter comparison.
5 CONCLUSIONS
In this paper, a novel skin color filter application
based on a Fuzzy ART neural network is presented.
A Fuzzy ART neural network is trained with the
chromaticity components of the TSL color space of the
pixels belonging to skin regions in an image. All
the colors are then categorized by the neural network
as belonging to skin or not. Finally, a LUT is created
by assigning one of two outputs, skin or
non-skin color, to every color in the RGB color
space.
Although skin colors present a certain invariance
regarding chromaticity components, changes in the
illumination, the camera, or the motion of the human
body parts can have a significant impact on skin
color appearance. Although the presented
application works with manually selected skin
regions, it can be adapted so that, after an initial face
or hand detection in an image from a video
sequence, an optimal skin color filter is created
for the tracking of the skin regions in successive
frames of the video sequence.
A created skin color filter makes real-time
tracking robust to changes in illumination and
appearance in videos taken in a non-controlled
environment. The developed application allows the
user to create skin color filters for robust monitoring
of human body parts.
REFERENCES
Carpenter, G.A. and Grossberg, S. (1987). A massively
parallel architecture for a self-organizing neural
pattern recognition machine. Computer Vision,
Graphics and Image Processing (CVGIP), 37(1): 54–
115.
Carpenter, G.A., Grossberg, S., Rosen, D.B. (1991). Fuzzy
ART: Fast stable learning and categorization of analog
patterns by an adaptive resonance system. Neural
Networks, 4(6): 759–771.
Chen, D., Liu, Z. (2003). A novel approach to detect and
correct highlighted face region in color image. In:
Proceedings of the IEEE conference on advanced
video and signal based surveillance, 7–12.
FLTK Library. http://www.fltk.org. 2010.
Fu Jie Huang, Tsuhan Chen (2000). Tracking of multiple
faces for human-computer interfaces and virtual
environments. In: Proceedings of the IEEE
International Conference on Multimedia and Expo
2000, Vol. 3, 1563–1566.
González-Ortega, D., Díaz-Pernas, F.J., Martínez-
Zarzuela, M., Antón-Rodríguez, M., Díez-Higuera,
J.F., Boto-Giralda, D. (2010). Real-time hands, face
and facial features detection and tracking: Application
to cognitive rehabilitation tests monitoring. Journal of
Network and Computer Applications, 33(4), 447–466.
Kakumanu, P., Makrogiannis, S., Bourbakis, N. (2007). A
survey of skin-color modeling and detection methods.
Pattern Recognition, 40(3), 1106–1122.
Open Computer Vision Library.
http://sourceforge.net/projects/opencvlibrary. 2010.
Phung, S.L., Bouzerdoum, A., Chai, D. (2005). Skin
segmentation using color pixel classification: analysis
and comparison. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 27(1): 148–154.
Schmugge, S.J., Adeel Zaffar, M., Tsap, L.V., Shin, M.C.
(2007). Task-based evaluation of skin detection for
communication and perceptual interfaces. Journal of
Visual Communication and Image Representation,
18(6), 487–495.