SELF ORGANIZING NEURAL NETWORK APPLICATION FOR

SKIN COLOR SEGMENTATION

D. González-Ortega, F. J. Díaz-Pernas, M. Antón-Rodríguez, M. Martínez-Zarzuela,

I. de la Torre-Díez, J. F. Díez-Higuera and D. Boto-Giralda

Department of Signal Theory, Communications and Telematics Engineering

Telecommunications Engineering School, University of Valladolid, Valladolid, Spain

Keywords: Fuzzy ART Neural Network, Skin Color Filter, TSL Color Space.

Abstract: In this paper, we present a Fuzzy ART (Adaptive Resonance Theory) neural network application for skin

color segmentation using the chromaticity components of the TSL color space. The Fuzzy ART networks

deal with the stability-plasticity dilemma and they can be applied to color image segmentation, particularly

to skin color segmentation. The developed application has three modes: parameter setting, skin color filter

creation, and skin color filter performance. Many parameters can be tuned to create proper skin color filters

from manually selected skin regions in an image. A skin color filter is a LUT (Look-Up Table) that assigns

each color in the RGB color space one of two outputs: skin or non-skin. The performance of

different skin color filters can be compared within the application. A skin color filter can be used to perform

robust real-time skin color segmentation in video sequences captured by a webcam.

1 INTRODUCTION

Many computer vision applications such as

surveillance and access control systems, person

identification and Human-Computer Interfaces

(HCI) require the detection, localization, and

tracking of human body parts. These tasks in a

complex computer vision application stress the

importance of a low processing time as real-time

working is usually a constraint in the global

application. Moreover, reliable computer vision

applications need robustness to changes in

illumination, background, and objects of interest.

In this context, color is widely used for the

detection and tracking of human body parts as it is

robust to geometric changes and fast to calculate.

Phung et al. (2005) presented an analysis of three

important issues for the skin color segmentation:

color representation, color quantization, and

classification algorithm. Schmugge et al. (2007)

studied the performance of five color space

transformations with and without the luminosity

component for skin segmentation.

In this paper, we present a novel application to

make skin color filters from the chromaticity

components of the TSL color space of the pixels

within manually selected skin regions in an image. A

skin color filter is a LUT so that every color in the

RGB color space is tagged as skin or non-skin color.

The application to create the LUT is based on a

Fuzzy ART neural network (Carpenter et al., 1991).

This is a self-organizing neural network

that is both stable and plastic, which makes it

suitable for color image segmentation. The created

skin color filter can be used to segment skin color

in video sequences captured by an

inexpensive Universal Serial Bus (USB) camera, and it is robust to

changes in illumination and to motion of the human

body parts in an unrestricted environment.

The rest of the paper is organized as follows. In

Section 2, we present the literature regarding skin

color filtering and the TSL color space, whose

chromaticity components are used to characterize

skin regions in images. Section 3 deals with the

Fuzzy ART neural network and the reasons why we

use it to classify skin colors. Next, the skin color

filter application is presented in Section 4 along with

its three working modes: parameter setting, skin

color filter creation, and skin color filter

performance. Finally, Section 5 draws the

conclusions about the developed skin color

application.


González-Ortega D., J. Díaz-Pernas F., Antón-Rodríguez M., Martínez-Zarzuela M., de la Torre-Díez I., F. Díez-Higuera J. and Boto-Giralda D..

SELF ORGANIZING NEURAL NETWORK APPLICATION FOR SKIN COLOR SEGMENTATION.

DOI: 10.5220/0003083304250428

In Proceedings of the International Conference on Fuzzy Computation and 2nd International Conference on Neural Computation (ICNC-2010), pages

425-428

ISBN: 978-989-8425-32-4

© 2010 SCITEPRESS (Science and Technology Publications, Lda.)

2 SKIN COLOR SEGMENTATION

Color filtering is a powerful tool in computer vision

tasks. It is a low level feature, highly discriminative,

computationally fast, and robust to geometric

changes that can be applied in the definition of

human body parts. Many studies evaluating color

spaces for skin detection have been carried out

(Kakumanu et al., 2007; Phung et al., 2005). Color

can be decomposed into three different components,

one luminosity and two chromaticity components.

Although skin color can vary notably from one

person to another, or even within the same

person due to factors such as a suntan or a blush,

several studies have shown that skin colors exhibit

a certain invariance in their chromaticity

components, even across people

of different ethnic groups (Fu Jie Huang and Tsuhan

Chen, 2000). Other factors such as lighting or skin

tone mainly affect the luminosity component.

Different color spaces separating luminosity and

chromaticity components have been used for skin

color segmentation (YIQ, CIE-LAB, CIE-LUV,

HSV, HIS, and TSL) (Kakumanu et al., 2007; Phung

et al., 2005). TSL has been selected as the best color

space to extract skin color from complex

backgrounds (Chen and Liu, 2003) because it has the

advantage of extracting a given color robustly while

minimizing illumination influence.

In our application, the chromaticity components

of the TSL color space are used to create the skin color

filter.
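
For reference, a minimal sketch of the RGB-to-TSL conversion follows. The formulas are not listed in this paper, so the code assumes the commonly cited definition of the TSL space: normalized chromaticities shifted to the gray point, tint as an angle, saturation as a scaled radius, and luminosity as a weighted sum of R, G, and B. The function name `rgb_to_tsl`, the handling of black, and the scaling of L to [0, 1] are choices made for this illustration only.

```python
import math


def rgb_to_tsl(R, G, B):
    """Convert an 8-bit RGB color to (T, S, L).

    Assumed definition (not given in the paper):
      r' = R/(R+G+B) - 1/3,  g' = G/(R+G+B) - 1/3
      S  = sqrt(9/5 * (r'^2 + g'^2))
      T  = atan(r'/g')/(2*pi) + 1/4  if g' > 0
           atan(r'/g')/(2*pi) + 3/4  if g' < 0
           0                         if g' == 0
      L  = 0.299 R + 0.587 G + 0.114 B   (scaled to [0, 1] here)
    """
    total = R + G + B
    if total == 0:
        # Black has no defined chromaticity; returning zeros is a convention
        # chosen for this sketch.
        return 0.0, 0.0, 0.0

    # Normalized chromaticities, shifted so that gray maps to (0, 0).
    r_p = R / total - 1.0 / 3.0
    g_p = G / total - 1.0 / 3.0

    # Saturation: distance from the gray point.
    S = math.sqrt(9.0 / 5.0 * (r_p * r_p + g_p * g_p))

    # Tint: angle of the chromaticity vector, mapped into [0, 1).
    if g_p > 0:
        T = math.atan(r_p / g_p) / (2.0 * math.pi) + 0.25
    elif g_p < 0:
        T = math.atan(r_p / g_p) / (2.0 * math.pi) + 0.75
    else:
        T = 0.0

    # Luminosity: weighted sum of the three components.
    L = (0.299 * R + 0.587 * G + 0.114 * B) / 255.0
    return T, S, L


if __name__ == "__main__":
    print(rgb_to_tsl(200, 150, 120))
    print(rgb_to_tsl(100, 75, 60))
```

Under this definition T and S depend only on the ratios between R, G, and B, so halving the brightness of a color changes L but leaves T and S unchanged. This is the property that makes the chromaticity components attractive for skin color filtering.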

3 FUZZY ART NEURAL

NETWORKS

The ART neural network architecture is a self-organizing

network that can switch between a

learning or plastic state (in which the network

parameters may be modified) and a stable or fixed

state for operation (Carpenter and Grossberg, 1987).

Fuzzy ART is an extension of the ART1 system

(Carpenter et al., 1991) that allows analog inputs

with values between 0 and 1. The most important

point in the application of an ART network to color

image segmentation, and especially skin color

segmentation, is the stability-plasticity dilemma. The

ART network has plasticity in order to learn new

information, while it has stability in order not to

forget the already learned information. ART

networks are able to obtain stability without

sacrificing plasticity (Carpenter and Grossberg,

1987).

In our application, a Fuzzy ART neural network

is used to categorize each color as skin or non-skin.
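
The following is a minimal sketch of Fuzzy ART category learning as formulated by Carpenter et al. (1991): complement coding, a choice function, a vigilance test, and a weight update. It is not the code of the application described here. The class and function names, the value of the choice parameter `alpha`, and the use of fast learning (`learning_rate = 1.0`) are assumptions made for illustration.

```python
def complement_code(x):
    """Map a vector in [0, 1]^M to [x, 1 - x]; its city-block norm is then M."""
    return list(x) + [1.0 - v for v in x]


def fuzzy_and(a, b):
    """Component-wise minimum (the fuzzy AND operator)."""
    return [min(u, v) for u, v in zip(a, b)]


def norm1(a):
    """City-block norm of a non-negative vector."""
    return sum(a)


class FuzzyART:
    def __init__(self, alpha=0.001, learning_rate=1.0):
        self.alpha = alpha                  # choice parameter (assumed value)
        self.learning_rate = learning_rate  # 1.0 corresponds to fast learning
        self.weights = []                   # one weight vector per committed category

    def _ranked(self, i):
        # Category indices sorted by decreasing choice value |i ^ w| / (alpha + |w|).
        scores = [norm1(fuzzy_and(i, w)) / (self.alpha + norm1(w))
                  for w in self.weights]
        return sorted(range(len(scores)), key=lambda j: -scores[j])

    def _match(self, i, w):
        # Match value |i ^ w| / |i|, compared against the vigilance parameter.
        return norm1(fuzzy_and(i, w)) / norm1(i)

    def learn(self, x, rho):
        """Present one pattern; update a resonating category or commit a new one."""
        i = complement_code(x)
        for j in self._ranked(i):
            w = self.weights[j]
            if self._match(i, w) >= rho:
                m = fuzzy_and(i, w)
                lr = self.learning_rate
                self.weights[j] = [lr * a + (1.0 - lr) * b for a, b in zip(m, w)]
                return j
        self.weights.append(i)  # no category passed the vigilance test
        return len(self.weights) - 1

    def classify(self, x, rho):
        """Return the matching category index without learning, or None."""
        i = complement_code(x)
        for j in self._ranked(i):
            if self._match(i, self.weights[j]) >= rho:
                return j
        return None


def train(net, patterns, rho, epochs=1):
    """Present every pattern to the network `epochs` times."""
    for _ in range(epochs):
        for x in patterns:
            net.learn(x, rho)


if __name__ == "__main__":
    data = [(0.5, 0.3), (0.52, 0.31), (0.1, 0.9)]
    net = FuzzyART()
    train(net, data, rho=0.9, epochs=2)
    print(len(net.weights), net.classify((0.51, 0.3), rho=0.9))
```

Plasticity appears in `learn`, which commits a new category whenever no existing one passes the vigilance test. Stability appears in the update rule, which can only shrink a weight vector toward the patterns it already represents. Raising `rho` makes the vigilance test stricter and therefore tends to produce more categories.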

4 SKIN COLOR FILTER

APPLICATION

We have developed a skin color filter application so

that a filter, which classifies every color as being

skin color or not, is created from skin regions of an

image manually selected by the user. A Fuzzy ART

neural network is used to create the skin color filter.

The inputs to the neural network are the

chromaticity components (T and S) of the TSL color

space of all the pixels inside the skin region of an

image. The TSL color space was selected as it

achieved the best skin color filtering performance in

an experimental study including the normalized rgb,

YIQ, HIS, CIELUV, and TSL space (González-

Ortega et al., 2010).

The application has been developed using the

Intel Open Source Computer Vision Library (Open

Computer Vision Library, 2010) and the FLTK GUI

Library (FLTK Library, 2010). The application has

three working modes: 1) parameter setting, 2) skin

color filter creation, 3) skin color filter performance.

4.1 Parameter Setting

The application allows the user to select a series of

parameters that influence skin color filter creation.

The parameters are grouped in four sets:

preprocessing, histogram setting, Fuzzy ART neural

network setting, and skin color filter setting. Default

values of the parameters were fixed after extensive

experiments. These values give good

performance on images taken under different

conditions regarding the webcam, the illumination,

and the user.

The main preprocessing parameters are cr, cg,

and cb. They fix the contribution of each component

in the luminosity calculation of each pixel.

The histogram setting allows the user to specify whether

a selected skin region is too bright, bright, normal,

dark, or too dark.

The main parameters of the Fuzzy ART neural

network setting are:

• Num_rounds: number of times that the selected

region is introduced in the neural network.

Default value is 2.

• Rho1: vigilance parameter of the network in the

training stage. Values close to 0 imply fewer


categories because the network groups patterns using

a loose similarity criterion. Values close to 1

imply more categories, each one with very

similar patterns. Default value is 0.9.

• Rho2: vigilance parameter of the network in the

testing stage. Values close to 0 imply a loose

similarity criterion. Values close to 1 require

patterns to be very similar to the created

categories to be included within them. Default

value is 0.9.

The main parameters in the skin color filter creation

are:

• beta: percentage of the patterns that a category

has to exceed with respect to the patterns in the

category with the largest number of colors so

that the category can be considered a skin color

category. Default value is 0.02.

• delta: percentage of patterns that the category

with the largest number of colors classified by

the Fuzzy ART network has to exceed. Large

values of delta imply that the regions have to be

very homogeneous regarding color. Default

value is 0.05.

• epsilon: upper limit of the Euclidean distance

between a category and the biggest category so

that the category can be considered in the

classification process of the colors. Large values

of epsilon imply that the category can be more

separated so that more different categories are

taken into account. Default value is 0.2.

• mu: contribution of the saturation component of

the TSL color space in the Euclidean distance

calculation among categories. Default value is

After selecting the values of all the parameters, a file

can be saved with this information so that these

values can then be used in the filter creation.
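
As an illustration of how these parameters might be stored and applied, the sketch below keeps them in a small structure, saves them to a file, and selects skin categories after training. The JSON file format, the names `FilterParams`, `save_params`, `load_params`, and `select_skin_categories`, and the representation of each category by a single (T, S) center are assumptions. The selection rules are one possible reading of the descriptions of beta, delta, epsilon, and mu above, not the authors' confirmed implementation. No default is assigned to `mu` because its default value is not stated in the text.

```python
import json
import math
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class FilterParams:
    num_rounds: int = 2
    rho1: float = 0.9
    rho2: float = 0.9
    beta: float = 0.02
    delta: float = 0.05
    epsilon: float = 0.2
    mu: Optional[float] = None  # default not given in the text; must be set


def save_params(params, path):
    """Write the parameters to a JSON file (file format is an assumption)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(asdict(params), f, indent=2)


def load_params(path):
    """Read parameters previously written by save_params."""
    with open(path, "r", encoding="utf-8") as f:
        return FilterParams(**json.load(f))


def select_skin_categories(centers, counts, p):
    """Return indices of categories kept as skin under one reading of the rules.

    centers: list of (T, S) representatives, one per category (assumption).
    counts:  number of training patterns assigned to each category.
    """
    if p.mu is None:
        raise ValueError("mu must be set explicitly")
    total = sum(counts)
    biggest = max(range(len(counts)), key=counts.__getitem__)

    # delta: the biggest category must hold more than this share of all patterns.
    if counts[biggest] <= p.delta * total:
        return []

    kept = []
    t0, s0 = centers[biggest]
    for j, ((t, s), n) in enumerate(zip(centers, counts)):
        # mu weights the saturation term of the distance to the biggest category.
        dist = math.sqrt((t - t0) ** 2 + p.mu * (s - s0) ** 2)
        # beta: size relative to the biggest category; epsilon: distance limit.
        if n > p.beta * counts[biggest] and dist <= p.epsilon:
            kept.append(j)
    return kept


if __name__ == "__main__":
    params = FilterParams(mu=1.0)  # mu=1.0 is an arbitrary illustrative value
    centers = [(0.55, 0.20), (0.56, 0.22), (0.90, 0.80), (0.55, 0.21)]
    counts = [1000, 300, 200, 5]
    print(select_skin_categories(centers, counts, params))
```

In the example, the third category is rejected because it lies too far from the biggest one, and the fourth because it holds too few patterns.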

4.2 Skin Color Filter Creation

In this mode, a frame of a video sequence captured

by a webcam can be selected. In this frame, the skin

regions have to be manually selected. With the

computer mouse, the user selects the vertices of the

polygon that delimits each skin region in

the image. If a region is selected inside a

previously selected region, the interior region is

excluded from the exterior region. This is

useful if the user wants to select the face

while excluding the eyes, lips, or beard. To begin the

process of creating a skin filter, a parameter

configuration file has to be selected. With the values

that appear in this file and the selected region, the

Fuzzy ART neural network is trained.
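
The region selection described above can be sketched with a standard point-in-polygon test. The code below uses the even-odd ray casting rule and treats interior polygons as holes. The function names and the choice of testing pixel centers are assumptions made for illustration; the application's actual selection code is not described in the paper.

```python
def point_in_polygon(x, y, vertices):
    """Even-odd ray casting: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(vertices)
    for k in range(n):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def in_skin_region(x, y, outer, holes=()):
    """True if the point lies inside the outer polygon and outside every hole."""
    if not point_in_polygon(x, y, outer):
        return False
    return not any(point_in_polygon(x, y, h) for h in holes)


def region_pixels(width, height, outer, holes=()):
    """List the (x, y) pixels whose centers fall in the selected region."""
    return [(x, y)
            for y in range(height)
            for x in range(width)
            if in_skin_region(x + 0.5, y + 0.5, outer, holes)]


if __name__ == "__main__":
    face = [(0, 0), (10, 0), (10, 10), (0, 10)]
    eye = [(4, 4), (6, 4), (6, 6), (4, 6)]
    print(len(region_pixels(10, 10, face, holes=[eye])))
```

The (T, S) values of the pixels returned by `region_pixels` would then form the training patterns.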

The patterns used to train the network are the

chromaticity components (T and S) of the pixels in

the selected region because they can characterize

skin color optimally as explained in Section 2.

In the training process, categories are

committed as the chromaticity components of the

selected region are presented to the Fuzzy ART

neural network the number of times specified by

num_rounds. After the training stage, a series of

committed categories that fulfill the rules fixed by

the parameters are associated with the skin color

characterized by the selected region. In the testing

stage, all the possible values of the chromaticity

components are introduced to the trained network.

From the RGB color space, 8 bits are used to

represent each color coordinate, so that there are

$2^{3 \times 8} = 2^{24} = 16{,}777{,}216$ colors. Each color in the RGB

space is converted to the TSL space. Each pair of T

and S values is presented to the trained network and

is considered skin color (for all possible

values of L) if it is assigned to a committed skin color

category. From the testing of all the 16,777,216

colors in the RGB color space, a LUT is created to

give each color one of two different outputs, skin or

non-skin color. Thus, the created LUT can be used

to perform real-time skin segmentation in videos

captured by a webcam. For each LUT (skin color

filter), the application saves in a log file the

information regarding the skin filter creation

including configuration parameters, the number of

pixels used to train the network, the number of

created categories, and the number of skin colors.
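
The LUT construction described in this section can be sketched as follows. The trained network is replaced by a hypothetical predicate `is_skin_ts(T, S)`, and the RGB-to-(T, S) conversion assumes the commonly cited TSL definition, which is not listed in this paper. The flat `bytearray` layout, the `levels` argument, and all function names are assumptions for illustration. With `levels=256` the loop visits all 16,777,216 colors, which is slow in pure Python, so the demonstration uses a coarser grid.

```python
import math


def rgb_to_ts(r, g, b):
    """Return (T, S) for an RGB triple, or None for black (undefined chromaticity)."""
    total = r + g + b
    if total == 0:
        return None
    rp = r / total - 1.0 / 3.0
    gp = g / total - 1.0 / 3.0
    s = math.sqrt(9.0 / 5.0 * (rp * rp + gp * gp))
    if gp > 0:
        t = math.atan(rp / gp) / (2.0 * math.pi) + 0.25
    elif gp < 0:
        t = math.atan(rp / gp) / (2.0 * math.pi) + 0.75
    else:
        t = 0.0
    return t, s


def lut_index(r, g, b, levels=256):
    """Position of a color in the flat table."""
    return (r * levels + g) * levels + b


def build_lut(is_skin_ts, levels=256):
    """Tag every RGB color as skin (1) or non-skin (0).

    Luminosity is ignored, so all colors sharing the same (T, S) pair
    receive the same label.
    """
    lut = bytearray(levels ** 3)  # initialized to 0, that is, non-skin
    for r in range(levels):
        for g in range(levels):
            for b in range(levels):
                ts = rgb_to_ts(r, g, b)
                if ts is not None and is_skin_ts(*ts):
                    lut[lut_index(r, g, b, levels)] = 1
    return lut


if __name__ == "__main__":
    # Hypothetical stand-in for the trained network: a box in the (T, S) plane.
    def demo_rule(t, s):
        return 0.5 <= t <= 0.6 and 0.1 <= s <= 0.4

    small = build_lut(demo_rule, levels=8)  # coarse grid, for speed only
    print(sum(small), "of", len(small), "colors tagged as skin")
```

Once the table exists, classifying a pixel requires a single lookup, which is what makes real-time use feasible.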

4.3 Skin Color Filter Performance

The application allows the user to compare the performance

of two skin color filters applied to the same video.

Fig. 1 shows the result of two different filters. The

pixels that are not painted over (yellow in the left image

and green in the right image) are the ones that the

corresponding skin filter categorized as skin. The

left image is the result of using a filter created with

rho1=0.95 and rho2=0.9. The right image is the

result of using a filter created with rho1=0.9 and

rho2=0.9. The remaining parameters for creating

both filters were the default values. The bigger rho1

is, the more categories Fuzzy ART generates. This

way, with rho1=0.9, 12 categories were generated

and with rho1=0.95, 26 categories were generated. It

can be observed in Fig. 1 that rho1=0.9 gives rise to

a better skin segmentation.

Although skin color detection is good in both

images, with rho1=0.95 the larger number of created

categories causes a significant part of the image

that does not correspond to skin to be incorrectly


segmented as skin, giving rise to an incorrect image

segmentation.

Fig. 2 shows the results of using two filters that

differ in the beta parameter used in their creation

(left image with beta=0.1 and right image with

beta=0.02). The remaining parameters for creating

both filters were the default values. In the video, the

background has some colors similar to skin color, so

the more restrictive value of beta (0.1) gives rise

to a better segmentation because the incorrect

categorization of background colors as skin color is

greatly reduced. In contrast, there is a small portion

of the face that is not categorized as skin due to the

brightness caused by the illumination. This bright facial

region is better categorized by the filter with the less

restrictive value of beta (0.02).

Figure 1: Skin color filter comparison.

Figure 2: Skin color filter comparison.
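
A comparison of this kind can be sketched as below: two LUTs are applied to the same frame and simple agreement statistics are computed. The frame is represented as nested lists of (R, G, B) tuples, and the function names and the reported statistics are assumptions for illustration. The application itself presents the comparison visually, as in Figs. 1 and 2.

```python
def lut_index(r, g, b):
    """Position of an 8-bit RGB color in a flat table of 2^24 entries."""
    return (r << 16) | (g << 8) | b


def segment(frame, lut):
    """Return a mask with 1 for skin pixels and 0 otherwise (one lookup per pixel)."""
    return [[lut[lut_index(r, g, b)] for (r, g, b) in row] for row in frame]


def compare(mask_a, mask_b):
    """Fraction of skin pixels under each filter and fraction of disagreements."""
    n = sum(len(row) for row in mask_a)
    skin_a = sum(sum(row) for row in mask_a)
    skin_b = sum(sum(row) for row in mask_b)
    differ = sum(a != b
                 for row_a, row_b in zip(mask_a, mask_b)
                 for a, b in zip(row_a, row_b))
    return {
        "skin_fraction_a": skin_a / n,
        "skin_fraction_b": skin_b / n,
        "disagreement": differ / n,
    }


if __name__ == "__main__":
    # Two toy filters: filter B accepts one more color than filter A.
    lut_a = bytearray(1 << 24)
    lut_b = bytearray(1 << 24)
    lut_a[lut_index(200, 150, 120)] = 1
    lut_b[lut_index(200, 150, 120)] = 1
    lut_b[lut_index(180, 160, 140)] = 1

    frame = [[(200, 150, 120), (180, 160, 140)],
             [(0, 0, 255), (10, 10, 10)]]
    print(compare(segment(frame, lut_a), segment(frame, lut_b)))
```

Without ground-truth labels these numbers only show how the two filters differ, not which one is better; that judgment is made by inspecting the segmented images.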

5 CONCLUSIONS

In this paper, a novel skin color filter application

based on a Fuzzy ART neural network is presented.

From the chromaticity components of the TSL color

space of the pixels belonging to skin regions in an

image, a Fuzzy ART neural network is trained. All

the colors are then categorized by the neural network

as belonging to skin or not. Finally, a LUT is created

by assigning one out of two different outputs, skin or

non-skin color to all the colors in the RGB color

space.

Although skin colors present a certain invariance

regarding chromaticity components, changes in the

illumination, the camera, or the motion of the human

body parts can have a significant impact on skin

color appearance. Even though the presented

application relies on manually selected skin

regions, it can be adapted so that after an initial face

or hands detection in an image from a video

sequence, the optimal skin color filter can be created

for the tracking of the skin regions in successive

frames of the video sequence.

A created skin color filter makes real-time

tracking robust to changes in illumination and

appearance in videos taken in a non-controlled

environment. The developed application allows the user to

create skin color filters for robust monitoring of human body

parts.

REFERENCES

Carpenter, G.A. and Grossberg, S. (1987). A massively

parallel architecture for a self-organizing neural

pattern recognition machine. Computer Vision,

Graphics and Image Processing (CVGIP), 37(1): 54–

115.

Carpenter, G.A., Grossberg, S., Rosen, D.B. (1991). Fuzzy

ART: Fast stable learning and categorization of analog

patterns by an adaptive resonance system. Neural

Networks, 4(6): 759–771.

Chen, D., Liu, Z. (2003). A novel approach to detect and

correct highlighted face region in color image. In:

Proceedings of the IEEE conference on advanced

video and signal based surveillance, 7–12.

FLTK Library. http://www.fltk.org. 2010.

Fu Jie Huang, Tsuhan Chen (2000). Tracking of multiple

faces for human-computer interfaces and virtual

environments. In: Proceedings of the IEEE

International Conference on Multimedia and Expo

2000, Vol. 3, 1563–1566.

González-Ortega, D., Díaz-Pernas, F.J., Martínez-

Zarzuela, M., Antón-Rodríguez, M., Díez-Higuera,

J.F., Boto-Giralda, D. (2010). Real-time hands, face

and facial features detection and tracking: Application

to cognitive rehabilitation tests monitoring. Journal of

Network and Computer Applications, 33(4), 447–466.

Kakumanu, P., Makrogiannis, S., Bourbakis, N. (2007). A

survey of skin-color modeling and detection methods.

Pattern Recognition, 40(3), 1106–1122.

Open Computer Vision Library.

http://sourceforge.net/projects/opencvlibrary. 2010.

Phung, S.L., Bouzerdoum, A., Chai, D. (2005). Skin

segmentation using color pixel classification: analysis

and comparison. IEEE Transactions on Pattern

Analysis and Machine Intelligence, 27(1): 148–154.

Schmugge, S.J., Adeel Zaffar, M., Tsap, L.V., Shin, M.C.

(2007). Task-based evaluation of skin detection for

communication and perceptual interfaces. Journal of

Visual Communication and Image Representation,

18(6), 487–495.

ICFC 2010 - International Conference on Fuzzy Computation
