FACIAL POLYGONAL PROJECTION

A new feature extracting method to help in neural face detection

Adriano Martins Moutinho, Antonio Carlos Gay Thomé

Núcleo de computação eletrônica, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brasil

Pedro Henrique Gouvêa Coelho

Faculdade de Engenharia, Universidade do Estado do Rio de Janeiro, Rio de Janeiro, Brasil.

Keywords: Face detection, polygonal projection, neural network, image processing.

Abstract: Locating a human face in a photograph is a complex task that requires several image and signal processing techniques. This paper proposes a new approach called polygonal facial projection that, by measuring specific distances on the image, extracts relevant features and improves the efficiency of neural face identification systems (Rowley, 1999) (Moutinho and Thomé, 2004), facilitating the separation of facial patterns from other objects present in the image.

1 INTRODUCTION

In order to successfully recognize and identify a

human face, it is first necessary to find the position

of the face in a wider scenario, where other objects

and a complex background are often present.

(Rowley, 1999) and (Moutinho and Thomé,

2004) proposed neural face identification systems

that employed neural networks and several image

preprocessing techniques.

In the aforementioned works, artificial neural

networks (ANN) (Haykin, 1994) were fed with

luminance values of a grayscale image. Based on

these values the ANN could decide if the image

constituted a human face or not.

However, articles such as (Vianna and

Rodrigues, 2000) show that it is possible to improve

neural network recognition capabilities by extracting

specific image features, instead of simply using

luminance values directly as ANN input.

This paper proposes a modification in the neural

network detection schemes proposed in (Rowley,

1999) and (Moutinho and Thomé, 2004), by

introducing a new technique to extract image

characteristics – the polygonal projection.

The polygonal projection used in this paper is an

adaptation of the method proposed in (Vianna and

Rodrigues, 2000) to aid in face identification. It is

based on the measurement of distances between the

image limits and specific singular points.

The adaptation of polygonal projection presented here can also be used in other pattern identification problems using neural networks. It is suitable for any grayscale image, not only for binary black and white images, as is the case with the original technique and other projection methods.

Section 3 describes the polygonal projection

method and section 4 shows results and conclusions.

More details about the face detection system can be found in (Moutinho and Thomé, 2004); the polygonal projection described in this paper improves its capability.

2 FACE DETECTION SYSTEM

The face detection system can be divided in four

stages, as shown in figure 1.

The windowing process splits the original image

into several square subimages. Every subimage is

considered a face candidate.
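As an illustration only, the windowing stage could be sketched as below. The window size of 19 pixels is borrowed from the 19x19 database mentioned later in the paper; the step of 4 pixels and the function name `sliding_windows` are assumptions made for this sketch, not details of the original system.

```python
import numpy as np

def sliding_windows(image, size=19, step=4):
    """Yield (row, col, window) for every size x size subimage.

    `size` and `step` are illustrative assumptions, not values
    confirmed by the paper.
    """
    rows, cols = image.shape
    for r in range(0, rows - size + 1, step):
        for c in range(0, cols - size + 1, step):
            # Each square window is treated as one face candidate.
            yield r, c, image[r:r + size, c:c + size]

# Usage: a blank 40x60 grayscale image stands in for a photograph.
image = np.zeros((40, 60))
candidates = list(sliding_windows(image))
print(len(candidates), candidates[0][2].shape)
```

Each yielded window would then go through the preprocessing and encoding phase before reaching the neural network.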

The preprocessing and encoding phase is a

collection of processes applied in order to adapt the

original image to a neural network input. Polygonal

projection is one of these processes.

The neural network phase represents a Multi-Layer

Perceptron (MLP) previously trained to detect

facial patterns, and the last phase performs a fine

adjustment in face position. More details about the

face identification system can be found in (Moutinho

and Thomé, 2004).


Martins Moutinho A., Carlos Gay Thomé A. and Henrique Gouvêa Coelho P. (2005).

FACIAL POLYGONAL PROJECTION - A new feature extracting method to help in neural face detection.

In Proceedings of the Seventh International Conference on Enterprise Information Systems, pages 419-422

DOI: 10.5220/0002539404190422

 SciTePress

Figure 1: Face detection system

3 POLYGONAL PROJECTION

When a neural network is used in image pattern

recognition, it is necessary to extract features from

the images to feed the neural network.

Although raw bitmap values are often used as the input

features of a neural network, they are not always adequate.

Changes in size, position or rotation in the image are

likely to change most values in the bitmap, but are

unlikely to change the class of the image.

If, for example, a face image is slightly shifted to

the left, most features extracted from bitmap values

will probably change. However, that will not change

the fact that it is a face. The neural network will

have to generalize these differences in order to

achieve good results.

This paper proposes a method of feature

extraction to help in image size and position

normalization. It is called polygonal face projection

and adapts a technique previously used in (Vianna and

Rodrigues, 2000). In (Vianna and Rodrigues, 2000), a black

and white handwritten character is placed inside of a

polygon. Then, distances between the polygon’s

sides and the first black pixel in the character are

computed. A set of these distances is used to

represent the character to the neural network. This

method, called polygonal character projection,

improves generalization in neural network character

recognition (Vianna and Rodrigues, 2000).

However, it is not possible to apply the polygonal projection proposed in (Vianna and Rodrigues, 2000) directly to neural face detection. Faces are grayscale images, and converting them to black and white will probably cause the loss of relevant information.

This paper proposes an adaptation of the polygonal character projection method to allow its use in a grayscale image, where there is no simple method to measure the distances between the first black pixel and the polygon’s side. For this purpose, the concept of projection energy is introduced.

Projection energy is a previously defined number from which the image luminance values are subtracted along a given projection direction. Figure 2 defines a projection direction: the luminance values of the pixels found in the direction of the arrow are successively subtracted from the energy value.

Figure 2: Projection direction in a face.

Thus, starting at the border of the image in figure 2, the luminance of each pixel is subtracted from the energy in the projection direction. When the luminance of a new pixel is subtracted and the resulting energy becomes zero or less, the distance between the initial point and this zero point is computed. This is the projection distance, and it is considered a feature extracted from the original face.
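A minimal sketch of a single projection is given below, assuming luminance values normalized to the range [0, 1] and a distance measured in pixels from the border. The function name `projection_distance` and the choice to return the full line length when the energy is never exhausted are assumptions of this sketch; the paper does not specify that case.

```python
def projection_distance(line, energy=1.0):
    """Return the projection distance along one line of pixels.

    `line` holds luminance values in [0, 1], ordered from the image
    border inward. Each value is subtracted from the remaining energy;
    the distance is the number of pixels visited when the energy first
    reaches zero or less. If that never happens, the length of the
    line is returned (an assumption of this sketch).
    """
    remaining = energy
    for distance, luminance in enumerate(line, start=1):
        remaining -= luminance
        if remaining <= 0:
            return distance
    return len(line)

# Usage: the third pixel exhausts an initial energy of 1.
print(projection_distance([0.25, 0.25, 0.5, 0.9]))  # 3
```

High luminance values exhaust the energy quickly and give short distances, while a run of low values lets the projection travel further.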

Polygonal projection with the energy concept is related to the x-ray feature extraction used in medicine. In that case, an x-ray emitter exposes a special film according to the blocking characteristics of the objects. Bones, for example, usually block x-ray emission, making the film white.

In a polygonal face projection, higher values of

luminance will block projection and result in lower

distance values. On the other hand, if only lower

luminance pixels are found in the projection

direction, the distance extracted will be higher.

In the case of face detection, inverting the image

before extracting distances using polygonal

projection could lead to better results. As a result of

image inversion, black areas will block projection

and white areas will not.

The motivation for the inversion of the image can be seen in figure 2. The eye regions are likely to be darker than the rest of the face image, which facilitates their detection, since after inversion they will probably block projection. The mouth and nose areas are also likely to be darker than the rest of the face image. Image inversion therefore allows face features such as eyes, mouth and nose to be detected by polygonal projection, because they will block projection.

Detecting eyes, mouth and nose position is an

important step to successfully detect a face.

Another adaptation in the polygonal projection

method is to square every element in the original

image before projection. Squaring strongly reduces

values in the range between 0 and 0.5, which prevents

sequences of low luminance values from draining the

energy. Only values higher than 0.5 will continue to

block projection.

Squaring elements in the original image is also

related to the way x-ray emission is exponentially

attenuated by objects (Jain, 1989).
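The inversion and squaring steps could be sketched as follows. This is only an illustration: it assumes luminance values normalized to [0, 1] and that inversion is applied before squaring (the text does not state the order), and the function name `prepare_for_projection` is hypothetical.

```python
import numpy as np

def prepare_for_projection(gray):
    """Invert a grayscale image and square every element.

    `gray` is assumed to hold luminance values in [0, 1]. After
    inversion, dark regions (eyes, mouth, nose) carry high values;
    squaring then shrinks values below 0.5 much more than values
    close to 1.
    """
    inverted = 1.0 - np.asarray(gray, dtype=float)
    return inverted ** 2

# Usage: a black, a mid-gray and a white pixel.
print(prepare_for_projection([[0.0, 0.5, 1.0]]))
```

For example, an inverted value of 0.3 becomes 0.09 after squaring, while 0.9 only drops to 0.81, so mid-gray regions consume far less projection energy than dark facial features.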

As in (Vianna and Rodrigues, 2000), choosing

the polygon defines all projection directions. In this

paper, a square will be used as the base polygon;

distances will then be extracted orthogonally to the

square sides, as shown in figure 3.

ICEIS 2005 - ARTIFICIAL INTELLIGENCE AND DECISION SUPPORT SYSTEMS


Figure 3: Projection directions using a square as the base

polygon

However, face images are likely to have

information in the picture center that is relevant to

classification. In order to avoid losing central

information, the image is divided into two parts

horizontally and two parts vertically, so that

projections can be extracted again, as described in figure 4.

Figure 4: Directions of polygonal projection dividing

an image in four parts

Since projections are extracted line by line or

column by column over the image, an HxV sized

image will have a number of features equal to:

$$N_{projections} = 4 \cdot H + 4 \cdot V$$
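The sketch below shows one way to obtain 4H + 4V features from a square base polygon. The first set projects inward from the four sides. For the second set, this sketch assumes that projections start at the central dividing lines and travel outward toward each side; figure 4 may define these directions differently. All function names are hypothetical.

```python
import numpy as np

def projection_distance(line, energy=1.0):
    """Pixels visited until the energy reaches zero or less."""
    remaining = energy
    for distance, value in enumerate(line, start=1):
        remaining -= value
        if remaining <= 0:
            return distance
    return len(line)

def polygonal_features(img, energy=1.0):
    """Return 4*rows + 4*cols projection distances for `img`."""
    rows, cols = img.shape
    mid_r, mid_c = rows // 2, cols // 2
    feats = []
    # First set: from each side of the square toward the interior.
    feats += [projection_distance(r, energy) for r in img]            # left
    feats += [projection_distance(r[::-1], energy) for r in img]      # right
    feats += [projection_distance(c, energy) for c in img.T]          # top
    feats += [projection_distance(c[::-1], energy) for c in img.T]    # bottom
    # Second set (assumed): from the central lines toward each side.
    feats += [projection_distance(r[:mid_c][::-1], energy) for r in img]
    feats += [projection_distance(r[mid_c:], energy) for r in img]
    feats += [projection_distance(c[:mid_r][::-1], energy) for c in img.T]
    feats += [projection_distance(c[mid_r:], energy) for c in img.T]
    return np.array(feats, dtype=float)

# Usage: a 19x19 image gives 4*19 + 4*19 = 152 features.
print(polygonal_features(np.zeros((19, 19))).shape)
```

In practice the image passed to such a function would already be inverted and squared, as described earlier in this section.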

The number of features extracted from the image defines the size of the neural network (Haykin, 1994). It is important to reduce the data size in order to have fast training and processing. In a square image, where H = V, polygonal projection will extract fewer features than the bitmap for images larger than 8x8. The database used in this paper, shown in figure 3, is 19x19 sized. The number of extracted features will be 152 (8 x 19) and the number of values in the luminance matrix will be 361 (19 x 19).

Although the number of features extracted will

be higher using bitmap, it’s possible to show that

polygonal projection features are more relevant to

classification because generalization is improved, as

will be shown in section 4.

Principal Components Analysis (PCA) suggests

that polygonal extraction conveys more information

than bitmap. PCA projects data in a new space

where new variables are statistically uncorrelated

(Zurada, 1999) (Haykin, 1994). By using PCA it’s

possible to reduce the size of training data by

removing low variance variables of the new data

space, keeping most of the original relevant

information.

Table 1 shows PCA application in a database of

face images containing 5000 examples. First PCA is

applied in the database using bitmap features, then in

a database using polygonal projection. In both cases

the database is normalized to have zero mean and

unit variance (Zurada, 1999) (Haykin, 1994).

Using bitmap features, PCA projection can

reduce the number of variables from 361 to 58,

keeping 99% of relevant information. That shows

bitmap representation conveys too much irrelevant

information.

On the other hand, PCA does not greatly reduce the number of variables when polygonal features are used. That suggests polygonal projection keeps more information per feature, which might result in faster training and better generalization.
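The kind of measurement reported here can be sketched with NumPy as below. The sketch assumes that "99% of relevant information" means 99% of the total variance after normalization to zero mean and unit variance; the function name `components_for_variance` and the random data are illustrative only and are not the face database.

```python
import numpy as np

def components_for_variance(data, keep=0.99):
    """Smallest number of principal components retaining `keep`
    of the total variance of the standardized data.

    `data` has shape (n_samples, n_features).
    """
    centered = data - data.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0                 # leave constant features unscaled
    z = centered / std
    # Squared singular values are proportional to the PCA eigenvalues.
    s = np.linalg.svd(z, compute_uv=False)
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)
    count = int(np.searchsorted(ratio, keep)) + 1
    return min(count, len(s))

# Usage with synthetic data: uncorrelated features barely compress,
# while features built from two hidden factors compress strongly.
rng = np.random.default_rng(0)
uncorrelated = rng.normal(size=(500, 10))
redundant = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 10))
print(components_for_variance(uncorrelated))
print(components_for_variance(redundant))
```

A feature set that needs nearly as many components as it has variables, as the polygonal features do in Table 1, behaves like the uncorrelated case.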

Table 1: PCA application in database

PCA tests                                  Bitmap    Polygonal Projection
Original size                              361       152
Size using PCA with 1% lost information    58        126
Reduction %                                83.9%     19.2%

Figures 5 and 6 show an example of polygonal projection using initial projection energy equal to “1”. White pixels in the image will let the projection pass and black pixels will block projection. The images in figures 5 and 6 are not shown inverted or with squared elements, but projections are taken using these modifications. In figure 5a, the first set of four projections is taken. Each plot has 19 points that indicate each projection distance. Figure 5b shows the projections taken by splitting the original face horizontally and vertically.

In figures 6a and 6b, the same polygonal distances are extracted from a non-face image. It is possible to compare the projections from face and non-face images and observe that figures 5a and 5b show face characteristics such as symmetry and the position of eyes and mouth. On the other hand, there is no symmetry in the plots of figures 6a and 6b, which facilitates the identification of face patterns.

Figure 5a: Example of polygonal projection in a face

image. First set of four projections


Figure 5b: Example of polygonal projection splitting the

face image horizontally and vertically

Figure 6a: Example of polygonal projection in a non-face

image. First set of four projections

Figure 6b: Example of polygonal projection splitting the

non-face image horizontally and vertically

4 RESULTS AND CONCLUSION

To compare the performance of the polygonal projection proposed in this paper against bitmap feature extraction, a set of thirty-two neural network architectures was used, varying the number of hidden layers and the activation functions (Haykin, 1994). Every neural network was trained using backpropagation with adaptive learning rate and momentum (Haykin, 1994).

Table 2 shows the mean results of the 32 neural

networks on the test database. This database is not

used to adjust weights during neural network

training.

According to table 2, the use of polygonal projection improves the correct face recognition rate by approximately 18 percentage points (from 47.97% to 65.84%), showing that the method is capable of extracting features that are relevant to classification, improving generalization.

Table 2: Mean results of 32 different ANN (correct recognition rate)

Results                 Faces             Non-faces
Bitmap                  47.97%            96.83%
Polygonal Projection    65.84%            92.07%
Difference (%)          Gain of 17.87%    Loss of 4.76%

However, the correct non-face recognition rate decreases by about 5 percentage points with the new method. This small reduction should not be considered a problem, because the proposed face detection system can eliminate elements wrongly recognized as faces (Moutinho and Thomé, 2004).

As the results in table 2 show, polygonal projection improves face detection with a small reduction in non-face rejection. As the database used in this test contains rotated faces and faces with eyes in different positions, it is possible to conclude that polygonal projection provides some normalization of these non-standard faces, increasing neural network generalization.

REFERENCES

Rowley, Henry A.; Kanade, Takeo, 1999. Neural network-based

face detection. ISBN 0-599-52020-5.

Moutinho, A. M.; Thomé, A. C. G.; Biondi, L. B. C.;

Coelho H. G.; Meza L. A., 2004. Face pattern

detection. Congress and author’s name removed for

blind evaluation purposes.

Haykin, Simon, 1994. Neural Networks: A

Comprehensive Foundation, Prentice Hall PTR, ISBN

0023527617.

Jain, A. K., 1989. Fundamentals of digital image

processing, Prentice-Hall Inc. ISBN 0-13-336165-9.

Gonzalez, Rafael C.; Woods, Richard E., 1992. Digital Image

Processing. Addison-Wesley Longman

Publishing Co., Inc. ISBN 0201508036.

Zurada, J. M., 1999. Introduction to Artificial Neural

Systems. PWS Publishing Co., ISBN 053495460X.

Viana K. G.; Rodrigues J. R.; YYY, C. G. A., 2000.

Extração de Características para o Reconhecimento de

Dígitos. Congress and author’s name removed for

blind evaluation purposes.
