2.1 Face Detection
Finding and tracking a face is based on the wide FOV stereo camera. The face tracking system utilises the OpenCV implementation of the Viola and Jones face detector (Viola and Jones, 2004), which consists of groups of weak classifiers with high detection rates and low rejection rates. Each weak classifier has a correct detection rate only just above chance. Several groups of weak classifiers are then combined to form a cascade. As soon as any stage rejects a candidate, the process exits (the candidate is not in the class). Only when all stages in the cascade of weak classifiers have responded positively is a face detection declared (Viola and Jones, 2004).
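As a purely conceptual sketch of this early-rejection idea (the stage structure and names below are illustrative and not the OpenCV internals), the cascade can be pictured as follows:

```python
# Conceptual sketch of cascade evaluation: each stage is a group of weak
# classifiers whose combined score must exceed the stage threshold, otherwise
# the candidate window is rejected immediately. Illustrative only.

def evaluate_cascade(window, stages):
    """Return True only if every stage accepts the candidate window.

    `stages` is a list of (weak_classifiers, threshold) pairs, where each
    weak classifier maps an image window to a score based on one feature.
    """
    for weak_classifiers, threshold in stages:
        stage_score = sum(clf(window) for clf in weak_classifiers)
        if stage_score < threshold:
            return False   # early rejection: candidate is not a face
    return True            # all stages passed: a face detection is declared
```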
The algorithm is trained for frontal faces as well as left and right profiles. In combination, the system is capable of detecting the face of a panning head. At this point only one face can be tracked at a time, and the system cannot handle head roll or faces looking up or down.
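A minimal sketch of how frontal and profile detection might be combined using OpenCV's stock Haar cascades is shown below; the cascade files and parameters are assumptions for illustration, not necessarily those used in the described system.

```python
import cv2

# Stock OpenCV Haar cascades; the profile cascade covers one side, and the
# other side is handled by mirroring the image. Whether the described system
# uses these exact cascade files is an assumption.
frontal = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
profile = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_profileface.xml")

def detect_face(gray):
    """Return one face bounding box (x, y, w, h) or None, trying frontal,
    one profile, and (via mirroring) the other profile in turn."""
    faces = frontal.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces):
        return tuple(faces[0])
    faces = profile.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces):
        return tuple(faces[0])
    flipped = cv2.flip(gray, 1)                      # mirror for the other profile
    faces = profile.detectMultiScale(flipped, scaleFactor=1.1, minNeighbors=5)
    if len(faces):
        x, y, w, h = faces[0]
        return (gray.shape[1] - x - w, y, w, h)      # map box back to original image
    return None
```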
2.2 The Tracking System
Whenever a face has been detected by the stereo camera with the wide FOV, the centre point of the detected face is geometrically reconstructed in 3D space. The reconstructed centre point is then used as a reference point for rotating and repositioning the PTU and hence the cameras. The positioning system utilises a geometrical model of the physical tracking system, including the two stereo cameras, together with the reference point, to define the movements required to reposition the high resolution cameras so that they are directed towards the reconstructed centre point of the tracked face. Hence, when a face is detected by the wide FOV cameras, the high resolution cameras are directed towards the tracked face. The PTU is controlled by constant speed movement defined by the error $E_{difference}$ between the current orientation $P_{current}$ and the desired orientation $P_{desired}$ of the high resolution cameras, as indicated in equation 1. The position of the cameras is updated with a frequency of approximately 5 Hz. Both the pan and tilt dimensions are included in the current position, the desired position and the error difference.
$$P_{desired} = P_{current} + E_{difference} \qquad (1)$$
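The repositioning step can be sketched roughly as follows; the projection matrices, the simplified PTU geometry and the constant-speed command are assumptions for illustration rather than the actual geometrical model of the rig.

```python
import numpy as np
import cv2

def face_centre_3d(P1, P2, pt_left, pt_right):
    """Triangulate the detected face centre from the wide-FOV stereo pair.
    P1 and P2 are the 3x4 projection matrices of the calibrated cameras;
    pt_left and pt_right are the 2D face centres in pixels."""
    X = cv2.triangulatePoints(P1, P2,
                              np.float32(pt_left).reshape(2, 1),
                              np.float32(pt_right).reshape(2, 1))
    return (X[:3] / X[3]).ravel()                 # homogeneous -> (x, y, z)

def desired_pan_tilt(point, ptu_origin):
    """Pan/tilt orientation P_desired that points the high resolution cameras
    at `point`, using a simplified model with the PTU rotation centre at
    `ptu_origin` (the real geometric model of the rig is more detailed)."""
    x, y, z = point - ptu_origin
    pan = np.arctan2(x, z)
    tilt = np.arctan2(y, np.hypot(x, z))
    return np.array([pan, tilt])

def control_step(p_current, p_desired):
    """Eq. (1): E_difference = P_desired - P_current for both pan and tilt.
    The PTU is then driven at constant speed in the direction of this error;
    the step is repeated at roughly 5 Hz with a fresh face detection."""
    e_difference = p_desired - p_current
    return np.sign(e_difference)                  # constant-speed command per axis
```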
A geometric interpretation of the tracking is shown in Fig. 3. Only the pan dimension is shown, but the principle is the same for the tilt direction.
Figure 3: Principle of the active vision system during face tracking. A face is detected by the wide FOV cameras (marked in red) shown in a), and the PTU rotates and repositions the high resolution cameras (marked in blue), directing them towards the centre of the tracked face as shown in b).
2.3 3D Facial Processing
Face recognition, by means of matching a given face to a database of faces, is a non-intrusive biometric method that dates back several decades. In recent years, there has been renewed interest in developing new methods for automatic face recognition, fuelled by advances in computer vision techniques, computer design, sensor design, and face recognition systems. 3D face recognition algorithms identify faces from the 3D shape of a person's face. Face recognition systems not based on 3D information are affected by changes in lighting (illumination) and pose of the face, which reduce performance. Because the shape of a face is not affected by changes in lighting or pose, 3D face recognition has the potential to improve performance under these conditions (Jafri and Arabnia, 2009).
In our system, we perform the following steps. Firstly, a 3D face model must be obtained; two common approaches are stereo imaging and the use of structured light sensors, e.g. the Microsoft Kinect. Once the 3D model is obtained, invariant measures can be extracted. One approach described in the literature (Mata et al., 2007) computes geodesic distances between sampled points on the facial surface. Based on these distances, the points are then flattened into a low-dimensional Euclidean space, providing a bending-invariant (or isometric-invariant) signature surface that is robust to certain facial expressions. Finally, the signature is compared with a database of signatures.
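A rough sketch of such a bending-invariant signature, approximating geodesic distances by shortest paths along mesh edges and flattening them with classical multidimensional scaling, could look as follows; the mesh representation and the scipy-based approximation are assumptions for illustration, not the exact method of Mata et al.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra

def geodesic_distances(vertices, edges):
    """Approximate pairwise geodesic distances on the facial surface by
    shortest paths along mesh edges (vertices: Nx3 array, edges: Mx2 index pairs)."""
    i, j = edges[:, 0], edges[:, 1]
    w = np.linalg.norm(vertices[i] - vertices[j], axis=1)
    n = len(vertices)
    graph = csr_matrix((np.r_[w, w], (np.r_[i, j], np.r_[j, i])), shape=(n, n))
    return dijkstra(graph, directed=False)

def bending_invariant_signature(D, dim=3):
    """Classical MDS: embed the sampled points in a low-dimensional Euclidean
    space so that Euclidean distances approximate the geodesic ones. The
    resulting canonical form changes little under near-isometric deformations
    such as certain facial expressions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                   # double-centred squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:dim]       # keep the largest eigenvalues
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))

# The resulting signature could then be compared with stored database
# signatures, e.g. after rigid alignment; the matching details are not
# specified in this section.
```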
The high resolution cameras utilised for the 3D reconstruction and recognition part acquire images in continuous mode. For each pair of images, a face detector – again the OpenCV implementation of the Viola and Jones face detector (Viola and Jones, 2004) – checks whether a face is present in the image. If a face is detected in both images of the stereo camera, the position of the face is compared with the face