them. The robust PCA (RPCA) (Wright et al., 2009)
is a powerful tool to get rid of such errors and retrieve cleaner images that are potentially better suited for computer vision applications such as face recognition.
In this paper, we propose an incremental method
for robust face recognition under various conditions
based on RPCA. The proposed method handles both
misalignment and occlusion problems on face images.
To improve the recognition process, we rely on RPCA (Wright et al., 2009) to eliminate corruptions and occlusions in the original face images. Moreover, the incremental aspect of our face recognition method addresses the memory constraints and computational cost of large data sets. To measure the similar-
ity between a query image and a sequence of images
of one person, we define a new similarity metric. To
evaluate the performance of our method, experiments
on the AR (Martinez and Benavente, 1998), ORL
(Samaria and Harter, 1994), PIE (Sim et al., 2002),
YALE (Belhumeur et al., 1997) and FERET (Phillips
et al., 1998) databases, as well as a comparison with other incremental PCA methods, namely incremental singular value decomposition (SVD) (Hall et al., 2002), add block SVD (Brand, 2006) and candid covariance-free incremental PCA (Weng et al., 2003), are conducted.
We also compare our method to a face recognition
method based on batch robust PCA, denoted by face
recognition RPCA (FRPCA) (Wang and Xie, 2010).
This paper is organized as follows. In Section 2,
we introduce the RPCA method, incremental RPCA
(IRPCA) and our face recognition method, denoted by new incremental RPCA (NIRPCA). Finally, in
Section 3, we present our experimental results.
2 FACE RECOGNITION BASED
ON IRPCA
2.1 Robust Image Alignment by Sparse
and Low-rank Decomposition
Peng et al. (Peng et al., 2010) proposed robust align-
ment by sparse and low-rank decomposition for lin-
early correlated images (RASL). It is a scalable op-
timization technique for batch linearly correlated im-
age alignment. One of its objectives is to robustly align a dataset of human faces based on the fact that well-aligned faces exhibit an approximately low-rank structure up to some sparse corruptions. Even perfectly aligned images may not be identical, but they at least lie near a low-dimensional subspace (Basri and Jacobs, 2003). To the best of our knowledge, RASL
is the first method that uses a trade-off between rank
minimization and alignment of image data. Hence,
the idea is to search for a set of transformations τ such
that the rank of the transformed images becomes as
small as possible and at the same time the sparse er-
rors are compensated. The applied transformation is generally a 2D affine transform, which implicitly assumes that a person's face lies approximately on a plane in 3D space.
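As an illustration of this 2D affine model, an image can be warped by inverse-mapping each output pixel through the transform. The sketch below is a minimal NumPy version; the helper name `affine_warp` is ours, and nearest-neighbour sampling is used purely for simplicity (a real alignment pipeline would interpolate):

```python
import numpy as np

def affine_warp(img, A, t):
    """Warp a grayscale image under the 2D affine map x -> A @ x + t,
    via inverse mapping with nearest-neighbour sampling (illustrative)."""
    h, w = img.shape
    out = np.zeros_like(img)
    Ainv = np.linalg.inv(A)
    ys, xs = np.mgrid[0:h, 0:w]                     # output pixel grid
    coords = np.stack([xs.ravel(), ys.ravel()])     # shape (2, h*w)
    src = Ainv @ (coords - np.asarray(t, float).reshape(2, 1))  # inverse map
    sx = np.rint(src[0]).astype(int)
    sy = np.rint(src[1]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out.ravel()[valid] = img[sy[valid], sx[valid]]  # sample valid sources
    return out
```

The search over τ in RASL optimizes the six parameters of A and t so that the warped images become as low-rank as possible.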
2.2 Incremental Robust Principal
Component Analysis (IRPCA)
The RPCA algorithm aims to recover the low-rank matrix A from the corrupted observations D = A + E, where the error matrix E is unknown and its entries can be arbitrarily large but are assumed to be sparse.
More specifically, in face recognition, E is a sparse
matrix because it is assumed that only a small fraction
of image pixels are corrupted by large errors (e.g., oc-
clusions). Hence, being able to correctly identify and recover the low-rank structure A is very useful for many computer vision applications such as face recognition.
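As a sketch of how such a decomposition can be computed, the following minimal NumPy implementation solves the principal component pursuit formulation (nuclear norm plus l1 penalty, subject to D = A + E) with a basic augmented Lagrangian scheme. The default λ = 1/√max(m,n) and μ follow common choices from the RPCA literature; the function names are our own, not the authors' code:

```python
import numpy as np

def svt(X, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def shrink(X, tau):
    """Soft thresholding: proximal operator of the l1 norm."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca(D, lam=None, mu=None, tol=1e-7, max_iter=500):
    """Split D into low-rank A and sparse E (D = A + E) via an
    augmented Lagrangian iteration -- a sketch, not the paper's solver."""
    if lam is None:
        lam = 1.0 / np.sqrt(max(D.shape))
    if mu is None:
        mu = 0.25 * D.size / (np.abs(D).sum() + 1e-12)
    Y = np.zeros_like(D)                        # Lagrange multipliers
    E = np.zeros_like(D)
    for _ in range(max_iter):
        A = svt(D - E + Y / mu, 1.0 / mu)       # low-rank update
        E = shrink(D - A + Y / mu, lam / mu)    # sparse update
        R = D - A - E                           # feasibility residual
        Y = Y + mu * R
        if np.linalg.norm(R) <= tol * np.linalg.norm(D):
            break
    return A, E
```

On a stack of face images, A then holds the clean low-rank faces and E the occlusions and corruptions.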
We assume that we have m subjects and each one
has n face images. Although RASL can give a very
accurate alignment for faces (Peng et al., 2010), it
is not applicable when the total number of images
m × n, denoted by l, is very large. Wu et al. (Wu et al., 2011) proposed an extension of RASL from l to L images, where L >> l, by reformulating the problem using a "one-by-one" alignment approach. This incremental alignment can be summarized in three steps.
First, l frames are selected and aligned with the batch RASL method, producing a low-rank summary A∗. In the second step, the (l+1)-th image is aligned with A∗, which contains the information of the previously aligned l images. Finally, the second step is repeated for the remaining images, regardless of the size of the data set.
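The three-step scheme can be sketched as follows, where `batch_align` and `align_one` are hypothetical stand-ins for the batch RASL solver and the single-image alignment step (the real solvers are iterative optimizations, not shown here):

```python
import numpy as np

def incremental_align(frames, batch_align, align_one, l):
    """One-by-one alignment sketch: batch-align the first l frames into
    a low-rank summary A*, then align each remaining frame against A*."""
    A_star = batch_align(frames[:l])      # step 1: batch RASL summary A*
    aligned = list(frames[:l])
    for f in frames[l:]:                  # steps 2-3: incremental part
        aligned.append(align_one(f, A_star))
    return aligned
```

Because each new frame is aligned only against the fixed summary A∗, memory usage stays bounded no matter how large L grows.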
We denote by I_i^j, A_i^j and E_i^j the corrupted observed face image, the original face image and the error of the j-th image of the i-th subject, respectively. Then we have I_i^j = A_i^j + E_i^j, where i denotes the subject and j its corresponding image, such that i = 1, . . . , m and j = 1, . . . , n. Let:
vec : R^(w×h) → R^((w×h)×1),    (1)

be a function which transforms a w × h image matrix into a (w × h) × 1 vector by stacking its columns, so that vec(I_i^j) = vec(A_i^j) + vec(E_i^j). Since each of the m subjects has n images, we define for the i-th subject:

D_i := [vec(I_i^1) | . . . | vec(I_i^n)] = A_i + E_i    (2)
VISAPP 2013 - International Conference on Computer Vision Theory and Applications
550