The rest of this paper is organized as follows.
Section 2 reviews related work on local descriptors.
Section 3 presents the LE descriptor. Section 4
discusses the disadvantage of the LE descriptor and
presents the main details of our approach. Section 5
reports extensive experiments and their results.
Finally, Section 6 concludes our work and outlines
possible future work.
2 RELATED WORK
Several local appearance descriptors have been used
for face recognition. Among them, LBP and its
extensions are the most widely used, so we restrict
our discussion to LBP based descriptors. (Tan and
Triggs, 2010) proposed the local ternary pattern (LTP).
Instead of directly comparing each neighbour pixel’s
value with the center value, LTP compares them using
a tolerance threshold around the center value. This
removes some noise in the image, but the improvement
is very limited. (Ahonen and Pietikainen, 2007)
proposed high-order derivative LBP, which calculates
the LBP value on the previously LBP coded image.
The literature reports that the third derivative LBP
performs best, but its performance is not stable: on
some databases it performs well, while on others the
improvement is not significant. (Liao and Lei, 2007)
proposed Multi Scale LBP (MS-LBP), which compares
the mean gray values of neighbouring blocks instead
of the gray values of individual pixels. It has been
shown to perform better on classification problems,
but for face recognition its performance is worse than
that of the original LBP. (Wolf and Taigman, 2008)
proposed patch based LBP, specifically three patch
LBP and four patch LBP. Compared with LBP, these
are efficient to compute, but the performance
improvement is not significant and in some cases the
performance is even slightly worse. (Marr and Hildreth, 2005)
proposed soft LBP. Instead of assigning each pixel a
single binary value, it assigns each pixel a range of
values, each with some probability. Because pixels
are no longer compared directly, this method is not
robust to illumination, and because every pixel
carries a probability for each of its possible values,
it is computationally expensive.
(Zhang and Zhang, 2005) and (Xie and Gao, 2008)
proposed computing LBP on Gabor magnitude images.
This approach shows better performance and has been
widely used, but computing the Gabor images requires
convolving the image with Gabor filters, which is
expensive, so the method is not suitable for face
recognition applications that require high speed.
(Solar and Correa, 2009) proposed Decision Tree based
LBP, which combines LBP with supervised learning. It
has shown good performance, but the performance is
not stable, and the feature size is too large for
practical use.
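To make the difference between LBP and LTP discussed in this section concrete, the following is a minimal sketch for a single 3×3 neighbourhood. It is an illustration only, not the implementation used in any of the cited works. The function names, the clockwise bit order, and the sample patch are assumptions chosen for this example; the split of the ternary pattern into an "upper" and a "lower" binary code follows the usual LTP formulation rather than anything stated in the text above.

```python
# Offsets of the 8 neighbours, clockwise from the top-left pixel
# (the bit order is an arbitrary choice for this sketch).
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, y, x):
    """Basic 8-neighbour LBP at (y, x): bit = 1 if neighbour >= centre."""
    centre = img[y][x]
    code = 0
    for bit, (dy, dx) in enumerate(NEIGHBOURS):
        if img[y + dy][x + dx] >= centre:
            code |= 1 << bit
    return code


def ltp_codes(img, y, x, t):
    """LTP at (y, x) with threshold t, returned as two binary codes.

    A neighbour counts as +1 if it is at least centre + t, as -1 if it
    is at most centre - t, and as 0 otherwise. The +1 positions form
    the "upper" code and the -1 positions form the "lower" code.
    """
    centre = img[y][x]
    upper = lower = 0
    for bit, (dy, dx) in enumerate(NEIGHBOURS):
        value = img[y + dy][x + dx]
        if value >= centre + t:
            upper |= 1 << bit
        elif value <= centre - t:
            lower |= 1 << bit
    return upper, lower


# Hypothetical patch: small fluctuations around 50 plus two strong pixels.
patch = [[52, 50, 49],
         [51, 50, 90],
         [10, 50, 48]]

print(lbp_code(patch, 1, 1))      # 171: small fluctuations also set bits
print(ltp_codes(patch, 1, 1, 5))  # (8, 64): only the two strong pixels remain
```

With a threshold of 5, the neighbours that differ from the center by only one or two gray levels no longer contribute to the code, which is the noise suppression effect described above.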
3 LEARNING BASED DESCRIPTOR
The learning based descriptor is a novel feature
extraction method for face recognition. Compared with
Local Binary Pattern, it incorporates unsupervised
learning into feature extraction and is more robust
to pose and facial expression variation. The steps of
the LE descriptor can be briefly summarized as follows:
• DoG Filter
To remove noise and illumination differences in the
image, each face image is first fed to the DoG filter
(Hartigan and Wong, 1979). All subsequent processing
is performed on the filtered image.
• Sampling and Normalization
At each pixel, we sample its neighboring pixels in a
ring based pattern to form a low level feature vector.
We sample r×8 pixels at even intervals on the ring of
radius r. Each sampling pattern has three parameters:
the number of rings, the ring radius, and the number
of samples on each ring. After sampling, we normalize
the sampled feature vector to unit length under the
L1 norm.
• Learning based Encoding
An encoding method is applied to encode the normalized
feature vector into discrete codes. Unlike many
handcrafted encoders, the encoder in the LE approach
is trained specifically for faces, in an unsupervised
manner, from a set of training face images. The
original LE paper recommends three unsupervised
learning methods: K-means (McNames, 2001), PCA tree
(Dasgupta and Freund, 2007), and random projection
tree (Bentley, 1975). After encoding, the input image
is turned into a coded image.
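The sampling, normalization, and encoding steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the LE authors' implementation: the input is assumed to be an already DoG-filtered image stored as a list of rows, samples are taken with nearest-pixel lookup and border clamping (the text does not specify interpolation), K-means is the plain Lloyd iteration, and all function names and parameter values are hypothetical.

```python
import math
import random


def ring_sample(img, y, x, radii):
    """Sample r*8 points at even angular steps on each ring of radius r.

    Nearest-pixel lookup with border clamping is an assumption made to
    keep the sketch short.
    """
    h, w = len(img), len(img[0])
    samples = []
    for r in radii:
        n = r * 8
        for k in range(n):
            angle = 2.0 * math.pi * k / n
            sy = min(max(int(round(y + r * math.sin(angle))), 0), h - 1)
            sx = min(max(int(round(x + r * math.cos(angle))), 0), w - 1)
            samples.append(float(img[sy][sx]))
    return samples


def l1_normalize(vec):
    """Scale the vector so that its absolute values sum to 1 (L1 norm)."""
    total = sum(abs(v) for v in vec)
    return [v / total for v in vec] if total > 0 else list(vec)


def nearest(vec, centroids):
    """Index of the closest centroid (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(vec, c)) for c in centroids]
    return dists.index(min(dists))


def kmeans(vectors, k, iters=10, seed=0):
    """Plain Lloyd's K-means; the learned centroids act as the codebook."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        for i, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster is empty
                centroids[i] = [sum(col) / len(g) for col in zip(*g)]
    return centroids


def encode_image(img, centroids, radii):
    """Replace every pixel by the code of its normalized ring feature."""
    return [[nearest(l1_normalize(ring_sample(img, y, x, radii)), centroids)
             for x in range(len(img[0]))] for y in range(len(img))]


# Usage on a random stand-in for a filtered training image.
rng = random.Random(1)
image = [[rng.randint(0, 255) for _ in range(12)] for _ in range(12)]
features = [l1_normalize(ring_sample(image, y, x, [1, 2]))
            for y in range(12) for x in range(12)]
codebook = kmeans(features, k=4)
coded = encode_image(image, codebook, [1, 2])  # 12x12 grid of codes in 0..3
```

In a real system the codebook would be learned once from features pooled over many training faces and then reused to encode every new image; the tree-based encoders mentioned above would replace `kmeans` and `nearest` while leaving the sampling and normalization unchanged.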
• Histogram Representation
After the image has been encoded, the coded image is
divided into a grid of patches, following the method
described in Ahonen et al.’s work. A histogram of the
LE codes is
computed in each patch and the patch histogram is
SpeedUpLearningbasedDescriptorforFaceVerification