tion. IEEE Trans. Information Forensics and Security,
7(6):1707–1716.
Liao, S., Yi, D., Lei, Z., Qin, R., and Li, S. (2009). Het-
erogeneous face recognition from local structures of
normalized appearance. In Proc. International Con-
ference on Advances in Biometrics, pages 209–218.
Springer-Verlag.
Liu, Q., Tang, X., Jin, H., Lu, H., and Ma, S. (2005). A non-
linear approach for face sketch synthesis and recogni-
tion. In Proc. IEEE Computer Society Conference on
Computer Vision and Pattern Recognition, volume 1,
pages 1005–1010. IEEE.
Lowe, D. G. (1999). Object recognition from local scale-
invariant features. In Proc. IEEE International Con-
ference on Computer Vision, volume 2, pages 1150–
1157. IEEE.
Sharma, A. and Jacobs, D. W. (2011). Bypassing synthesis:
PLS for face recognition with pose, low-resolution
and sketch. In Proc. IEEE Conference on Com-
puter Vision and Pattern Recognition, pages 593–600.
IEEE.
Tang, X. and Wang, X. (2003). Face sketch synthesis and
recognition. In Proc. 9th IEEE International Confer-
ence on Computer Vision, pages 687–694. IEEE.
Tao, D., Li, X., Wu, X., and Maybank, S. J. (2007). Gen-
eral tensor discriminant analysis and Gabor features
for gait recognition. IEEE Trans. Pattern Analysis and
Machine Intelligence, 29(10):1700–1715.
Tao, D., Li, X., Wu, X., and Maybank, S. J. (2009). Geo-
metric mean for subspace selection. IEEE Trans. Pat-
tern Analysis and Machine Intelligence, 31(2):260–
274.
Turk, M. A. and Pentland, A. P. (1991). Face recognition
using Eigenfaces. In Proc. IEEE Computer Society
Conference on Computer Vision and Pattern Recogni-
tion, pages 586–591. IEEE.
Wang, R., Shan, S., Chen, X., Dai, Q., and Gao, W.
(2012a). Manifold–manifold distance and its applica-
tion to face recognition with image sets. IEEE Trans.
Image Processing, 21(10):4466–4479.
Wang, S., Zhang, D., Liang, Y., and Pan, Q. (2012b). Semi-
coupled dictionary learning with applications to im-
age super-resolution and photo-sketch synthesis. In
Proc. IEEE Conference on Computer Vision and Pat-
tern Recognition, pages 2216–2223. IEEE.
Wang, X. and Tang, X. (2009). Face photo-sketch synthesis
and recognition. IEEE Trans. Pattern Analysis and
Machine Intelligence, 31(11):1955–1967.
Wong, Y., Chen, S., Mau, S., Sanderson, C., and Lovell,
B. C. (2011). Patch-based probabilistic image qual-
ity assessment for face selection and improved video-
based face recognition. In Proc. IEEE Computer
Society Conference on Computer Vision and Pattern
Recognition Workshops, pages 74–81. IEEE.
Yan, S., Xu, D., Zhang, B., Zhang, H.-J., Yang, Q., and Lin,
S. (2007). Graph embedding and extensions: a general
framework for dimensionality reduction. IEEE Trans.
Pattern Analysis and Machine Intelligence, 29(1):40–
51.
Yang, J. and Liu, C. (2007). Horizontal and vertical
2DPCA-based discriminant analysis for face verifica-
tion on a large-scale database. IEEE Trans. Informa-
tion Forensics and Security, 2(4):781–792.
Yang, M., Zhu, P., Van Gool, L., and Zhang, L. (2013).
Face recognition based on regularized nearest points
between image sets. In Proc. 10th IEEE International
Conference and Workshops on Automatic Face and
Gesture Recognition, pages 1–7. IEEE.
Yi, D., Liu, R., Chu, R., Lei, Z., and Li, S. Z. (2007). Face
matching between near infrared and visible light im-
ages. In Proc. International Conference on Advances
in Biometrics, pages 523–530. Springer-Verlag.
Zhang, W., Wang, X., and Tang, X. (2011). Coupled
information-theoretic encoding for face photo-sketch
recognition. In Proc. IEEE Conference on Com-
puter Vision and Pattern Recognition, pages 513–520.
IEEE.
APPENDIX
This appendix derives the solutions of the two sub-
problems defined in Section 3.
As in Eq. (19), $Y$ is given and $\{W, b\}$ is to be obtained by minimizing $J_1(W, b; X, Y)$. Since there is no analytical solution, we use the gradient descent (GD) method to minimize the expression. We compute the derivatives of $J_1$ with respect to $W$ and $b$ as follows (taking $W_S$ and $b_S$ as an example):
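$$\frac{\partial J_1}{\partial W_S} = -\frac{2}{N_S} X_S \left(Y - W_S^T X_S - b_S \mathbf{1}_{N_S}^T\right)^T + 2\alpha_S W_S^T X_S L_S X_S^T + 2\beta W_S^T$$

$$\frac{\partial J_1}{\partial b_S} = -\frac{2}{N_S} \left(Y - W_S^T X_S - b_S \mathbf{1}_{N_S}^T\right) \mathbf{1}_{N_S}$$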
Thus, the parameters are updated with the above gradients, using step size $\gamma$, until convergence:

$$W_S = W_S - \gamma \frac{\partial J_1}{\partial W_S}, \qquad b_S = b_S - \gamma \frac{\partial J_1}{\partial b_S}$$
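The update rule above can be sketched in NumPy. This is only an illustration under stated assumptions, not the authors' implementation: Eq. (19) is not reproduced in this appendix, so the sketch assumes the $S$-modality part of $J_1$ has the form $\frac{1}{N_S}\|Y - W_S^T X_S - b_S \mathbf{1}^T\|_F^2 + \alpha_S \mathrm{tr}(W_S^T X_S L_S X_S^T W_S) + \beta \|W_S\|_F^2$ with a symmetric $L_S$, which is the form consistent with the gradients given. The function names `j1_value`, `j1_gradients` and `gradient_descent`, the array shapes, and the default step size and stopping tolerance are all hypothetical. The two regularization terms are written in the layout of $W_S$ (that is, $X_S L_S X_S^T W_S$ and $W_S$), so that every term of the gradient has the same shape as $W_S$.

```python
import numpy as np

def j1_value(W, b, X, Y, L, alpha, beta):
    """Assumed S-modality part of J_1 (Eq. (19) itself is not shown here).
    Shapes (assumed): X is d x N, W is d x k, b has length k, Y is k x N,
    L is a symmetric N x N matrix."""
    N = X.shape[1]
    R = Y - W.T @ X - b[:, None]          # residual Y - W^T X - b 1^T
    fit = (R ** 2).sum() / N
    smooth = alpha * np.trace(W.T @ X @ L @ X.T @ W)
    ridge = beta * (W ** 2).sum()
    return fit + smooth + ridge

def j1_gradients(W, b, X, Y, L, alpha, beta):
    """Gradients of the assumed objective with respect to W and b."""
    N = X.shape[1]
    R = Y - W.T @ X - b[:, None]
    grad_W = -(2.0 / N) * X @ R.T + 2.0 * alpha * X @ L @ X.T @ W + 2.0 * beta * W
    grad_b = -(2.0 / N) * R.sum(axis=1)   # R times the all-ones vector
    return grad_W, grad_b

def gradient_descent(W, b, X, Y, L, alpha, beta, gamma=1e-2, n_iter=500, tol=1e-8):
    """Plain gradient descent with fixed step size gamma (hypothetical defaults)."""
    for _ in range(n_iter):
        grad_W, grad_b = j1_gradients(W, b, X, Y, L, alpha, beta)
        W = W - gamma * grad_W
        b = b - gamma * grad_b
        # Stop once the largest gradient entry is below the tolerance.
        if max(np.abs(grad_W).max(), np.abs(grad_b).max()) < tol:
            break
    return W, b
```

A fixed step size is the simplest choice here; since the assumed objective is a convex quadratic in $\{W_S, b_S\}$ (for positive semidefinite $L_S$), a sufficiently small $\gamma$ decreases the objective at every iteration. The same routine would apply to $W_V$ and $b_V$ with the corresponding target.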
As in Eq. (20), $\{W, b\}$ is given and $Y$ is to be obtained by minimizing $J_2(Y; X, W, b)$. Consider the derivative of $J_2$ with respect to $Y$:
$$\frac{\partial J_2}{\partial Y} = \frac{2}{N_S}\left(Y - W_S^T X_S - b_S \mathbf{1}_{N_S}^T\right) + \frac{2}{N_V}\left(YU - W_V^T X_V - b_V \mathbf{1}_{N_V}^T\right) U^T$$
Setting this derivative to zero yields
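$$Y = \left[\frac{1}{N_S}\left(W_S^T X_S + b_S \mathbf{1}_{N_S}^T\right) + \frac{1}{N_V}\left(W_V^T X_V + b_V \mathbf{1}_{N_V}^T\right) U^T\right] \times \left(\frac{1}{N_S} I + \frac{1}{N_V} U U^T\right)^{-1}$$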
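The closed-form solution for $Y$ can be sketched in NumPy as follows. This is a minimal illustration, not the authors' code: the function name `solve_Y` and the array shapes are assumptions (in particular, $U$ is taken to be $N_S \times N_V$ so that $YU$ matches the shape of $W_V^T X_V$). Rather than forming the inverse explicitly, the sketch solves the equivalent linear system $Y M = B$ with $M = \frac{1}{N_S} I + \frac{1}{N_V} U U^T$, which is symmetric positive definite and therefore always invertible.

```python
import numpy as np

def solve_Y(W_S, b_S, X_S, W_V, b_V, X_V, U):
    """Closed-form minimizer of J_2 with respect to Y.
    Shapes (assumed): X_S is d_S x N_S, X_V is d_V x N_V, W_S is d_S x k,
    W_V is d_V x k, b_S and b_V have length k, U is N_S x N_V."""
    N_S = X_S.shape[1]
    N_V = X_V.shape[1]
    P_S = W_S.T @ X_S + b_S[:, None]      # W_S^T X_S + b_S 1^T, k x N_S
    P_V = W_V.T @ X_V + b_V[:, None]      # W_V^T X_V + b_V 1^T, k x N_V
    B = P_S / N_S + (P_V @ U.T) / N_V     # bracketed term, k x N_S
    M = np.eye(N_S) / N_S + (U @ U.T) / N_V
    # Y M = B is equivalent to M Y^T = B^T because M is symmetric.
    return np.linalg.solve(M, B.T).T
```

A quick sanity check is to substitute the returned $Y$ back into the derivative of $J_2$ and confirm that it vanishes up to numerical precision.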
VISAPP 2015 - International Conference on Computer Vision Theory and Applications