
Table 8: Comparison of the individual identification accuracy with and without the deviation module (DM).

Method        Accuracy
w/ DM (Ours)  0.633
w/o DM        0.830
4.5 Verification of Individual Feature Separation
To verify whether the deviation module effectively separates individual features, we compare the individual identification accuracy of the proposed method (Figure 2) with that of the comparative method (Figure 7). For individual identification, we use the Awake data of the 25 subjects who have data from two or more lecture sessions. For each subject, we randomly select one facial image from each of two different lecture sessions and use them as the Gallery (registered data) and the Probe (test data), respectively. We compare the feature vector obtained by inputting a Probe into the estimation model with the feature vectors obtained by inputting each subject's Gallery into the model, and identify the Probe as the subject whose Gallery feature vector is most similar. Here, the feature vector is V_state for the proposed method and V_face for the comparative method, and the similarity between feature vectors is measured by cosine similarity. Individual identification is performed for each Probe, and the percentage of correctly identified Probes is calculated. The trials were repeated 100 times, and the average identification accuracy is shown in Table 8. Since lower identification accuracy indicates that less individual information remains in the feature vectors, the proposed method's lower accuracy confirms that the deviation module effectively separates individual features.
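The identification protocol above (rank-1 matching of each Probe against all Gallery feature vectors by cosine similarity) can be sketched as follows; the function name and array layout are illustrative assumptions, not the authors' code:

```python
import numpy as np

def identify(probe_vecs, gallery_vecs):
    """Rank-1 identification: assign each Probe the Gallery subject
    whose feature vector has the highest cosine similarity.

    probe_vecs, gallery_vecs: (n_subjects, dim) arrays in which row i
    holds subject i's Probe / Gallery feature vector (V_state or V_face).
    Returns the fraction of Probes matched to the correct subject.
    """
    # L2-normalise the rows so a dot product equals cosine similarity.
    p = probe_vecs / np.linalg.norm(probe_vecs, axis=1, keepdims=True)
    g = gallery_vecs / np.linalg.norm(gallery_vecs, axis=1, keepdims=True)
    sim = p @ g.T                # (n_probes, n_gallery) cosine similarities
    pred = sim.argmax(axis=1)    # most similar Gallery subject per Probe
    return float((pred == np.arange(len(p))).mean())
```

In the experiment this evaluation is repeated 100 times with freshly sampled Gallery/Probe images and the accuracies are averaged.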
5 CONCLUSION
In facial expression recognition, individual differences in facial features and in how expressions are displayed can negatively affect recognition accuracy. This study proposes a method that uses a deviation module to reduce the impact of such individual differences, particularly for estimating ambiguous internal states, which are more challenging than the basic emotions. To handle subtle and ambiguous changes in expression, we also apply mixup for data augmentation. Evaluation on facial images of e-learning participants for drowsiness estimation showed that the deviation module improved estimation accuracy, confirming its effectiveness in handling individual differences. Applying mixup further improved accuracy, with the best results obtained when mixing state feature vectors of the same individual.
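The mixup augmentation referred to here (Zhang et al., 2018) forms convex combinations of sample pairs and their labels; in the best-performing configuration the pairs are state feature vectors of the same subject. A minimal sketch, with the function name and Beta parameter chosen for illustration:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """mixup (Zhang et al., 2018): interpolate two samples and their
    labels with a shared weight lambda ~ Beta(alpha, alpha)."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x_mix = lam * x1 + (1.0 - lam) * x2   # mixed (state feature) vector
    y_mix = lam * y1 + (1.0 - lam) * y2   # correspondingly mixed label
    return x_mix, y_mix
```

Because the same lambda weights both the features and the labels, the augmented sample stays consistent with the interpolation it represents.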
REFERENCES
Cao, Q., Shen, L., Xie, W., Parkhi, O. M., and Zisserman, A. (2018). VGGFace2: A dataset for recognising faces across pose and age. In 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2018), pages 67–74.

Friesen, W. V. (1973). Cultural differences in facial expressions in a social situation: An experimental test on the concept of display rules.

Kim, J.-H., Kim, B.-G., Roy, P. P., and Jeong, D.-M. (2019). Efficient facial expression recognition algorithm based on hierarchical deep neural network structure. IEEE Access, 7:41273–41285.

Liu, X., Vijaya Kumar, B., Jia, P., and You, J. (2019). Hard negative generation for identity-disentangled facial expression recognition. Pattern Recognition, 88(C):1–12.

Meng, Z., Liu, P., Cai, J., Han, S., and Tong, Y. (2017). Identity-aware convolutional neural network for facial expression recognition. In 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pages 558–565.

Schroff, F., Kalenichenko, D., and Philbin, J. (2015). FaceNet: A unified embedding for face recognition and clustering. In 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 815–823.

Szegedy, C., Ioffe, S., Vanhoucke, V., and Alemi, A. A. (2017). Inception-v4, Inception-ResNet and the impact of residual connections on learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, AAAI'17, pages 4278–4284. AAAI Press.

Xie, S., Hu, H., and Chen, Y. (2021). Facial expression recognition with two-branch disentangled generative adversarial network. IEEE Transactions on Circuits and Systems for Video Technology, 31(6):2359–2371.

Zhang, H., Cisse, M., Dauphin, Y. N., and Lopez-Paz, D. (2018). mixup: Beyond empirical risk minimization. In International Conference on Learning Representations.

Zhang, H., Jolfaei, A., and Alazab, M. (2019). A face emotion recognition method using convolutional neural network and image edge computing. IEEE Access, 7:159081–159089.

Zhang, K., Zhang, Z., Li, Z., and Qiao, Y. (2016). Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, 23(10):1499–1503.

Zhang, W., Ji, X., Chen, K., Ding, Y., and Fan, C. (2021). Learning a facial expression embedding disentangled from identity. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 6755–6764.
VISAPP 2025 - 20th International Conference on Computer Vision Theory and Applications