extract HR from facial video, such as motion-based methods, color-based methods, and approaches that employ thermal imaging. To the best of our knowledge, all previously proposed approaches have used the Viola-Jones algorithm for face detection. We have deviated from this approach and employed a Faster R-CNN for face detection. First, the Faster R-CNN used in the present study was able to detect faces without requiring a full frontal profile of the face, making it more robust. Second, depending on the nature of the background, the Viola-Jones algorithm may detect multiple ROIs, which can lead to confusion. This is not the case with our face detection algorithm, since it is independent of the background in the video.
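The multiple-ROI ambiguity can be illustrated with a small sketch. The helper below is hypothetical and not part of our pipeline; it shows one common heuristic for resolving multiple Viola-Jones candidates, namely keeping the largest detected box.

```python
# Hypothetical helper illustrating the multiple-ROI problem with
# Viola-Jones: a textured background can yield spurious detections
# alongside the true face. A common heuristic is to keep the largest
# candidate box. Boxes are (x, y, width, height) tuples.

def largest_roi(boxes):
    """Return the candidate box with the largest area, or None."""
    if not boxes:
        return None
    return max(boxes, key=lambda b: b[2] * b[3])

# Two small spurious boxes alongside the true face detection:
candidates = [(10, 12, 24, 24), (100, 80, 96, 96), (200, 5, 16, 16)]
face = largest_roi(candidates)
```

A detector that localizes the face independently of the background, as the Faster R-CNN does, avoids the need for such heuristics altogether.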
An important feature of our framework is its ability to recover feature points that may have been lost during extreme head rotations. This makes our model robust to extreme motion artifacts and able to measure HR even when the subject performs a complete rotation (360 degrees). Next, while some of these papers have reduced the problem of head movements, all of them degrade in performance in the presence of illumination interference. In our framework, we account for this artifact by using RLS adaptive filtering and the local region-based active contour method (LRBAC) to segment the background and remove the noise signal in the video arising from changes in illumination.
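The core RLS update can be sketched as follows. This is a generic single-tap recursive least squares filter in plain Python; the filter order and the choice of reference signal in our actual pipeline are assumptions not shown here. In an illumination-rectification setting, the input u would be a background (illumination) reference, the desired signal d the raw face-region trace, and the prediction error e the cleaned pulse signal.

```python
# Minimal sketch of a recursive least squares (RLS) adaptive filter
# with a single tap for clarity (the real filter order is an
# assumption). The filter learns a weight w so that w * u[n] tracks
# the desired signal d[n]; the a priori error e is what remains after
# the reference component has been cancelled.

def rls_filter(u, d, lam=0.99, delta=100.0):
    """Run single-tap RLS over inputs u and desired d.

    lam is the forgetting factor, delta initializes the inverse
    correlation estimate. Returns (final_weight, error_signal).
    """
    w = 0.0      # filter weight
    p = delta    # inverse correlation estimate
    errors = []
    for x, target in zip(u, d):
        k = (p * x) / (lam + x * p * x)   # gain
        e = target - w * x                # a priori error
        w += k * e                        # weight update
        p = (p - k * x * p) / lam         # inverse correlation update
        errors.append(e)
    return w, errors
```

With a noise-free desired signal d[n] = 2 u[n], the weight converges toward 2 within a few dozen samples, after which the error signal is close to zero.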
We also performed an experiment in which we monitored a subject watching specific scenes of a horror movie for a period of 5-10 minutes and extracted the subject's HR. The average HR over approximately every 20 s is plotted and compared with the ground-truth data. Upon comparison with the Polar H10 HR monitoring sensor, we found that our framework achieved a mean error percentage of 1.71%.
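The windowed comparison can be sketched as follows. The exact window boundaries and error definition used in our evaluation are assumptions here; the sketch uses non-overlapping 20 s windows and the mean absolute percentage error.

```python
# Sketch of a 20 s windowed comparison against a reference sensor.
# Assumptions: HR estimates arrive as (timestamp_seconds, bpm) pairs,
# and the error metric is the mean absolute percentage error; the
# paper's exact windowing and error definition may differ.

def window_average(samples, window_s=20.0):
    """Average (t, bpm) samples over consecutive windows of window_s."""
    if not samples:
        return []
    end = samples[-1][0]
    out, start = [], 0.0
    while start <= end:
        vals = [bpm for t, bpm in samples if start <= t < start + window_s]
        if vals:
            out.append(sum(vals) / len(vals))
        start += window_s
    return out

def mean_error_pct(estimates, reference):
    """Mean absolute percentage error between paired HR readings."""
    errs = [abs(e - r) / r for e, r in zip(estimates, reference)]
    return 100.0 * sum(errs) / len(errs)

est = window_average([(0, 72), (10, 74), (20, 80), (30, 78)])
err = mean_error_pct(est, [72, 80])  # reference sensor, same windows
```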
Moreover, we also implemented previously proposed approaches on our database of 18 videos and found that our framework outperformed the four previous methods; it also attained a root-mean-square error of 8.28% on the MAHNOB-HCI database.
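For reference, a percentage root-mean-square error of this kind can be computed as below. Whether the 8.28% figure normalizes by the ground-truth values exactly this way is an assumption; a plain RMSE in bpm simply drops the division by the reference.

```python
import math

# Root-mean-square error between estimated and reference HR series,
# expressed as a percentage of the reference values (assumed
# normalization; plain RMSE in bpm omits the division by r).

def rmse_pct(estimates, reference):
    sq = [((e - r) / r) ** 2 for e, r in zip(estimates, reference)]
    return 100.0 * math.sqrt(sum(sq) / len(sq))
```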
One principal source of error might be the difference in sampling rates between the HR sensor and our webcam: while our webcam had a sampling rate close to 30 Hz, the Polar H10 HR sensor had a higher sampling rate of 256 Hz. Also, in cases of extremely low illumination where the face is not visible, it would be useful to combine motion-based and region-based methods to better handle motion and illumination interference.
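Before comparing the two series, one simple way to reconcile the mismatched rates is to resample the 256 Hz sensor stream onto the webcam's ~30 Hz timestamps by linear interpolation. This is a generic sketch, not necessarily the alignment our evaluation used.

```python
# Resample an irregularly-rated reference signal onto target
# timestamps by linear interpolation, clamping outside the source
# range. Assumes times_src is sorted ascending.

def resample(times_src, values_src, times_dst):
    out, j = [], 0
    for t in times_dst:
        while j + 1 < len(times_src) and times_src[j + 1] < t:
            j += 1
        if t <= times_src[0]:
            out.append(values_src[0])
        elif t >= times_src[-1]:
            out.append(values_src[-1])
        else:
            t0, t1 = times_src[j], times_src[j + 1]
            v0, v1 = values_src[j], values_src[j + 1]
            out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out

# Example: downsample 1 s of a 256 Hz reference onto 30 Hz timestamps.
sensor_t = [i / 256.0 for i in range(256)]       # 1 s at 256 Hz
sensor_v = [60.0 + 10.0 * t for t in sensor_t]   # synthetic ramp
webcam_t = [i / 30.0 for i in range(30)]         # 1 s at 30 Hz
aligned = resample(sensor_t, sensor_v, webcam_t)
```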
A direction for future research would be to focus on integrating motion-based and color-based methods to estimate HR. The complementary nature of these methods would enable a more robust approach that simultaneously tackles motion and illumination artifacts in the video.
ICAART 2019 - 11th International Conference on Agents and Artificial Intelligence