ACKNOWLEDGEMENTS
The author would like to show his gratitude to Pro-
fessor Dietmar Saupe, Hanhe Lin, Vlad Hosu, Franz
Hahn, Hui Men, and Mohsen Jenadeleh for shar-
ing their knowledge in visual quality assessment and
helping to use KoNViD-1k. The author would like to
thank the anonymous reviewers for their helpful and
constructive comments that greatly contributed to im-
proving the final version of the paper.
REFERENCES
Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A.,
Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard,
M., et al. (2016). Tensorflow: A system for large-
scale machine learning. In 12th {USENIX} Sympo-
sium on Operating Systems Design and Implementa-
tion ({OSDI} 16), pages 265–283.
Ahn, S. and Lee, S. (2018a). Deep blind video quality as-
sessment based on temporal human perception. In
2018 25th IEEE International Conference on Image
Processing (ICIP), pages 619–623. IEEE.
Ahn, S. and Lee, S. (2018b). No-reference video quality as-
sessment based on convolutional neural network and
human temporal behavior. In 2018 Asia-Pacific Signal
and Information Processing Association Annual Sum-
mit and Conference (APSIPA ASC), pages 1513–1517.
IEEE.
Bianco, S., Celona, L., Napoletano, P., and Schettini, R.
(2018). On the use of deep learning for blind image
quality assessment. Signal, Image and Video Process-
ing, 12(2):355–362.
Borer, S. (2010). A model of jerkiness for temporal impair-
ments in video transmission. In 2010 second interna-
tional workshop on quality of multimedia experience
(QoMEX), pages 218–223. IEEE.
Chollet, F. et al. (2015). Keras. https://keras.io.
Dendi, S. V. R., Krishnappa, G., and Channappayya, S. S.
(2019). Full-reference video quality assessment using
deep 3d convolutional neural networks. In 2019 Na-
tional Conference on Communications (NCC), pages
1–5. IEEE.
Drucker, H., Burges, C. J., Kaufman, L., Smola, A. J.,
and Vapnik, V. (1997). Support vector regression ma-
chines. In Advances in neural information processing
systems, pages 155–161.
Giannopoulos, M., Tsagkatakis, G., Blasi, S., Toutounchi,
F., Mouchtaris, A., Tsakalides, P., Mrak, M., and
Izquierdo, E. (2018). Convolutional neural net-
works for video quality assessment. arXiv preprint
arXiv:1809.10117.
Habibzadeh, M., Jannesari, M., Rezaei, Z., Baharvand, H.,
and Totonchi, M. (2018). Automatic white blood cell
classification using pre-trained deep learning models:
Resnet and inception. In Tenth International Confer-
ence on Machine Vision (ICMV 2017), volume 10696,
page 1069612. International Society for Optics and
Photonics.
Hosu, V., Hahn, F., Jenadeleh, M., Lin, H., Men, H.,
Szir
´
anyi, T., Li, S., and Saupe, D. (2017). The kon-
stanz natural video database (konvid-1k). In 2017
Ninth international conference on quality of multime-
dia experience (QoMEX), pages 1–6. IEEE.
Ji, S., Xu, W., Yang, M., and Yu, K. (2012). 3d convolu-
tional neural networks for human action recognition.
IEEE transactions on pattern analysis and machine
intelligence, 35(1):221–231.
Karpathy, A., Toderici, G., Shetty, S., Leung, T., Suk-
thankar, R., and Fei-Fei, L. (2014). Large-scale video
classification with convolutional neural networks. In
Proceedings of the IEEE conference on Computer Vi-
sion and Pattern Recognition, pages 1725–1732.
Kingma, D. P. and Ba, J. (2014). Adam: A
method for stochastic optimization. arXiv preprint
arXiv:1412.6980.
Korhonen, J. (2019). Two-level approach for no-reference
consumer video quality assessment. IEEE Transac-
tions on Image Processing, 28(12):5923–5938.
Li, D., Jiang, T., and Jiang, M. (2019). Quality assessment
of in-the-wild videos. In In Proceedings of the 27th
ACM International Conference on Multimedia. ACM.
Liu, W., Duanmu, Z., and Wang, Z. (2018). End-to-end
blind quality assessment of compressed videos using
deep neural networks. In ACM Multimedia, pages
546–554.
Lu, Z., Jiang, X., and Kot, A. (2018). Deep coupled resnet
for low-resolution face recognition. IEEE Signal Pro-
cessing Letters, 25(4):526–530.
Men, H., Lin, H., and Saupe, D. (2017). Empirical evalu-
ation of no-reference vqa methods on a natural video
quality database. In 2017 Ninth international confer-
ence on quality of multimedia experience (QoMEX),
pages 1–3. IEEE.
Men, H., Lin, H., and Saupe, D. (2018). Spatiotemporal fea-
ture combination model for no-reference video quality
assessment. In 2018 Tenth international conference
on quality of multimedia experience (QoMEX), pages
1–3. IEEE.
Mittal, A., Moorthy, A. K., and Bovik, A. C. (2012a).
No-reference image quality assessment in the spatial
domain. IEEE Transactions on image processing,
21(12):4695–4708.
Mittal, A., Saad, M. A., and Bovik, A. C. (2015). A com-
pletely blind video integrity oracle. IEEE Transac-
tions on Image Processing, 25(1):289–300.
Mittal, A., Soundararajan, R., and Bovik, A. C. (2012b).
Making a “completely blind” image quality analyzer.
IEEE Signal Processing Letters, 20(3):209–212.
Pastrana-Vidal, R. R., Gicquel, J. C., Colomes, C., and
Cherifi, H. (2004). Sporadic frame dropping impact
on quality perception. In Human Vision and Elec-
tronic Imaging IX, volume 5292, pages 182–193. In-
ternational Society for Optics and Photonics.
Reinhardt, R. (2010). Video with Adobe Flash CS4 Profes-
sional Studio Techniques. Adobe Press.
VISAPP 2020 - 15th International Conference on Computer Vision Theory and Applications
346