
Proceedings of the IEEE international conference on
computer vision, pages 1501–1510.
Jin, T., Lin, Z., Zhu, S., Wang, W., and Hu, S. (2021). Multi-
person gaze-following with numerical coordinate re-
gression. In 2021 16th IEEE International Confer-
ence on Automatic Face and Gesture Recognition (FG
2021), pages 01–08. IEEE.
Kadish, D., Risi, S., and Løvlie, A. S. (2021). Improving
object detection in art images using only style trans-
fer. In 2021 international joint conference on neural
networks (IJCNN), pages 1–8. IEEE.
Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C.,
Gustafson, L., Xiao, T., Whitehead, S., Berg, A. C.,
Lo, W.-Y., et al. (2023). Segment anything. In Pro-
ceedings of the IEEE/CVF International Conference
on Computer Vision, pages 4015–4026.
Li, W., Zhang, Y., Sun, Y., Wang, W., Li, M., Zhang, W.,
and Lin, X. (2019). Approximate nearest neighbor
search on high dimensional data—experiments, anal-
yses, and improvement. IEEE Transactions on Knowl-
edge and Data Engineering, 32(8):1475–1488.
Lian, D., Yu, Z., and Gao, S. (2018). Believe it or not, we
know what you are looking at! In Asian Conference
on Computer Vision, pages 35–50. Springer.
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P.,
Ramanan, D., Dollár, P., and Zitnick, C. L. (2014).
Microsoft coco: Common objects in context. In Com-
puter Vision–ECCV 2014: 13th European Confer-
ence, Zurich, Switzerland, September 6-12, 2014, Pro-
ceedings, Part V 13, pages 740–755. Springer.
Masclef, T., Scuturici, M., Bertin, B., Barrellon, V., Scu-
turici, V.-M., and Miguet, S. (2023). A deep learning
approach for painting retrieval based on genre simi-
larity. In International Conference on Image Analysis
and Processing, pages 270–281. Springer.
Milani, F. and Fraternali, P. (2021). A dataset and a convo-
lutional model for iconography classification in paint-
ings. Journal on Computing and Cultural Heritage
(JOCCH), 14(4):1–18.
Montelongo, M., Gonzalez, A., Morgenstern, F., Donahue,
S. P., and Groth, S. L. (2021). A virtual reality-based
automated perimeter, device, and pilot study. Transla-
tional Vision Science & Technology, 10(3):20–20.
Qi, D., Tan, W., Yao, Q., and Liu, J. (2022). Yolo5face:
Why reinventing a face detector. In European Confer-
ence on Computer Vision, pages 228–244. Springer.
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G.,
Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark,
J., et al. (2021). Learning transferable visual models
from natural language supervision. In International
conference on machine learning, pages 8748–8763.
PMLR.
Recasens, A., Khosla, A., Vondrick, C., and Torralba, A.
(2015). Where are they looking? Advances in neural
information processing systems, 28.
Ren, S., He, K., Girshick, R., and Sun, J. (2016). Faster
r-cnn: Towards real-time object detection with re-
gion proposal networks. IEEE transactions on pattern
analysis and machine intelligence, 39(6):1137–1149.
Reshetnikov, A., Marinescu, M.-C., and Lopez, J. M.
(2022). Deart: Dataset of european art. In Euro-
pean Conference on Computer Vision, pages 218–233.
Springer.
Ridnik, T., Ben-Baruch, E., Noy, A., and Zelnik-Manor,
L. (2021). Imagenet-21k pretraining for the masses.
arXiv preprint arXiv:2104.10972.
Smirnov, S. and Eguizabal, A. (2018). Deep learning for
object detection in fine-art paintings. In 2018 Metrol-
ogy for Archaeology and Cultural Heritage (MetroAr-
chaeo), pages 45–49. IEEE.
Tan, W. S., Chin, W. Y., and Lim, K. Y. (2021). Content-
based image retrieval for painting style with convolu-
tional neural network. The Journal of The Institution
of Engineers Malaysia, 82(3).
Tian, S., Tu, H., He, L., Wu, Y. I., and Zheng, X. (2023).
Freegaze: A framework for 3d gaze estimation us-
ing appearance cues from a facial video. Sensors,
23(23):9604.
Wang, C.-Y., Bochkovskiy, A., and Liao, H.-Y. M. (2023).
Yolov7: Trainable bag-of-freebies sets new state-of-
the-art for real-time object detectors. In Proceedings
of the IEEE/CVF conference on computer vision and
pattern recognition, pages 7464–7475.
Westlake, N., Cai, H., and Hall, P. (2016). Detecting peo-
ple in artwork with cnns. In Computer Vision–ECCV
2016 Workshops: Amsterdam, The Netherlands, Oc-
tober 8-10 and 15-16, 2016, Proceedings, Part I 14,
pages 825–841. Springer.
Wilcoxon, F. (1992). Individual comparisons by ranking
methods. In Breakthroughs in statistics: Methodology
and distribution, pages 196–202. Springer.
Zhang, S., Zhu, X., Lei, Z., Shi, H., Wang, X., and Li, S. Z.
(2017). Faceboxes: A cpu real-time face detector with
high accuracy. In 2017 IEEE International Joint Con-
ference on Biometrics (IJCB), pages 1–9. IEEE.
Zhao, W., Jiang, W., Qiu, X., et al. (2022). Big transfer
learning for fine art classification. Computational In-
telligence and Neuroscience, 2022.
VISAPP 2025 - 20th International Conference on Computer Vision Theory and Applications