D., Szczęsna, A., and Amato, G. (2022b). Bus violence: An open benchmark for video violence detection on public transport. Sensors, 22(21).
Ciampi, L., Gennaro, C., Carrara, F., Falchi, F., Vairo, C.,
and Amato, G. (2022c). Multi-camera vehicle count-
ing using edge-AI. Expert Systems with Applications,
207:117929.
Ciampi, L., Messina, N., Falchi, F., Gennaro, C., and Am-
ato, G. (2020). Virtual to real adaptation of pedestrian
detectors. Sensors, 20(18):5250.
Ciampi, L., Santiago, C., Costeira, J., Gennaro, C., and
Amato, G. (2021). Domain adaptation for traffic den-
sity estimation. In Proceedings of the 16th Interna-
tional Joint Conference on Computer Vision, Imag-
ing and Computer Graphics Theory and Applications.
SCITEPRESS - Science and Technology Publications.
Dasiopoulou, S., Mezaris, V., Kompatsiaris, I., Papas-
tathis, V., and Strintzis, M. G. (2005). Knowledge-
assisted semantic video object detection. IEEE Trans-
actions on Circuits and Systems for Video Technology,
15(10):1210–1224.
Fabbri, M., Lanzi, F., Calderara, S., Palazzi, A., Vezzani, R.,
and Cucchiara, R. (2018). Learning to detect and track
visible and occluded body joints in a virtual world.
In Computer Vision – ECCV 2018, pages 450–466.
Springer International Publishing.
Ge, Z., Liu, S., Wang, F., Li, Z., and Sun, J. (2021). YOLOX: Exceeding YOLO series in 2021. arXiv preprint arXiv:2107.08430.
He, K., Gkioxari, G., Dollár, P., and Girshick, R. B. (2017).
Mask R-CNN. In IEEE International Conference
on Computer Vision, ICCV 2017, pages 2980–2988.
IEEE Computer Society.
Li, C., Li, L., Jiang, H., Weng, K., Geng, Y., Li, L., Ke, Z., Li, Q., Cheng, M., Nie, W., Li, Y., Zhang, B., Liang, Y., Zhou, L., Xu, X., Chu, X., Wei, X., and Wei, X. (2022). YOLOv6: A single-stage object detection framework for industrial applications. CoRR, abs/2209.02976.
Lin, T., Goyal, P., Girshick, R. B., He, K., and Dollár, P.
(2020). Focal loss for dense object detection. IEEE
Transactions on Pattern Analysis and Machine Intel-
ligence, 42(2):318–327.
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ra-
manan, D., Dollár, P., and Zitnick, C. L. (2014). Mi-
crosoft COCO: Common objects in context. In Com-
puter Vision – ECCV 2014, pages 740–755. Springer.
Pęszor, D., Staniszewski, M., and Wojciechowska, M. (2016). Facial reconstruction on the basis of video surveillance system for the purpose of suspect identification. In Nguyen, N. T., Trawiński, B., Fujita, H., and Hong, T.-P., editors, Intelligent Information and Database Systems, pages 467–476, Berlin, Heidelberg. Springer Berlin Heidelberg.
Qu, Z., Gao, L.-Y., Wang, S.-Y., Yin, H.-N., and Yi, T.-M. (2022). An improved YOLOv5 method for large objects detection with multi-scale feature cross-layer fusion network.
Redmon, J. (2013). Darknet: Open source neural networks in C.
Redmon, J., Divvala, S. K., Girshick, R. B., and Farhadi, A.
(2016). You only look once: Unified, real-time object
detection. In 2016 IEEE Conference on Computer Vi-
sion and Pattern Recognition, CVPR 2016, Las Vegas,
NV, USA, June 27-30, 2016, pages 779–788. IEEE
Computer Society.
Redmon, J. and Farhadi, A. (2017). YOLO9000: Better, faster, stronger. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 6517–6525.
Redmon, J. and Farhadi, A. (2018). YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767.
Ren, S., He, K., Girshick, R., and Sun, J. (2017). Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(6):1137–1149.
Richter, S. R., Vineet, V., Roth, S., and Koltun, V. (2016).
Playing for data: Ground truth from computer games.
In Computer Vision – ECCV 2016, pages 102–118.
Springer International Publishing.
Ros, G., Sellart, L., Materzynska, J., Vazquez, D., and
Lopez, A. M. (2016). The SYNTHIA dataset: A
large collection of synthetic images for semantic seg-
mentation of urban scenes. In 2016 IEEE Conference
on Computer Vision and Pattern Recognition (CVPR).
IEEE.
Staniszewski, M., Foszner, P., Kostorz, K., Michalczuk, A., Wereszczyński, K., Cogiel, M., Golba, D., Wojciechowski, K., and Polański, A. (2020). Application of crowd simulations in the evaluation of tracking algorithms. Sensors, 20(17):4960.
Staniszewski, M., Kloszczyk, M., Segen, J., Wereszczyński, K., Drabik, A., and Kulbacki, M. (2016). Recent developments in tracking objects in a video sequence. In Intelligent Information and Database Systems, pages 427–436. Springer Berlin Heidelberg.
Szczęsna, A., Foszner, P., Cygan, A., Bizoń, B., Cogiel, M., Golba, D., Ciampi, L., Messina, N., Macioszek, E., and Staniszewski, M. (2023). Crowd simulation (CrowdSim2) for tracking and object detection.
Wang, C., Bochkovskiy, A., and Liao, H. M. (2022). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. CoRR, abs/2207.02696.
Wereszczyński, K., Michalczuk, A., Foszner, P., Golba, D., Cogiel, M., and Staniszewski, M. (2021). ELSA: Euler-Lagrange skeletal animations - novel and fast motion model applicable to VR/AR devices. In Computational Science – ICCS 2021, pages 120–133. Springer International Publishing.
Zhou, X., Wang, D., and Krähenbühl, P. (2019). Objects as
points. arXiv preprint arXiv:1904.07850.
Zhu, X., Su, W., Lu, L., Li, B., Wang, X., and Dai, J. (2021). Deformable DETR: Deformable transformers for end-to-end object detection. In 9th International Conference on Learning Representations, ICLR 2021. OpenReview.net.
CrowdSim2: An Open Synthetic Benchmark for Object Detectors