part model. In 2008 IEEE conference on computer
vision and pattern recognition, pages 1–8. IEEE.
Girshick, R. (2015). Fast r-cnn. In Proceedings of the IEEE
international conference on computer vision, pages
Girshick, R., Donahue, J., Darrell, T., and Malik, J. (2014).
Rich feature hierarchies for accurate object detection
and semantic segmentation. In Proceedings of
the IEEE conference on computer vision and pattern
recognition, pages 580–587.
Gochoo, M., Otgonbold, M.-E., Ganbold, E., Hsieh, J.-W.,
Chang, M.-C., Chen, P.-Y., Dorj, B., Al Jassmi, H.,
Batnasan, G., Alnajjar, F., Abduljabbar, M., and Lin,
F.-P. (2023). Fisheye8k: A benchmark and dataset for
fisheye camera object detection. In Proceedings of the
IEEE/CVF Conference on Computer Vision and Pattern
Recognition (CVPR) Workshops.
Gou, J., Yu, B., Maybank, S. J., and Tao, D. (2021). Knowledge
distillation: A survey. International Journal of
Computer Vision, 129:1789–1819.
He, K., Chen, X., Xie, S., Li, Y., Dollár, P., and Girshick,
R. (2022). Masked autoencoders are scalable vision
learners. In Proceedings of the IEEE/CVF conference
on computer vision and pattern recognition, pages
Jia, X., Tong, Y., Qiao, H., Li, M., Tong, J., and Liang,
B. (2023). Fast and accurate object detector for autonomous
driving based on improved yolov5. Scientific
Reports, 13(1):1–13.
Jocher, G., Chaurasia, A., Qiu, J., and Ultralytics (2023).
Ultralytics yolov8: State-of-the-art model for real-time
object detection, segmentation, and classification.
https://github.com/ultralytics/ultralytics.
Accessed: 2023-08-28.
Jocher, G., Chaurasia, A., Stoken, A., Borovec, J., Kwon,
Y., Michael, K., Fang, J., Yifu, Z., Wong, C., Montes,
D., et al. (2022). ultralytics/yolov5: v7.0 - yolov5 sota
realtime instance segmentation. Zenodo.
Ju, R.-Y. and Cai, W. (2023). Fracture detection in pediatric
wrist trauma x-ray images using yolov8 algorithm.
arXiv preprint arXiv:2304.05071.
Kannala, J. and Brandt, S. S. (2006). A generic camera
model and calibration method for conventional, wide-angle,
and fish-eye lenses. IEEE transactions on pattern
analysis and machine intelligence, 28(8):1335–
Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C.,
Gustafson, L., Xiao, T., Whitehead, S., Berg, A. C.,
Lo, W.-Y., et al. (2023). Segment anything. arXiv
preprint arXiv:2304.02643.
Kolbeinsson, B. and Mikolajczyk, K. (2023). DDOS: The
drone depth and obstacle segmentation dataset. arXiv
preprint arXiv:2312.12494.
Kong, T., Sun, F., Liu, H., Jiang, Y., Li, L., and Shi,
J. (2020). Foveabox: Beyound anchor-based object
detection. IEEE Transactions on Image Processing,
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012).
Imagenet classification with deep convolutional neural
networks. Advances in neural information processing
systems, 25.
Law, H. and Deng, J. (2018). Cornernet: Detecting objects
as paired keypoints. In Proceedings of the European
conference on computer vision (ECCV), pages 734–
Li, T., Tong, G., Tang, H., Li, B., and Chen, B. (2020).
Fisheyedet: A self-study and contour-based object
detector in fisheye images. IEEE Access, 8:71739–
Li, Y., Mao, H., Girshick, R., and He, K. (2022). Exploring
plain vision transformer backbones for object detection.
pages 280–296.
Li, Z., Peng, C., Yu, G., Zhang, X., Deng, Y., and Sun,
J. (2017). Light-head r-cnn: In defense of two-stage
object detector. arXiv preprint arXiv:1711.07264.
Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B.,
and Belongie, S. (2017a). Feature pyramid networks
for object detection. In Proceedings of the IEEE conference
on computer vision and pattern recognition,
pages 2117–2125.
Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Dollár, P.
(2017b). Focal loss for dense object detection. In
Proceedings of the IEEE international conference on
computer vision, pages 2980–2988.
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P.,
Ramanan, D., Dollár, P., and Zitnick, C. L. (2014).
Microsoft coco: Common objects in context. In Computer
Vision–ECCV 2014: 13th European Conference,
Zurich, Switzerland, September 6-12, 2014, Proceedings,
Part V 13, pages 740–755. Springer.
Liu, H., Duan, X., Lou, H., Gu, J., Chen, H., and Bi,
L. (2023). Improved gbs-yolov5 algorithm based on
yolov5 applied to uav intelligent traffic. Scientific
Reports, 13(1):9577.
Liu, S., Qi, L., Qin, H., Shi, J., and Jia, J. (2018). Path
aggregation network for instance segmentation. In
Proceedings of the IEEE conference on computer vision
and pattern recognition, pages 8759–8768.
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S.,
Fu, C.-Y., and Berg, A. C. (2016). Ssd: Single shot
multibox detector. In Computer Vision–ECCV 2016:
14th European Conference, Amsterdam, The Netherlands,
October 11–14, 2016, Proceedings, Part I 14,
pages 21–37. Springer.
Lyu, Y., Vosselman, G., Xia, G.-S., Yilmaz, A., and Yang,
M. Y. (2020). Uavid: A semantic segmentation dataset
for uav imagery. ISPRS Journal of Photogrammetry
and Remote Sensing, 165:108–119.
Purkait, P., Zhao, C., and Zach, C. (2017). Spp-net: Deep
absolute pose regression with synthetic views. arXiv
preprint arXiv:1712.03452.
Radford, A., Kim, J. W., Hallacy, C., Ramesh, A., Goh, G.,
Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark,
J., et al. (2021). Learning transferable visual models
from natural language supervision. pages 8748–8763.
Redmon, J., Divvala, S., Girshick, R., and Farhadi, A.
(2016). You only look once: Unified, real-time object
detection. In Proceedings of the IEEE conference on
ICPRAM 2024 - 13th International Conference on Pattern Recognition Applications and Methods