The architecture of YOLOv8 is refined and optimized, which increases performance on a variety of hardware platforms, from high-end GPUs to edge devices, and ensures wider accessibility and applicability (Xiao, 2023). YOLOv8 incorporates the latest advances in machine learning and deep learning and uses enhanced training techniques to improve the model's ability to generalize from training data to real-world scenarios, which reduces overfitting and improves detection performance on unseen data (Das, 2024). YOLOv8 is more robust to variations in the input data (such as lighting, occlusions, and objects of different scales), which makes it more reliable in diverse and challenging environments (Aboah, 2023). YOLOv8 also supports a much wider range of object classes, which makes it more versatile and useful across different fields and applications (Motwani, 2023). YOLOv8 integrates easily with existing software and platforms and offers broader compatibility and support for various programming languages and frameworks (Luo, 2024). Although YOLOv8 represents the cutting edge of object detection technology, each iteration focuses on balancing trade-offs between speed, accuracy, and computational requirements (Reis, 2023).
The main objective of this research is to analyze a transfer-learning-based object detection framework using the YOLOv8 model. The research
investigates the performance and parameter influence
of YOLOv8 when trained on custom datasets through
transfer learning methodologies. First, the paper introduces the VOC2012 dataset as the primary input data. Second, the YOLOv8 model is configured as the foundational network for feature extraction and object detection, and transfer learning techniques are applied to enhance the model's generalizability. Third, key parameters are adjusted and compared for thorough analysis. Finally, the predictive performance of YOLOv8 is assessed and compared with the results obtained on the COCO dataset. The experimental findings highlight the robust performance of YOLOv8 on the Visual Object Classes (VOC) 2012 dataset. This study offers
valuable insights and serves as a reference for
researchers in the field, shedding light on effective
strategies for object detection using transfer learning
approaches.
2 METHODOLOGIES
2.1 Dataset Description and
Preprocessing
The PASCAL VOC dataset (Pascal, 2012) is widely
used for object detection, segmentation, and
classification tasks, serving as a benchmark for
computer vision models. Its two most commonly used releases are VOC2007 and VOC2012, with annotations that include bounding boxes and class labels for object detection as well as segmentation masks.
The VOC2012 subset, used in this study, contains
11,530 images with annotations for 27,450 objects
and 6,929 segmentations. Data preprocessing
involves resizing and normalization, with a Python
script converting annotations into the required format
for YOLOv8. This project employs the VOC dataset
to train and analyze YOLOv8's performance,
comparing it against the original COCO dataset.
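The annotation conversion step can be illustrated with a short sketch that turns one PASCAL VOC XML file into the normalized class x_center y_center width height text format that YOLOv8 expects. The directory layout and the helper name convert_annotation are assumptions for illustration, not the exact script used in this study.

# Minimal sketch of the VOC-to-YOLO annotation conversion described above.
# Paths and helper names are illustrative assumptions.
import xml.etree.ElementTree as ET
from pathlib import Path

VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

def convert_annotation(xml_path: Path, out_dir: Path) -> None:
    """Convert one VOC XML annotation file to a YOLO-format .txt label file."""
    root = ET.parse(xml_path).getroot()
    img_w = float(root.find("size/width").text)
    img_h = float(root.find("size/height").text)

    lines = []
    for obj in root.iter("object"):
        cls = obj.find("name").text
        if cls not in VOC_CLASSES:
            continue
        box = obj.find("bndbox")
        xmin, ymin = float(box.find("xmin").text), float(box.find("ymin").text)
        xmax, ymax = float(box.find("xmax").text), float(box.find("ymax").text)
        # YOLO labels use normalized center coordinates and box size.
        xc = (xmin + xmax) / 2.0 / img_w
        yc = (ymin + ymax) / 2.0 / img_h
        w = (xmax - xmin) / img_w
        h = (ymax - ymin) / img_h
        lines.append(f"{VOC_CLASSES.index(cls)} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")

    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / f"{xml_path.stem}.txt").write_text("\n".join(lines))

# Example: convert all VOC2012 annotations (hypothetical directory layout).
for xml_file in Path("VOCdevkit/VOC2012/Annotations").glob("*.xml"):
    convert_annotation(xml_file, Path("labels/train"))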
2.2 Proposed Approach
The main purpose of this paper is to construct and analyze a target detection pipeline based on transfer learning using the YOLOv8 model (as shown in Figure 1). Transfer learning is used to study the training performance of YOLOv8 on a custom dataset and the influence of its parameters. Specifically, the VOC2012 dataset is first used as the data input for this paper. Second, the YOLOv8 model is constructed as the basic network for feature extraction and target detection tasks, and transfer learning techniques are applied to further improve the model's generalization. Third, core parameters are adjusted and comparatively analysed. Finally, this paper analyses the prediction performance of the YOLOv8 model and compares it with results on the COCO dataset.
Figure 1: The framework of the YOLOv8 model
(Photo/Picture credit: Original).
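A minimal sketch of this transfer-learning setup, assuming the Ultralytics Python API and COCO-pretrained YOLOv8 weights, is given below. The dataset YAML name and the hyperparameter values (epochs, image size, batch size, learning rate, number of frozen layers) are illustrative placeholders rather than the exact experimental settings.

# Transfer-learning sketch with the Ultralytics YOLOv8 API.
# Hyperparameter values are illustrative, not the exact experimental settings.
from ultralytics import YOLO

# Start from COCO-pretrained weights so the backbone transfers to VOC2012.
model = YOLO("yolov8n.pt")

# Fine-tune on VOC2012, described by an assumed dataset YAML file.
model.train(
    data="VOC2012.yaml",   # assumed config listing train/val paths and the 20 classes
    epochs=100,
    imgsz=640,
    batch=16,
    lr0=0.01,              # initial learning rate, one of the adjustable core parameters
    freeze=10,             # freeze the first 10 layers to keep pretrained backbone features
)

# Evaluate the fine-tuned model on the VOC2012 validation split.
metrics = model.val()
print(metrics.box.map)     # mAP50-95 on VOC2012, to be compared with the COCO baseline

# Run prediction on a sample image.
results = model.predict("sample.jpg", conf=0.25)

Freezing the early backbone layers is one common transfer-learning choice: the COCO-pretrained features are preserved while the detection head adapts to the 20 VOC classes.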
2.2.1 YOLOv8
YOLOv8 is built mainly on the Ultralytics framework, and its core features are shown in Figure 2. YOLOv8 provides new state-of-the-art (SOTA) models, including object detection networks at P5 640 and P6 1280 resolutions and an instance segmentation model based on YOLACT. Scaling factors provide models of