the femur is, and 32x can use the bounding box to
locate the fracture.
Rashid et al. reported an accuracy of 88.24% for
fracture detection with a 28-layer dilated CNN and
long short-term memory (DCNN-LSTM) network on 965
wrist X-ray images (Rashid, 2023). Jia et al.
obtained 0.71 mAP with the cascade R-CNN method on
1227 sternum fracture X-ray images drawn from a
collection of sternal radiographs and hospital
diagnostic reports (Jia, 2022). Guan et al. achieved
an AP of 62.04% with a two-stage R-CNN method using
a ResNet backbone, developed for fracture detection
on nearly 4000 arm fracture X-ray images (Guan B,
2020). Wang et al. carried out fracture detection
with WrisNet, a model inspired by Faster R-CNN and
mainly composed of ResNeXt-TA and FPN, achieving an
AP of 56.6% on a total of 4346 hairline fractures in
hand X-ray images (Wang W.). Lu et al. developed automated
universal fracture detection in X-ray images using a
modified Ada-ResNeSt backbone network and the
AC-BiFPN detection method on part of the MURA
dataset, achieving an AP of 68.4% on 30000 X-ray
images (Lu S, 2022). Guan et al. achieved an AP of
88.9% using a balanced FPN-ResNeXt model developed
for fracture detection on a dataset of 3842
thighbone X-ray radiographs (Guan B, 2022). Yadav et al.
used a deep learning model to detect and classify
X-ray images of fractured and healthy human bones.
With 5-fold cross-validation on an augmented dataset
of 4000 images, the model reached 92.44% accuracy in
separating healthy from fractured bone (Xue L, 2021).
Wang et al. proposed ParallelNet, which is based on
multiple backbone networks, for thighbone fracture
detection. On a dataset of 3842 X-ray radiographs it
reached 87.8% AP50 and 49.3% AP75 (Wang M, 2021).
Chin et al. proposed an Auxiliary Classifier
Generative Adversarial Network (AC-GAN) model to
label the position of the fracture and reported an
accuracy of 91.2% (Chiun-Li Chin, 2019).
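
The studies above report different metrics (accuracy, AP, mAP, AP50, AP75), so their scores are not directly comparable. As a minimal sketch, assuming the common COCO-style convention in which AP50 and AP75 count a predicted box as correct when its intersection over union (IoU) with the ground-truth box is at least 0.50 or 0.75, the example below shows how the same prediction can count as a hit under AP50 and a miss under AP75. The box coordinates are hypothetical and are not taken from any of the cited datasets.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes in pixels."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, threshold):
    """A prediction matches the ground truth if IoU reaches the threshold."""
    return iou(pred, truth) >= threshold


# Hypothetical ground-truth and predicted fracture boxes.
truth = (100, 100, 200, 200)
pred = (120, 110, 210, 205)

print(f"IoU = {iou(pred, truth):.3f}")
print("counts for AP50:", is_true_positive(pred, truth, 0.50))
print("counts for AP75:", is_true_positive(pred, truth, 0.75))
```

This is one reason a model's AP75 can be much lower than its AP50: the stricter threshold rejects boxes that are only roughly aligned with the fracture.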
2
MATERIALS AND METHODS
YOLOv8 is the latest deep-learning model in the
YOLO series. Its structure is shown in Figure 2, and
the details of the model are explained in this
section.
[Figure 2 diagram. BACKBONE: C2f & ConvModule stages
with 4X, 8X, 16X and 32X downsampling, ending in
C2f & SPPF. BODY: Upsample & Concat and C2f blocks.
Head1, Head2 and Head3 operate at 8X, 16X and 32X
downsampling. Example output labels: Femur, Fracture.]
Figure 2: The structure of the YOLOv8 model.
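
To make the downsampling factors in Figure 2 concrete, the following sketch computes the size of the feature-map grid seen by each of the three heads. It assumes a 640 x 640 input image, which is a common default but is not stated in this section, and it assumes the heads operate at 8X, 16X and 32X downsampling as labeled in the figure. The function name is hypothetical.

```python
def head_grid_sizes(image_height, image_width, strides=(8, 16, 32)):
    """Return {stride: (rows, cols)} of the feature map at each head."""
    return {s: (image_height // s, image_width // s) for s in strides}


# Assumed input resolution (not given in the text).
grids = head_grid_sizes(640, 640)

for stride, (rows, cols) in grids.items():
    print(f"{stride:>2}X downsample: {rows} x {cols} grid, {rows * cols} cells")

# Each cell yields one candidate box, so this is the number of
# candidates before confidence filtering and non-maximum suppression.
total_cells = sum(rows * cols for rows, cols in grids.values())
print("total candidate locations:", total_cells)
```

The fine 8X grid keeps more spatial detail and is suited to small targets, while the coarse 32X grid covers large structures with few cells.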
2.1 YOLOv8 Model
Like YOLOv5, YOLOv8 provides models of different
sizes, set by scaling factors, to meet the needs of
different scenarios. The backbone and neck draw on
the ELAN design idea of YOLOv7. The C3 structure of
YOLOv5 has been replaced with the C2f structure,
which has a richer gradient flow, and the number of
channels is adjusted for each model scale. The
structure is therefore tuned per scale instead of
applying one set of parameters to all models, which
improves the model's performance. Compared with
YOLOv5, the head has changed substantially. It has
been replaced with the now mainstream decoupled head
structure, which separates the classification and
detection heads. The TaskAlignedAssigner positive
sample assignment strategy is adopted in the loss
calculation, and Distribution Focal Loss is
introduced. In the data augmentation part of
training, the practice introduced from YOLOX of
turning off Mosaic augmentation in the final epochs
can effectively improve accuracy.
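
As an illustration of what Distribution Focal Loss implies at inference time, the sketch below decodes one bounding box from a distribution-based regression output. With this loss, the head predicts a discrete distribution over distance bins for each of the four box sides rather than a single number, and the side distance is read off as the expected bin index. The bin count of 16, the anchor point at the cell center, and all function names are assumptions made for this example and are not taken from the paper.

```python
import math

NUM_BINS = 16  # assumed number of distance bins per box side


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_distance(logits):
    """Expected bin index under softmax(logits), in grid-cell units."""
    return sum(i * p for i, p in enumerate(softmax(logits)))


def decode_box(anchor_xy, side_logits, stride):
    """Turn (left, top, right, bottom) bin logits into a pixel box.

    anchor_xy is the anchor point in grid-cell units; stride converts
    grid-cell units to pixels of the input image.
    """
    left, top, right, bottom = (expected_distance(l) for l in side_logits)
    ax, ay = anchor_xy
    return (
        (ax - left) * stride,
        (ay - top) * stride,
        (ax + right) * stride,
        (ay + bottom) * stride,
    )


def peaked(bin_index):
    """Hypothetical logits that put almost all mass on one bin."""
    return [20.0 if i == bin_index else 0.0 for i in range(NUM_BINS)]


# Cell (10, 10) on the 8X grid, anchored at the cell center.
box = decode_box((10.5, 10.5), [peaked(2), peaked(3), peaked(4), peaked(5)], 8)
print([round(v, 2) for v in box])
```

Predicting a distribution instead of a point lets the model express uncertainty about a box edge, which is plausible for fracture lines with blurred boundaries.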
2.2 Study Dataset
The dataset consists of 312 femur fracture X-ray
images collected from Shenzhen University General
Hospital between 2019 and 2023, ranging from femoral
neck fractures to distal femur fractures. Physicians
from Shenzhen University General Hospital checked
the labeling of the femur X-ray images.
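
The annotation format is not described in this section. As a hedged sketch, assuming the labels are stored in the plain-text YOLO format commonly used for YOLOv8 training (one line per box: class index, then box center and size normalized by the image dimensions), a pixel-coordinate box could be converted as follows. The class indices, the function name, and the example coordinates are hypothetical.

```python
# Hypothetical class indices for the two labels shown in Figure 2.
CLASS_IDS = {"femur": 0, "fracture": 1}


def to_yolo_line(class_id, box, image_width, image_height):
    """Convert an (x1, y1, x2, y2) pixel box to a YOLO-format label line."""
    x1, y1, x2, y2 = box
    center_x = (x1 + x2) / 2 / image_width
    center_y = (y1 + y2) / 2 / image_height
    width = (x2 - x1) / image_width
    height = (y2 - y1) / image_height
    return f"{class_id} {center_x:.6f} {center_y:.6f} {width:.6f} {height:.6f}"


# Example: a fracture box on a 1000 x 2000 pixel radiograph.
line = to_yolo_line(CLASS_IDS["fracture"], (200, 300, 400, 500), 1000, 2000)
print(line)
```

Normalized coordinates keep the labels valid when the radiographs are resized for training.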
3
EXPERIMENTS
This study used a local PC to train the deep-learning
model for femur fracture detection. The graphics
card of the local PC is an NVIDIA RTX 3060 with
12 GB of memory.
The following configuration is used in all
machine learning models for femur fracture
detection: the number of training epochs is 300, the