network structure, which makes the model robust to noise by optimizing the network architecture, for example by setting up two identical neural networks that guide each other and learn from each other's loss values, thereby avoiding overfitting and increasing robustness (Han B, Yao Q, Yu X, 2018). The third is optimization based on the loss function, which constructs a loss function that is robust to label noise and reduces the influence of label noise through the robustness of the loss function itself (Zhang Z & Sabuncu M, 2018; Wang Y, Ma X, Chen Z, et al., 2019). Optimizing the network structure and the loss function both aim to increase the robustness of the model; however, since it is impossible to judge during modeling whether the data contain label noise, the performance of such models cannot be guaranteed. Optimization based on the training data is therefore more common (Zhang ZZ, Jiang GX & Wang JW, 2020).
Training data optimization can be divided into two categories according to the processing method: noise sample removal (Sluban B, Gamberger D & Lavrač N, 2014) and noise sample relabeling (Wei Y, Gong C, Chen S, Liu T, Yang J & Tao D, 2020). For reasons of computational efficiency, sample removal is more common than relabeling (Frénay B & Verleysen M, 2014). However, removal can suffer from over-removal, that is, the number of samples removed may greatly exceed the number of truly noisy samples. When evaluating noise sample removal methods, it is therefore necessary to consider not only the proportion of clean samples remaining after removal but also the recall rate of clean samples.
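The two evaluation criteria above can be made concrete with a short sketch; the function name and the toy numbers are illustrative, not from the paper:

```python
import numpy as np

def removal_metrics(is_noise, is_removed):
    """Evaluate a noise-removal step against ground truth.

    is_noise   : boolean array, True where a sample's label is truly noisy
    is_removed : boolean array, True where the filter removed the sample

    Returns (clean_proportion, clean_recall):
      clean_proportion - fraction of retained samples that are clean
      clean_recall     - fraction of all clean samples that survive removal
    """
    is_noise = np.asarray(is_noise, dtype=bool)
    is_removed = np.asarray(is_removed, dtype=bool)
    retained = ~is_removed
    clean = ~is_noise
    clean_proportion = (retained & clean).sum() / max(retained.sum(), 1)
    clean_recall = (retained & clean).sum() / max(clean.sum(), 1)
    return clean_proportion, clean_recall

# An over-aggressive filter: removes all 10 noisy samples but also 40 clean ones.
noise = np.array([True] * 10 + [False] * 90)
removed = np.array([True] * 50 + [False] * 50)
prop, rec = removal_metrics(noise, removed)
# The retained set is perfectly clean (proportion 1.0), yet only 50 of the
# 90 clean samples survive, so the recall of clean samples is low.
```

This is why the proportion of clean samples alone is insufficient: an over-removing filter can score perfectly on purity while discarding much of the usable data.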
In sample processing, confidence-based classification is the most widely used approach (Chen QQ, Wang WJ & Jiang GX, 2019), but confidence-based methods can only obtain their results after model training is completed, so their time cost is relatively high. They also make the classification result strongly dependent on the reliability of the training samples. The traditional approach classifies samples with a single fixed threshold (Chen QQ et al., 2019), but this is prone to prediction deviation for samples near the threshold. To address this, Zhang Zenghui et al. (2020) proposed a confidence-based local probability sampling method; however, it samples the threshold from a single interval, which makes it overly dependent on the manually set interval, and its performance varies considerably under different noise rates.
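The fixed-threshold rule and its interval-sampled variant can be sketched as follows; the function names, the 0.5 cut-off, and the interval bounds are illustrative assumptions, not the cited methods' exact settings:

```python
import numpy as np

def fixed_threshold_filter(losses, t):
    """Traditional rule: flag a sample as noise iff its loss exceeds one
    fixed threshold t. Samples with losses of 0.49 vs. 0.51 receive
    opposite decisions even though they are nearly indistinguishable."""
    return losses > t

def random_threshold_filter(losses, low, high, rng):
    """Interval-sampled variant in the spirit of local probability
    sampling: each sample is compared against a threshold drawn uniformly
    from [low, high], so decisions near the boundary become probabilistic
    rather than a single hard cut."""
    t = rng.uniform(low, high, size=losses.shape)
    return losses > t

losses = np.array([0.10, 0.49, 0.51, 2.0])
hard = fixed_threshold_filter(losses, 0.5)   # the two middle samples flip on a hair
soft = random_threshold_filter(losses, 0.4, 0.6, np.random.default_rng(0))
```

Because the interval `[low, high]` in the second rule is still set by hand, its behavior depends on that manual choice, which is the weakness the next section's method targets.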
Taking model training as the pivot, the entire model construction process can be divided into three stages: data processing before training, network construction during training, and other optimization operations after training. Data processing mostly occurs in the first stage, i.e., before model training, after which the processed data are fed into different models for training. Because data processing is confined to the first stage, the second and third stages of model building cannot access the original data set. In particular, when removal is used to handle suspected noise samples, the data set shrinks, and the size of the data set affects model training (Lei SH, Zhang H, Wang K, et al., 2019). Training a model on data preprocessed by a noise filtering algorithm therefore does not guarantee an improvement in classification accuracy.
This paper proposes a weighted-correction-after-filtering training method for data containing label noise. The main contributions of the proposed method are:
1. A double-interval, random-threshold label noise filtering algorithm based on loss values, which improves sample filtering precision and recall while reducing time cost.
2. A weighted correction training method based on the filtered data: through a secondary training pass, the weights of correctly classified samples and of weak sample classes are increased, thereby improving model accuracy.
3. An experimental analysis of the influence of the noise ratio and model depth on the proposed method, providing reference data for subsequent applications.
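The idea behind the first contribution can be illustrated with a minimal sketch; the interval bounds, function name, and the three-way split are assumptions for illustration, not the paper's exact algorithm:

```python
import numpy as np

def double_interval_filter(losses, low_interval, high_interval, rng):
    """Sketch of a double-interval, random-threshold split on loss values.

    A lower threshold t_low is drawn at random from low_interval and an
    upper threshold t_high from high_interval, so neither boundary is a
    single fixed, hand-set value:

      loss <= t_low   -> judged clean
      loss >  t_high  -> judged noise (to be removed)
      otherwise       -> undecided (left for further processing)
    """
    t_low = rng.uniform(*low_interval)
    t_high = rng.uniform(*high_interval)
    labels = np.full(losses.shape, "undecided", dtype=object)
    labels[losses <= t_low] = "clean"
    labels[losses > t_high] = "noise"
    return labels

rng = np.random.default_rng(0)
losses = np.array([0.01, 0.5, 5.0])
labels = double_interval_filter(losses, (0.1, 0.3), (1.0, 2.0), rng)
# Very small losses fall below any t_low; very large losses exceed any t_high;
# only losses inside the gap depend on the randomly drawn thresholds.
```

Using two intervals rather than one means neither the keep boundary nor the remove boundary rests on a single manually chosen cut-off.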
2 FILTERED WEIGHTED
CORRECTION TRAINING
METHOD
The processing flow of weighted correction with filtered data (WCF) consists of two main parts: a double-interval noise-label filtering algorithm, whose purpose is to process the original data set, and a weighted correction training method, whose purpose is to apply the filtered data to the correction training of the model.
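The weight-assignment step of the correction pass might look like the following sketch; the boost factor, the 0.5 recall cut-off for identifying "weak" classes, and all names are illustrative assumptions rather than the paper's exact scheme:

```python
import numpy as np

def correction_weights(y_true, y_pred, boost=2.0):
    """Sketch of assigning sample weights for a secondary (correction)
    training pass.

    - Samples the first-pass model already classifies correctly receive a
      higher weight (they are more likely to carry clean labels).
    - Classes with low first-pass recall ("weak" classes) have all their
      samples up-weighted so the second pass pays more attention to them.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    w = np.ones(len(y_true))
    w[y_pred == y_true] *= boost            # up-weight correctly classified samples
    for c in np.unique(y_true):             # per-class recall of the first pass
        mask = y_true == c
        recall = (y_pred[mask] == c).mean()
        if recall < 0.5:                    # treat as a weak class
            w[mask] *= boost
    return w / w.mean()                     # normalize to mean weight 1

# First pass misclassifies most of class 1, so class 1 is weak.
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1]
w = correction_weights(y_true, y_pred)
# The one correctly classified sample of the weak class gets the largest weight.
```

The resulting weights would then be passed to the loss (e.g., as per-sample weights in a weighted cross-entropy) during the secondary training described above.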