rate, and so on. Within a one-week time-series
window, point anomalies and periodic anomalies of
each indicator at a given moment affect the fault
judgment for a single network cell. At the same
time, different sets of network cells need to be
compared with one another to detect collections that
behave differently from their sibling network cell
collections.
Figure 1: Three kinds of 4G-LTE wireless network cell
anomalies: anomalous outliers, anomalous cycles, and
anomalous collections.
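To make the first two anomaly types concrete, the following is a minimal sketch for a single indicator, not the method proposed in this paper. It assumes a toy "week" of seven cycles with four samples each; the function names, the robust median/MAD scores, and the thresholds `k` and `tol` are all illustrative assumptions. Anomalous collections are not covered, since they require comparing sets of cells.

```python
from statistics import median

def point_anomalies(series, k=5.0):
    """Indices whose value lies more than k MADs from the window median."""
    med = median(series)
    # Median absolute deviation; the tiny fallback avoids division by zero.
    mad = median(abs(x - med) for x in series) or 1e-9
    return [i for i, x in enumerate(series) if abs(x - med) / mad > k]

def periodic_anomalies(series, period, tol):
    """Indices of cycles whose shape deviates from the median cycle."""
    cycles = [series[i:i + period]
              for i in range(0, len(series) - period + 1, period)]
    # Typical cycle: per-position median over all cycles in the window.
    profile = [median(c[j] for c in cycles) for j in range(period)]
    flagged = []
    for idx, cycle in enumerate(cycles):
        # Median (not mean) deviation, so one spike does not flag a cycle.
        dev = median(abs(a - b) for a, b in zip(cycle, profile))
        if dev > tol:
            flagged.append(idx)
    return flagged

# Hypothetical data: 7 "days" of 4 samples with a fixed daily shape.
normal_day = [10, 20, 30, 40]
week = normal_day * 7
week[12:16] = [25, 25, 25, 25]  # day 3 loses its daily shape
week[22] = 200                  # isolated spike on day 5

spikes = point_anomalies(week)                       # sample indices
odd_days = periodic_anomalies(week, period=4, tol=8.0)  # cycle indices
```

In this toy window the flattened day contains no extreme value, so only the cycle-level check reports it, while the spike is reported only by the point-level check.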
2 RELATED WORK
In wireless network cell anomaly detection, existing
single-dimensional anomaly diagnosis algorithms,
whether traditional machine learning methods such as
logistic regression (Kleinbaum et al., 2002) or deep
learning methods such as TCN (Bai et al., 2018),
first predict the indicator value at a future moment
and then apply a threshold to the difference between
the predicted and the real data to decide whether
the value is abnormal. This method has some
limitations. On the one hand, it can only judge
whether the value of a single indicator is abnormal.
To determine from single indicators whether the
network cell itself is abnormal, it must additionally
rely on voting between the indicators or on other
manually formulated combination rules. On the other
hand, this method can only detect point anomalies and
some periodic anomalies, and cannot compare one
wireless network cell data set with its sibling sets.
Therefore, single-indicator anomaly detection
algorithms are not suitable for the scenario in this
paper. The problem addressed here needs to be modeled
by combining statistical feature extraction with a
multidimensional anomaly diagnosis algorithm.
Statistical feature extraction mainly
includes the construction of time series features and
set features. Multidimensional anomaly diagnosis
algorithms include supervised algorithms that use
labeled data, such as SVM (George and Vidyapeetham,
2012) and ANN (Pradhan et al., 2012), and
unsupervised algorithms that use unlabeled data, such
as k-Means (Wazid and Das, 2016). Generally, the
results of supervised algorithms are more reliable
and accurate than those of unsupervised algorithms.
However, because abnormal data are far scarcer than
normal data, a large amount of data is required to
train an effective supervised model, which makes
labeling costly. Therefore, supervised anomaly
detection algorithms are not well suited to
large-scale multidimensional anomaly detection
scenarios.
Although unsupervised anomaly detection algorithms
do not require labeled data and are better suited to
massive data scenarios, multidimensional
unsupervised algorithms cannot select useful
features by themselves, and the useless features
mixed into the input reduce the accuracy of
unsupervised models. This paper designs a method
of coupling supervised and unsupervised algorithms
for training. We have obtained a small amount of
annotated 4G-LTE wireless network cell data. These
data come from multiple operation and maintenance
engineers, but we found that different engineers
interpret the same data differently. They rely on
their own operation and maintenance experience, and
it is difficult to unify their opinions. Therefore,
we believe that these annotated data contain not only
reliable abnormal labels but possibly also normal
data mislabeled as abnormal (false alarms), which
makes them low-quality annotations. Even if a model
with high accuracy on these data is obtained through
supervised training, its generalization performance
on a large number of samples will not be good. We
first analyze these low-quality annotated data to
find useful features, and then use these features to
train unsupervised algorithms. The anomaly detection
ability of the unsupervised model is improved
through this coupled training of the unsupervised
algorithm and the supervised algorithm.
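The coupling idea described above (use a small, noisy labeled set only to pick features, then run an unsupervised detector on those features) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: the standardized mean-difference score, the k-th-nearest-neighbour distance used as the unsupervised score, the function names, and all data values are hypothetical.

```python
from statistics import mean, pstdev

def feature_scores(rows, labels):
    """Score each feature by how far apart the labeled classes are."""
    scores = []
    for j in range(len(rows[0])):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        spread = pstdev(pos + neg) or 1e-9  # avoid division by zero
        scores.append(abs(mean(pos) - mean(neg)) / spread)
    return scores

def select_features(rows, labels, top_k):
    """Supervised step: keep the top_k highest-scoring feature indices."""
    scores = feature_scores(rows, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:top_k])

def knn_scores(rows, features, k=2):
    """Unsupervised step: distance to the k-th nearest neighbour,
    computed only on the selected features (larger = more anomalous)."""
    pts = [[r[j] for j in features] for r in rows]
    out = []
    for i, p in enumerate(pts):
        dists = sorted(
            sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
            for m, q in enumerate(pts) if m != i
        )
        out.append(dists[k - 1])
    return out

# Hypothetical small annotated set: feature 0 is informative, feature 1
# is noise, and the last row is a false alarm (normal labeled abnormal).
labeled = [[1.0, 5.0], [1.2, 9.0], [0.9, 1.0],
           [5.0, 4.0], [5.5, 8.0], [1.1, 2.0]]
labels = [0, 0, 0, 1, 1, 1]

# Hypothetical large unlabeled set (tiny here); row 4 is abnormal in
# feature 0 only.
unlabeled = [[1.0, 0.0], [1.1, 10.0], [0.9, 20.0],
             [1.2, 30.0], [6.0, 11.0], [1.0, 40.0]]

selected = select_features(labeled, labels, top_k=1)
coupled = knn_scores(unlabeled, selected)        # selected features only
uncoupled = knn_scores(unlabeled, [0, 1])        # all features
```

On this toy data the noisy feature dominates the distances when all features are used, so the highest score goes to a normal row; restricting the detector to the selected feature moves the highest score to the abnormal row, even though one of the labels is wrong.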
General anomaly diagnosis algorithms, such as
anomaly detection based on density measures and KNN
(Angiulli and Pizzuti, 2002), neural-network-based
autoencoders (Aggarwal, 2015), anomaly detection
based on projected distance and PCA (Shyu et al.,
2003), Isolation Forest (Aryal et al., 2014),
One-Class SVM (Wang et al., 2004), and KDE (Kim and
Scott, 2012), cannot simultaneously find abnormal
outliers, abnormal cycles, and abnormal collections.
After comparing various algorithms, we selected the
four best-performing algorithms for analysis and
subsequent experiments. As shown in Figure 2, KNN and
Research on Optimization of 4G-LTE Wireless Network Cells Anomaly Diagnosis Algorithm based on Multidimensional Time Series Data