When k = 3, the algorithm produces three clusters, but the cluster shapes and the training data set suggest four cluster centers. They correspond to standing, slow walking, jogging, and sprinting, together with a small amount of chaotic posture data.
The low density of the red data group and the separation of its cluster centers indicate that the most distant points in the group have low similarity and that two or more data groups are mixed together. With k = 4, a new cluster center is split off from the red data group, recovering more characteristic information from the data set.
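As a concrete illustration of this cut, the sketch below builds an average-linkage merge tree and extracts k = 3 and k = 4 flat clusterings; the feature array is a random placeholder, since the paper's skeletal feature extraction is not shown in this excerpt.

# Minimal average-linkage sketch; `features` is a hypothetical
# (n_samples, n_features) array standing in for skeletal-point descriptors.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 6))                # placeholder skeletal features

Z = linkage(features, method="average")             # average-linkage merge tree

labels_k3 = fcluster(Z, t=3, criterion="maxclust")  # k = 3: mixed red group
labels_k4 = fcluster(Z, t=4, criterion="maxclust")  # k = 4: splits the mixed group
print(np.bincount(labels_k3)[1:], np.bincount(labels_k4)[1:])

Raising t from 3 to 4 corresponds to the split discussed above: the least compact cluster is the first to be divided.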
The clustering results are largely consistent with the training posture information. The green cluster center corresponds to the standing posture, blue to slow walking, red to jogging, and black to sprinting. The cluster centers for standing, slow walking, and sprinting are relatively concentrated, and the skeletal-point calculation is unambiguous. The jogging posture, however, varies greatly, and its skeletal points resemble those of standing, slow walking, and sprinting, so its cluster is scattered. Because this method is combined with multi-label training, the scatter has little effect on recognition efficiency. The scattered black points below the black data group are the small amount of chaotic posture in the data set; they are difficult to assign under a specific k value, so the black cluster group is kept mainly to test the clustering recognition rate.
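One concrete way to compute such a clustering recognition rate is cluster purity: assign each cluster its majority posture label and count the fraction of samples that agree. The helper below is a hypothetical illustration; the paper's own metric is not specified in this excerpt.

import numpy as np

def recognition_rate(cluster_ids, true_labels):
    """Majority-vote accuracy of a clustering against posture labels."""
    correct = 0
    for c in np.unique(cluster_ids):
        members = true_labels[cluster_ids == c]
        _, counts = np.unique(members, return_counts=True)
        correct += counts.max()                     # most frequent label wins
    return correct / len(true_labels)

# toy check: one chaotic sample in the "sprint" cluster lowers the rate
ids   = np.array([0, 0, 1, 1, 2, 2, 2])
truth = np.array(["stand", "stand", "walk", "walk", "sprint", "sprint", "chaos"])
print(recognition_rate(ids, truth))                 # 6/7, about 0.857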
Figure 16: Hierarchical clustering, average linkage (k = 4).
The training efficiency of this method can be verified on samples from the data set. Testing the local optimum and the data centroids of the data set, however, requires the K-means clustering algorithm. Hierarchical clustering processes data by similarity distance, whereas the K-means algorithm computes a centroid from the locations of the data points, and that centroid is not necessarily an actual data point. Speed and efficiency are the great advantages of K-means. Here the optimized Q-C-Kmeans algorithm (Computer Engineering and Applications, 2021) is used, which strengthens the coupling between related attributes of data points through second-power (squaring) processing; this suits the test of low similarity between different skeletal points. Even under fuzzy initial centers and center deviation, the algorithm can still optimize the internal cluster structure, improve the accuracy of data-group classification, and yield a preliminary clustering of the data.
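The excerpt describes Q-C-Kmeans only as adding second-power processing to strengthen attribute coupling, so the sketch below is a hedged stand-in: it squares and standardizes the features before an ordinary K-means run. This preprocessing is an assumption drawn from that phrase, not the published Q-C-Kmeans update rule.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 6))                       # placeholder skeletal features

# assumed "second power" step: element-wise square, then standardize,
# amplifying differences between weakly similar attributes
X2 = StandardScaler().fit_transform(np.square(X))

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X2)
print(km.cluster_centers_.shape, round(km.inertia_, 2))
# note: the fitted centroids need not coincide with any actual data point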
For the non-independent, identically distributed data points, the k value for the standing, walking, and running posture labels is tested on a sample group. The data points are divided into k = 3 groups with fuzzy initial centroids; the points cluster themselves through distance calculations, repeatedly screening and recomputing new centroids until convergence, and finally the expected optimal centers of the clustering are obtained. After several iterations, the result of the fuzzy-initial-center test is obtained (Fig. 4). The clustering results show that the centers of the "standing" and "walking" posture data are concentrated, while the "running" centers are relatively diffuse. The main reason is that the "running" posture covers several sub-postures, such as jogging and sprinting, and the gaps between the limb data computed from the skeletal points are relatively large, so the center is relatively fuzzy.
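The contrast between concentrated and diffuse centers can be quantified as the mean point-to-centroid distance per cluster; a diffuse "running" cluster shows a clearly larger value. The toy data below are illustrative only.

import numpy as np

def cluster_dispersion(X, labels, centers):
    """Mean point-to-centroid distance per cluster (lower = tighter)."""
    return {int(c): float(np.linalg.norm(X[labels == c] - centers[c], axis=1).mean())
            for c in np.unique(labels)}

# toy data: cluster 0 is tight (like "standing"), cluster 1 is spread (like "running")
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 0.1, size=(50, 2)),
               rng.normal(3.0, 1.0, size=(50, 2))])
labels = np.repeat([0, 1], 50)
centers = np.array([X[labels == c].mean(axis=0) for c in (0, 1)])
print(cluster_dispersion(X, labels, centers))       # cluster 1 is roughly 10x wider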
The test result for the fuzzy initial centers is broadly consistent with the data set. Building on this result, the center deviation of the "running" posture is tested to examine the particle convergence of the "running" posture data. The centroid is obtained from distance calculations on the outliers, and the points are then iteratively assigned to the "running" cluster group. The results show that the outlier centers are evenly divided and that the segmentation line is generated iteratively in the centroid convergence region. The computed centroid is not necessarily an actual data point, and the overall trend is good, which shows that the center deviation of the data set trained by this method is small.
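Since a K-means centroid need not be an actual data point, one measurable form of this center deviation is the distance from each fitted centroid to its nearest sample; consistently small values mean the trained centers sit close to real data. The check below is a sketch under that assumption, not the paper's exact procedure.

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 6))                       # placeholder "running" features

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

for c, center in enumerate(km.cluster_centers_):
    nearest = np.linalg.norm(X - center, axis=1).min()
    print(f"cluster {c}: centroid-to-nearest-sample = {nearest:.3f}")
# small distances across clusters indicate small center deviation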