2. The rate of convergence is slow. The gradient
descent method used by the BP neural network is
inefficient. In this experiment, with a learning
rate of 0.01, the number of iterations generally
needs to be set to more than 1500 to guarantee
accuracy. However, if the learning rate is raised to
reduce the number of iterations, the experimental
results become poor, mainly because the larger
steps can miss the global optimum. Slow convergence
makes the computation time-consuming. During
cross-validation, each hyperparameter setting takes
about 2 minutes and 32 seconds to evaluate when the
number of iterations is set to 1500, which
significantly slows down the research progress.
3. The number of hidden layers is limited.
Considering the computational cost of a more
complex neural network, the BP neural network used
in this research has only one hidden layer. Adding
more hidden layers can significantly improve the
model's predictive ability, but the computation
time will also increase significantly. In future
research, if equipment and time allow, more hidden
layers can be added to improve model performance
and accuracy.
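The trade-off between learning rate and iteration
count described in item 2 can be illustrated with a
minimal sketch. It is not the network used in this
study: the objective is a toy one-dimensional
quadratic, and the function name, tolerance, and
divergence threshold are assumptions chosen only for
illustration.

```python
def iterations_to_converge(learning_rate, tol=1e-6, max_iter=10_000):
    """Run plain gradient descent on f(w) = (w - 3)^2, starting from w = 0.

    Returns the number of updates performed before the gradient magnitude
    falls below `tol`, or None if the run diverges or hits `max_iter`.
    """
    w = 0.0
    for step in range(max_iter):
        grad = 2.0 * (w - 3.0)          # derivative of (w - 3)^2
        if abs(grad) < tol:
            return step                  # converged
        if abs(grad) > 1e12:
            return None                  # step size too large: diverged
        w -= learning_rate * grad        # gradient descent update
    return None


slow = iterations_to_converge(0.01)      # small steps, many iterations
fast = iterations_to_converge(0.1)       # larger steps, fewer iterations
unstable = iterations_to_converge(1.1)   # steps overshoot the minimum
print(slow, fast, unstable)
```

On this toy problem a small learning rate needs many
more updates than a moderate one, while an overly
large learning rate overshoots the minimum and never
settles. A real BP network has a non-convex loss, so
the behaviour there is less clear-cut.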
Because of the format of the original data set, this
experiment could not obtain more precise position
information for each football player and can only
predict which of three positions (attacker,
midfielder, defender) the player is suited to.
However, as football has evolved, the division of
positions on the field has become more detailed, and
there are already eleven different sub-positions
among attackers, for instance, shadow strikers.
Therefore, the prediction model of this study is
better suited to roughly judging which position a
football player is eligible to play and to assisting
players in developing training plans; it cannot
perform more detailed player classification.
The results show that the model's overall ability
to predict which position a player is suited to is
high, reaching 77%. One reason is that an ANOVA test
was conducted before the indicators were imported
into the machine learning model, so the retained
indicators differ significantly between positions.
Another is that the BP neural network has strong
predictive ability: the weights and thresholds are
continuously updated through the gradient descent
method so that the accuracy curve converges and
stabilizes at the desired value. However, the
model's prediction ability diverges considerably
across positions. For example, although the
prediction precision for defenders and midfielders
is as high as 77% and 90%, the precision for
attackers is only 40%. This is mainly due to the
varying data size between positions. In the testing
set, midfielders and defenders accounted for 60% and
30% of players, respectively, but attackers
accounted for only 10%. This imbalance in
proportions leads to larger differences in the final
training results.
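The gap between overall accuracy and per-position
precision can be made concrete with a small sketch.
The labels and predictions below are made up for
illustration and only mirror the 60% / 30% / 10%
class proportions mentioned above; they are not the
study's data, and the helper name is an assumption.

```python
from collections import Counter


def per_class_precision(y_true, y_pred):
    """Precision per label: correct predictions / all predictions of that label."""
    predicted = Counter(y_pred)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / predicted[label] for label in predicted}


# Hypothetical test set: 6 midfielders (MF), 3 defenders (DF), 1 attacker (FW).
y_true = ["MF"] * 6 + ["DF"] * 3 + ["FW"]
# Hypothetical model output: one MF is labelled FW, one DF is labelled MF.
y_pred = ["MF", "MF", "MF", "MF", "MF", "FW", "DF", "DF", "MF", "FW"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision = per_class_precision(y_true, y_pred)
print(accuracy, precision)
```

In this made-up example the overall accuracy stays
high, yet a single misclassified midfielder is
enough to halve the precision of the attacker class,
because so few attackers are present. Small classes
are therefore far more sensitive to individual
errors than large ones.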
Moreover, the overall data size is also an
important issue. Although the original data in this
study cover the five major football leagues, namely
the Premier League (England), La Liga (Spain),
Bundesliga (Germany), Serie A (Italy), and Ligue 1
(France), each record is a single football player
with the corresponding indicators, so the overall
data size is not large. The total number of football
players in the five major leagues is 3603, and
because some players lack relevant indicators, the
number of football players finally brought into the
model is only 891. This data size is relatively
small for a machine learning model.
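The reduction from 3603 to 891 players corresponds
to keeping roughly a quarter of the records. A
minimal sketch of this kind of filtering is shown
below. The indicator names and the three records are
hypothetical, and the exact rule used in the study
for discarding players is not stated, so dropping
any player with a missing indicator is an
assumption.

```python
# Hypothetical indicator names; the study's actual field names may differ.
REQUIRED = ("shot_accuracy", "pass_accuracy", "air_duel_accuracy")


def drop_incomplete(players, required=REQUIRED):
    """Keep only players that have a value for every required indicator."""
    return [p for p in players
            if all(p.get(key) is not None for key in required)]


# Made-up records for illustration only.
players = [
    {"name": "A", "shot_accuracy": 0.41, "pass_accuracy": 0.83,
     "air_duel_accuracy": 0.52},
    {"name": "B", "shot_accuracy": None, "pass_accuracy": 0.88,
     "air_duel_accuracy": 0.47},                      # indicator is empty
    {"name": "C", "pass_accuracy": 0.79,
     "air_duel_accuracy": 0.60},                      # indicator is absent
]
complete = drop_incomplete(players)

# Share of players retained, using the counts reported in the text.
retained_share = 891 / 3603
print(len(complete), round(retained_share, 3))
```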
The analysis of the individual indicators shows that
attackers have the highest shooting accuracy,
followed by midfielders and defenders. Defenders
outperform the other positions on the remaining
indicators, including the accuracy of the simple
pass, the accuracy of glb, the accuracy of the
defending duel, the accuracy of the attacking duel,
and the accuracy of the air duel, followed by
midfielders and attackers. Because attackers must
frequently participate in the team's offense and
create shooting opportunities, this forward position
demands better shooting skills than the other
positions. The accuracy of glb, the accuracy of the
defending duel, and the accuracy of the attacking
duel all reflect a player's defensive ability. Apart
from the goalkeeper, the defender is the last line
of defense against the opponents, so defenders must
have a higher ability for dueling and interception
("Luiz Adriano", 2013).
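Whether an indicator such as shooting accuracy
really differs between the three positions is the
kind of question the ANOVA test mentioned earlier
addresses. The sketch below computes the one-way
ANOVA F statistic by hand. The sample values are
invented for illustration and are not the study's
data, and the function name is an assumption.

```python
def one_way_anova_f(groups):
    """One-way ANOVA F statistic for a list of samples (one list per group)."""
    k = len(groups)                                   # number of groups
    n = sum(len(g) for g in groups)                   # total observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Invented shooting-accuracy samples for three positions.
attackers = [0.5, 0.6, 0.7]
midfielders = [0.2, 0.3, 0.4]
defenders = [0.1, 0.2, 0.3]

f_stat = one_way_anova_f([attackers, midfielders, defenders])
print(f_stat)
```

A large F value means the differences between the
position means are large relative to the spread
within each position. Turning F into a p-value
additionally requires the F distribution with
(k - 1, n - k) degrees of freedom, which is omitted
here.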
5 CONCLUSION
When the learning rate is set to 0.125 and the number
of hidden layer neurons is 6, the model's accuracy
converges once the number of iterations reaches 300
or more, and the final accuracy reaches 77%. The
prediction accuracy for midfielders and defenders
was 77% and 90.0%, but the prediction accuracy for
attackers was only 40%. This shows that the model
has high overall accuracy in predicting which
position a player is suited to, but its predictive
ability differs considerably between positions. The
prediction precision for attackers is low,
which may be due to the small data size. Future