or strategy (Cao, 2019), while paying less attention to
how players’ passing ability impacts the whole game.
Passing is an important part of a player’s ability, yet
it is difficult to evaluate through short passes, long
passes, or any other single indicator alone.
Therefore, the purpose of this paper is to build a
passing evaluation system by calculating indicators of
passing networks and to use a gradient boosting decision
tree to find the importance of these indicators,
providing a reference for evaluating players’ value.
Since the end of the last century, many scholars
have focused on the core passing indicators of players:
how they influence the game and how to quantify them.
Clemente et al. argued that social network indicators
reflect the characteristics of a player’s role to some
extent, finding that side defenders and central
defenders lead the attack in most cases (Clemente,
2014). Grund et al. used algorithms such as the
Hierarchical Linear Model and Poisson regression to
analyze the tactical characteristics of the passing
network based on 76 indicators, such as centrality and
running distance, concluding that teams whose passing
networks have high density and low centrality are
more likely to win (Thomas, 2015).
The studies above have shown that social network
indicators reflect the tactical characteristics of players
to a certain degree. Many previous studies can
therefore serve as references for how to use social
network indicators as input to an algorithm to obtain
better prediction results.
Power et al. decomposed whether a player scores a
goal into two parts, the probability of the pass and
the chance of scoring, and designed a formula
combining the two to obtain a total score for each
player, describing players’ behavior through
supervised learning (Power et al., 2017).
Brooks et al. put forward the concept of “pass shot
value”, using a feature matrix to quantify the probability
of passing and the weight of each feature to determine
feature importance, and then making predictions with a
support vector machine (Brooks et al., 2016).
It can be seen that evaluating a player’s ability
through passes and goals is already a relatively common
research method. At the same time, many scholars
have focused on establishing an evaluation
system, that is, building relationships between
features through algorithms and calculating players’
scores from those features, thereby creating a
quantitative criterion for evaluating players or teams.
Based on data from the 2014/15 to 2017/18 seasons of
the Premier League and Bundesliga, Bransen et al.
split play into a series of actions and calculated the
impact of each action on the whole game, which serves
as a new method for evaluating a player’s value (M et
al., 2019).
In 2019, Pappalardo et al. proposed using support
vector machines to calculate the feature weights of
each player and extended the approach to teams,
thereby creating a comprehensive ranking system
(Pappalardo et al., 2019). Yuesen Li et al. proposed
using a linear support vector machine to calculate the
weights of features. Their results show that the
calculated player ranking is consistent with the actual
ranking in the Chinese Super League (Li et al., 2020).
2 METHODS
2.1 Data Source
Provided by Wyscout and openly shared on figshare, the
football-logs dataset consists of 1,941 matches,
3,251,294 events, and 4,299 players from 7 prominent
competitions around the world: La Liga (Spain,
2017/18), Premier League (England, 2017/18), Serie
A (Italy, 2017/18), Bundesliga (Germany, 2017/18),
Ligue 1 (France, 2017/18), FIFA World Cup 2018
and UEFA Euro Cup 2016. As there are significant
differences among leagues, this paper
focuses on data from the Premier League (England,
2017/18) (Pappalardo, 2019).
The dataset was collected and labeled manually by
professional performance analysts and records every
action of every player during each game. It is
divided into 5 documents in JSON format:
Competitions, Matches, Events, Players and Teams.
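As a minimal sketch of how pass events might be extracted from the Events document, the snippet below parses a JSON array of event records and keeps only the passes. The field names (eventName, playerId, teamId, matchId) and the label "Pass" are assumptions made for illustration and should be checked against the actual files; the sample records are invented, not taken from the dataset.

```python
import json


def load_passes(events_json, pass_label="Pass"):
    """Parse a JSON array of event records and keep only pass events.

    The key "eventName" and the label "Pass" are assumptions about
    the Events document, not confirmed by the paper.
    """
    events = json.loads(events_json)
    return [e for e in events if e.get("eventName") == pass_label]


# Invented sample records standing in for the Events document.
sample = json.dumps([
    {"eventName": "Pass", "playerId": 1, "teamId": 10, "matchId": 100},
    {"eventName": "Shot", "playerId": 2, "teamId": 10, "matchId": 100},
    {"eventName": "Pass", "playerId": 2, "teamId": 10, "matchId": 100},
])

passes = load_passes(sample)
print(len(passes))
```

In practice the JSON text would be read from the Events file for the Premier League rather than from an in-memory string.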
2.2 Passing Network Analysis
The data of every game can be represented as an adjacency
matrix W that describes the passing between the players
of each side, where every player serves as a vertex and
every pass serves as an edge. The width of an edge
represents the frequency of passes, and analyzing the
whole network reveals the connections between players
and the tactical features of each team.
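The construction of W can be sketched as follows, assuming each pass is already available as a (passer, receiver) pair. The function name passing_matrix, the player labels, and the toy pass list are hypothetical; how receivers are identified in the real data is not specified here.

```python
from collections import Counter


def passing_matrix(passes, players):
    """Build a weighted adjacency matrix W for one team.

    W[i][j] is the number of passes from players[i] to players[j],
    i.e. the edge weight (drawn as edge width) in the network.
    """
    index = {p: i for i, p in enumerate(players)}
    counts = Counter(passes)  # (passer, receiver) -> frequency
    n = len(players)
    W = [[0] * n for _ in range(n)]
    for (src, dst), k in counts.items():
        W[index[src]][index[dst]] = k
    return W


# Toy example with three hypothetical players.
players = ["A", "B", "C"]
passes = [("A", "B"), ("A", "B"), ("B", "C"), ("C", "A")]
W = passing_matrix(passes, players)
for row in W:
    print(row)
```

Because passes are directed, W is generally not symmetric: row i describes the passes player i makes, and column i the passes player i receives.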
To comprehensively evaluate the passing ability
of players, this paper selects the following ten
indicators as features of the machine learning model and
calculates them with NetworkX in Python 3.6. The
calculation method of each indicator is as follows
(Silva F G M, et al., 2018).
(1) Degree centrality: the sum of the direct
connections between a point and other points. It can