algorithm is able to classify a player’s action as
either a backhand or a forehand stroke with high
precision and recall. The authors of (McQueen et al.,
2014) exploit players’ tracking data to recognize of-
fensive strategies in basketball through a linear SVM
classifier and a rule-based algorithm. In (Richly et al.,
2016), three machine learning approaches, namely
SVM, KNN, and RF, are evaluated for the purpose
of classifying events in a soccer match, such as
passes or receptions. The dataset used therein refers
to matches of the German Bundesliga, and contains
the timestamp, the two-dimensional coordinates of
the ball, a list of game events (e.g., fouls,
substitutions, offsides), and the players involved.
Events are classified using several features
computed from the raw position data of the ball.
To train the classifiers, the dataset
is annotated by manually identifying the events of in-
terest in the footage of three matches.
By building upon the works found in the literature,
this paper presents the design and evaluation of an im-
proved technique for the automatic classification of
sport events from spatio-temporal data. In particular,
given the promising results reported in (Richly et al.,
2016), this paper takes the methodology developed in
that work as a reference and extends it to target a
different sport, i.e., basketball. After testing the
same algorithms and the same set of features used in
the reference work on a
dataset containing position data from National Bas-
ketball Association (NBA) matches, this paper addi-
tionally proposes a new set of features, which proved
to significantly boost the performance of basketball
event recognition and classification. Finally, the
paper explores how automatically extracted information
can be used to support the work of both coaches and
players by enhancing the functionality of an existing
VR-based tool for tactics analysis.
3 METHODOLOGY
This section describes the dataset as well as the fea-
tures that have been developed/used in this paper.
3.1 Dataset
The original dataset refers to the 2015–16 season
of the NBA
(https://github.com/sealneaward/nba-movement-data/tree/master/data),
and contains spatio-temporal data collected at 20 Hz. Data are
structured in matches and actions (for a given match).
For each action, the position of the ball and of the
players is recorded. The dataset, stored as a .csv file,
consists of the following values:
• team_id: identifier of the team to which the player
belongs, −1 if the tracked object is the ball;
• player_id: identifier of the tracked object, −1 if the
tracked object is the ball;
• x_loc, y_loc, z_loc: 3D spatial position of the tracked
object (the z coordinate is provided only for the
ball);
• game_clock: remaining time of the match;
• shot_clock: remaining time of the 24 seconds
granted to a team to finalize an offensive action;
• quarter: quarter of the game;
• game_id: identifier of the match;
• event_id: identifier of the action in the game.
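As a rough illustration, the records listed above could be loaded and grouped by action as follows. This is a minimal sketch, assuming that the .csv file has a header row with the column names team_id, player_id, x_loc, y_loc, z_loc, game_clock, shot_clock, quarter, game_id and event_id, and that z_loc is left empty for players. The TrackingRecord class and the load_records and group_by_action helpers are hypothetical names introduced here, and the sample rows are made up for the example.

```python
import csv
import io
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

BALL_ID = -1  # value of team_id and player_id when the tracked object is the ball


@dataclass(frozen=True)
class TrackingRecord:
    team_id: int
    player_id: int
    x_loc: float
    y_loc: float
    z_loc: Optional[float]  # provided only for the ball
    game_clock: float
    shot_clock: float
    quarter: int
    game_id: str
    event_id: int

    @property
    def is_ball(self) -> bool:
        return self.player_id == BALL_ID


def load_records(handle):
    """Parse the rows of an open CSV file object into TrackingRecord instances."""
    records = []
    for row in csv.DictReader(handle):
        records.append(TrackingRecord(
            team_id=int(row["team_id"]),
            player_id=int(row["player_id"]),
            x_loc=float(row["x_loc"]),
            y_loc=float(row["y_loc"]),
            # Assumption: an empty z_loc field means "not available".
            z_loc=float(row["z_loc"]) if row["z_loc"] else None,
            game_clock=float(row["game_clock"]),
            shot_clock=float(row["shot_clock"]),
            quarter=int(row["quarter"]),
            game_id=row["game_id"],
            event_id=int(row["event_id"]),
        ))
    return records


def group_by_action(records):
    """Group records by (game_id, event_id), i.e., by action within a match."""
    actions = defaultdict(list)
    for record in records:
        actions[(record.game_id, record.event_id)].append(record)
    return dict(actions)


# Made-up sample rows, used only to exercise the helpers.
SAMPLE_CSV = """team_id,player_id,x_loc,y_loc,z_loc,game_clock,shot_clock,quarter,game_id,event_id
-1,-1,47.5,25.0,6.2,720.0,24.0,1,G1,1
10,201,30.0,12.5,,720.0,24.0,1,G1,1
-1,-1,48.0,25.5,6.0,719.95,23.95,1,G1,1
-1,-1,80.0,10.0,3.1,700.0,20.0,1,G1,2
"""

records = load_records(io.StringIO(SAMPLE_CSV))
actions = group_by_action(records)
ball_track = [r for r in actions[("G1", 1)] if r.is_ball]
print(len(records), len(actions), len(ball_track))
```

Keeping game_id as a string is a deliberate choice in this sketch, so that identifiers with leading zeros would not be altered.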
The coordinate system used for x_loc and y_loc is
normalized to the ranges 0–100 (x axis) and 0–50
(y axis); the bottom-left corner is the point with
coordinates (0, 0). To create
the annotated dataset, sports events were manually
identified in the footage of the San Antonio Spurs vs
Minnesota Timberwolves match that was played on
December 23rd, 2015. As in the reference work,
passes and receptions were considered. Other events,
such as shots and dribbles, were marked with the label
“other”. Some of the events in the latter category
were randomly deleted in order to balance the
frequency of the three classes. At the end of the
process, the annotated dataset included 180 entries per
event category.
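The balancing step described above amounts to randomly undersampling one class. The following is only a sketch under stated assumptions: the undersample_label helper, the (label, frame index) layout of the annotated entries, the fixed seed, and the initial count of 500 “other” entries are hypothetical choices made for the example; only the target of 180 entries per category comes from the text.

```python
import random
from collections import Counter


def undersample_label(entries, label, target, seed=0):
    """Randomly keep at most `target` entries carrying `label`; keep all others.

    `entries` is a list of (label, payload) pairs; the original order is preserved.
    """
    rng = random.Random(seed)
    indices = [i for i, (lab, _) in enumerate(entries) if lab == label]
    if len(indices) <= target:
        return list(entries)
    keep = set(rng.sample(indices, target))
    return [entry for i, entry in enumerate(entries)
            if entry[0] != label or i in keep]


# Toy annotated set; the count of 500 "other" entries is made up.
entries = ([("pass", i) for i in range(180)]
           + [("reception", i) for i in range(180)]
           + [("other", i) for i in range(500)])

balanced = undersample_label(entries, "other", target=180)
counts = Counter(label for label, _ in balanced)
print(counts)
```

Fixing the seed makes the deletion reproducible, which helps when the same balanced set has to be reused across classifiers.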
3.2 Features
According to the reference work, a sport event can
be recognized in a dataset containing spatio-temporal
data by analyzing the values of several features that
characterize it. Features have been extracted by run-
ning a script on the above data. For each time t in
the dataset, a vector is obtained containing a value for
each feature. The features used in this work can be
categorized into five groups. The first group contains
the (“two-dimensional”) features directly derived from
the reference work. The remaining groups contain the
new features introduced in this paper.
In particular, the features in the second group are cal-
culated by considering only the movement of the ball
along the z axis; hence, they are referred to as “ver-
tical”. Features in the third group are those in the
second group, but adapted to a “three-dimensional”
space. For features in the fourth group, the posi-
tion of the players is also considered; thus, they are
Automatic Recognition of Sport Events from Spatio-temporal Data: An Application for Virtual Reality-based Training in Basketball