Recognizing Compound Events in Spatio-Temporal Football Data

Keven Richly, Max Bothe, Tobias Rohloff and Christian Schwarz

Hasso Plattner Institute, University of Potsdam, Prof.-Dr.-Helmert-Str. 2-3, 14482 Potsdam, Germany

Keywords:

Football Analytics, Event Recognition, Spatio-Temporal Football Data, Supervised Machine Learning.

Abstract:

In the world of football, performance analytics about a player’s skill level and the overall tactics of a match are

supportive for the success of a team. These analytics are based on positional data on the one hand and events

about the game on the other hand. The positional data of the ball and players is tracked automatically by cam-

eras or via sensors. However, the events are still captured manually, which is time-consuming and error-prone.

Therefore, this paper introduces an approach to identify compound events by analyzing the positional data of

football matches. We trained and aggregated the machine learning algorithms Support Vector Machine, K-

Nearest Neighbors and Random Forest, based on features, which were calculated on the basis of the positional

data. To validate the feasibility of our approach we evaluated the quality of the results by comparing recall and

precision. We demonstrated that it is possible to detect compound events from spatio-temporal football data.

Nevertheless, the choice of a speciﬁc algorithm has a signiﬁcant impact on the quality of the predicted results.

1 INTRODUCTION

In recent years the use of spatio-temporal data

strongly increased in various areas. Especially in the

highly competitive sport sector new insights gained

by positional information of players – tracked by dif-

ferent systems and methods during a game – can have

a major impact on the training and tactic of a team.

For professional football clubs performance analysis

is an integral part of the coaching process (Carling

et al., 2005). In the context of performance anal-

ysis in football, many analyses are based on manu-

ally tracked and chronological ordered lists of game

events on the one hand or the positional informa-

tion of the players on the other hand (Mackenzie and

Cushion, 2013). For that reason, the signiﬁcance and

accuracy of analysis strongly correlates with the qual-

ity of the provided data. Detecting events manually

is a time-intensive and error-prone task. Based on

the data of matches of the German Bundesliga, we

discovered that the events are not time-synchronized

with the positional information and sometimes associ-

ated with the wrong player, due to the optical sensors

used for data recording.

Therefore, in this paper we present the imple-

mentation and evaluation of algorithms to automati-

cally detect events in the positional data of football

matches. We focused on following major events:

passes, receptions, shots on target and clearances in

this paper, since these ones are basic events, which

have a high probability to occur more often during a

match. We computed different features from the raw

positional data of the ball. Based on these features,

we detected event candidates by using different ma-

chine learning approaches – the Support Vector Ma-

chine (SVM), K-Nearest Neighbors (KNN) and Ran-

dom Forest (RForest) classiﬁcation. In order to train

these supervised learning techniques, we also created

manually a gold standard based on the positional data

and video data of the football matches. Additionally,

we evaluated the three machine learning approaches

by the recall and precision of the results.

The paper is organized in the following structure.

In Section 2 we examine related work. Afterwards,

we explain the properties of the provided data and

introduce the created gold standard. In following

section, we describe how the features are computed

based on the positional data and Section 5 shows how

we used these features to train different classiﬁcation

algorithms. We also provide an evaluation about the

quality of our results (see Section 6). Before we con-

clude the paper in Section 8, we give an overview

about future work.

2 RELATED WORK

Almost all system on the market use either cameras

or sensors to track the movements of athletes dur-

Richly, K., Bothe, M., Rohloff, T. and Schwarz, C.

Recognizing Compound Events in Spatio-Temporal Football Data.

DOI: 10.5220/0005877600270035

In Proceedings of the International Conference on Internet of Things and Big Data (IoTBD 2016), pages 27-35

ISBN: 978-989-758-183-0

ing professional matches or training sessions. In the

area of camera-based systems, various projects fo-

cused on the extraction of spatio-temporal data out

of video recordings (Mackenzie and Cushion, 2013;

Beetz et al., 2009; Barnard et al., 2003). On the ba-

sis of sensor-based data Gal et al. (Gal et al., 2013)

developed a system to detect shots on target. Also

Madsen et al. (Madsen et al., 2013) focused on shots

on target in connection with DEBS 2013 Grand Chal-

lenge (Christopher Mutschler et al., 2013). A disad-

vantage of sensor-based systems like the RedFir sys-

tem (von der Gr

un et al., 2011) is that it is forbidden to

equip players with sensors in matches of the German

Bundesliga. Another approach is to use positional

information to classify the quality of passes (Horton

et al., 2014). However, a precondition for the pre-

sented algorithm is that the exact timestamp and po-

sition of a pass is available in the provided data set.

Schuldhaus et al. (Schuldhaus et al., 2015) presented

a low-cost inertial sensor-based approach to identify

passes and shots on target. They compared the three

machine learning algorithms support vector machine,

classiﬁcation and regression tree, and Naive Bayes

to recognize the compound events. The used sen-

sors provide additionally the values of a triaxial ac-

celerometer and a triaxial gyroscope.

There are also several research projects on other

sports. Kautz et al. (Kautz et al., 2015) used differ-

ent supervised machine learning algorithms to recog-

nize tackles and scrums in Rugby matches. A system

to classify strokes in tennis was developed by Con-

naghan et al. (Connaghan et al., 2011). The algo-

rithm for the detection is based on thresholds, which

is suitable for tennis but due to the highly dynamic

and complex motions of players not applicable for

football matches (Schuldhaus et al., 2015). Jiang and

Yin (Jiang and Yin, 2015) presented an algorithm that

uses deep convolutional neural networks to recognize

events in the data provided by wearable sensors. A

classiﬁcation of human activities based on support

vector machines was presented by Anguita et al. (An-

guita et al., 2012). Peterek et al. (Peterek et al., 2014)

also focused on the detection of human activities, but

they used an approach based on the random forest al-

gorithm.

3 DATA FOUNDATION

As mentioned before, there are various providers of

spatio-temporal data for professional football games.

The quality, granularity, and accuracy of the data vary

between different competitors and also strongly de-

pend on the used tracking technology. The provided

data sets typically consist of the positional informa-

tion of the players and the ball, the manually tracked

list of game events as well as some meta data about

the teams and players. In this paper, we focus on

data of games of the German Bundesliga. Deﬁned by

the pitch size, the range of the two-dimensional co-

ordinates goes from −52.5 to 52.5 for x and the data

range of y goes from −34 to 34 (for pitches of the size

of 105m ∗ 68m). Since the pitch size is not exactly

deﬁned, these numbers can differ for other stadiums.

The center of the pitch has the coordinates (0, 0). The

position values can exceed these limits. This indicates

that the ball went out of bounds. Figure 1 shows a

football pitch and the coordinates of its bounds.

(0|0)

(52.5|34.0)

(52.5|-34.0)

(-52.5|34.0)

(-52.5|-34.0)

Figure 1: Football pitch with dimensions of bounds.

The list of game events includes the timestamp,

event type and involved players. All events are clas-

siﬁed in the categories pass, shot on target, neutral

contact, clearance, duel, foul, offside, caution, and

substitution. Several events, such as fouls, cautions

or substitutions, cannot be detected just by the posi-

tional data of the ball and players. They also depend

on other information, e.g. the signals of the referee.

Additionally, the events are not synchronized with the

positional information. The delay can be up to several

seconds. To evaluate and train the supervised machine

learning algorithms, we created manually a gold stan-

dard based on the video recordings of the games and

by taking into consideration the acceleration values

of the ball. The gold standard includes the following

three game sections:

• Set A Hertha BSC vs. 1. FSV Mainz 05

Season 2014/15, Time: 00:00 - 03:08

• Set B Hertha BSC vs. 1. FSV Mainz 05

Season 2014/15, Time: 25:00 - 31:42

• Set C Hertha BSC vs. Eintracht Braunschweig

Season 2013/14, Time: 70:00 - 73:20

From the selected sections, we excluded the times,

when the ball was out of bounds or the game was

IoTBD 2016 - International Conference on Internet of Things and Big Data

paused. Afterwards we compared the gold standard

with the provided event list. We were able to ﬁnd 121

out of 194 (62.4%) matching events, within a time

period of two seconds and with the same event type

as our event. These events had an average time de-

lay of 0.77 seconds. As a next step we examined

the assigned player for these events. For the matched

events, 18 out of 121 (14.9%) players were assigned

wrong.

Table 1: Tagged events for gold standard.

Set A Set B Set C Total

Pass 49 36 50 135

Reception 17 17 12 46

Clearance 0 5 1 6

Shot on Target 2 3 2 7

Total Events 68 61 65 194

Played Time 3:08 min 6:42 min 3:20 min 13:10 min

Excluded Time 0:58 min 1:49 min 1:36 min 4:23 min

Total Time 2:10 min 4:53 min 1:44 min 8:47 min

4 FEATURE COMPUTATION

Events in football matches are characterized by mul-

tiple features of the tracked objects. These objects

move on the football pitch and inﬂuence each other.

Events occur when one or multiple features show a

speciﬁc value or change at the same time. In this sec-

tion, we present the deﬁnition of the features we im-

plemented. All features are computed based on the

positional data described in the previous section. The

positional data is received per tracked object in a 2-

by-n matrix where n is the number of collected data

points in a time period. Each column vector repre-

sents the position of the object o at time t.

Pos

o,n



o,t

·· · x

o,t

·· · y

o,t



(1)

For computing the features we used the Python

framework Theano (Bergstra et al., 2010). It pro-

vides several functionalities such as transparent use

of the GPU. Theano also offers symbolic differentia-

tion. This allowed us to deﬁne the features in a func-

tional way and defer the execution. Due to the shape

of the positional data, we were able to use convolution

and other matrix operations to compute the features,

which depend on multiple data points efﬁciently.

We used the three types of convolution kernels to

combine adjacent values. The ﬁrst one computes the

difference of two consecutive values in a row vector

(ker

). The second kernel computes the sum of two

consecutive values in a row vector (ker

) and the third

kernel computes the sum of two consecutive values in

a column vector (ker

We can derive the following deﬁnitions from the

received positional data. The position of object o at

time t is deﬁned as p(o, t). Whereas the horizontal

position of object o at time t is p

(o,t) and the verti-

cal position of object o at time t is p

(o,t). The dif-

ference d between two consecutive data points equals

the movement of an object in 10

−1

seconds. This in-

dicates the direction of the object o at time t

d(o, t

) = p(o,t

n+1

) − p(o, t

) (2)

We used ker

to compute the direction of an object.

4.1 Velocity

It is possible to reuse the direction of the object in or-

der to determine the velocity of an object. Due to the

provided data format, we multiply the length of the

direction vector by 10 to retrieve the unit m ∗ s

−1

. For

velocity computation we used ker

. The velocity v of

object o at time t is deﬁned as followed:

v(o,t) = |d(o, t)| ∗ 10 (3)

4.2 Acceleration

With the velocity computed, we can now take the dif-

ference of two consecutive velocity values to get the

acceleration value in m ∗ s

−2

. For computing the ac-

celeration ker

is used. The acceleration a of object o

at time t

is deﬁned as followed:

a(o,t

) =

v(o,t

) − v(o, t

n−1

−t

n−1

(4)

4.3 Acceleration Peaks

Due to the sampling rate of 10 Hz it can occur that the

acceleration of an object is captured in multiple spa-

tial data points. Therefore, we aggregated two con-

secutive values in order to ﬁnd the real acceleration.

The aggregation for maximum and minimum values

has to be done separately. We ignored negative val-

ues for computing maximum peaks and positive peaks

for computing minimum peaks by setting them to 0.

In this way, a not existing acceleration peak is repre-

sented by the value 0. We used ker

to determine ac-

celeration peaks. The maximum and minimum peak

value - a

max

and a

min

- of object o at time t

n+1

are

deﬁned in the following way:

max

(o,t

n+1

) =

∑

x∈{t

n+1

}

max(0, a(o, x)) (5)

min

(o,t

n+1

) =

∑

x∈{t

n+1

}

min(0, a(o, x)) (6)

Recognizing Compound Events in Spatio-Temporal Football Data

Furthermore, we prevented that acceleration peaks

are detected at two consecutive timestamps. An accel-

eration peak is considered as real if there are no higher

acceleration peaks at adjacent timestamps. Therefore,

real acceleration peaks a

real

are deﬁned as followed:

max

real

(o,t

n+1

) =











max

(o,t

n+1

) if a

max

(o,t

n+1

) > a

max

(o,t

)

∧a

max

(o,t

n+1

) > a

max

(o,t

n+2

)

0 else

(7)

min

real

(o,t

n+1

) =











min

(o,t

n+1

) if a

min

(o,t

n+1

) > a

min

(o,t

)

∧a

min

(o,t

n+1

) > a

min

(o,t

n+2

)

0 else

(8)

4.4 Direction Change

While objects move on the football pitch they will

eventually change their direction d. A linear move-

ment results in no signiﬁcant change of the direction

feature, whereas rapid movement tends to have a no-

table change of direction. We computed the change

of direction as visualized in Figure 2.

Figure 2: Direction change of object.

Given the three position data points P

= p(o,t

= p(o, t

) and P

= p(o, t

), the ﬁrst direction vec-

tors are deﬁned as d

= d(o, t

) and d

= d(o, t

). The

angle created by d

and d

is the change of direction

. Possible values for direction changes are in the

range from 0 to 180.

To determine the direction change value, the

arccos function is applied to the quotient of the scalar

product of d

and d

and the product of length of

and d

. The direction change dc of object o at

time t

n+1

is deﬁned in the following way:

dc(o, t

n+1

) = arccos



d(o, t

) ∗ d(o,t

n+1

)

|d(o, t

)| ∗ |d(o,t

n+1



(9)

The direction change is computed by using ker

and ker

as well as the Hadamard product.

4.5 Distance to Target

The players try to score in one of the goals on the

pitch. These goals are considered as possible targets.

While playing the object will move towards one of

the targets. The corresponding target is chosen with

regard to the horizontal movement of the object. This

is independent of the position of the object. If the

object moves to the left side, the left goal is assigned

as target and vice versa. The variable width of the

pitch is deﬁned by wp. The reference point g of a

target is located middle of the goal line and is deﬁned

as followed:

g(o,t) =



sign(d

(o,t)) ∗



(10)

Figure 3 displays different situation and the dis-

tances to the target. The four position data points

= p(o

), P

= p(o

), P

= p(o

) and P

p(o

) and the two targets T



−



and T





. The arrow at each position data point repre-

sents the approximate direction of the respective ob-

ject at the same time. The objects at P

and P

have

a positive horizontal movement (d

(o,t) > 0). There-

fore the corresponding target for these two points is

. The object P

has a negative horizontal movement

(o,t) < 0). Its target is T

. The object at P

has no

horizontal movement (d

(o,t) == 0). This is a special

case where no target can be determined. The distance

to target feature will return in f inity.

Figure 3: Distance for object to target.

In cases where a target can be determined, the dis-

tance to target value is equal the length of the vector

subtraction of the current point of the object and the

position of its target. We used ker

and ker

for the

distance to target computation. The distance to target

value dt for object o at time t is deﬁned in the follow-

ing way:

dt(o, t) = |p(o, t) − g(o,t)| (11)

IoTBD 2016 - International Conference on Internet of Things and Big Data

4.6 Cross on Target Line

As discussed in the previous subsection, the objects

on the pitch are alternately targeting one of two tar-

gets. Beside the distance of the object to the target,

another feature is the proximity of the movement to

the target. We deﬁned this as the distance from the

target to the point where the object will cross the goal

line assumed that the object will maintain its direc-

tional movement.

Figure 4 shows the position P

= p(o,t

) of ob-

ject o and its directional movement d

= d(o,t

). If

the object continues its movement without changing

the direction, it will cross the goal line at C

. The dis-

tance between the target T

and C

is the measurement

for this feature.

ctl

Figure 4: Cross on target line of object.

To compute the cross target line feature, we had to

solve a linear equation (cf. Equation 12). A multiple

of the direction vector is added to the position of the

object until it reaches the goal line at any point. The

vertical difference to the target point is the desired dis-

tance. The cross target line feature ctl for object o at

time t is deﬁned as followed:



(o,t)

ctl(o,t)



= p(o, t) + s ∗ d(o,t) (12)

Repositioned for ctl:

ctl(o,t) = p

(o,t) + d

(o,t)

(o,t) − p

(o,t)

(13)

5 EVENT DETECTION

The most central object of a football match is the ball.

All players try to interact with it. The ball is also the

object that shows the most and highly rapid move-

ments on the pitch. Therefore, we computed all fea-

tures described in Section 4 for the ball. With all fea-

tures we created a vector for every time t containing

all corresponding feature values.

Velocity and acceleration describe the current mo-

mentum. Acceleration peaks were introduced due to

the provided data schema, since they are a strong in-

dicator for interactions with the ball. The direction

change feature covers ball interactions with high in-

tensity (e.g. passes) as well as ball interactions with

little intensity (e.g. ball touches during dribbling).

The distance to target is important to distinguish be-

tween shots on target and clearances and is an indica-

tor for the likelihood of a shot on target in comparison

to a pass. The cross on target line feature represents a

measurement whether a shot will hit the target or not.

Each vector describes an instant of the football match

and consecutive vectors can represent a certain event.

Depending on the type of the event, features become

more or less important and have characteristic values.

A naive approach to classify events would be:

• Pass: The ball has an acceleration peak with a

minimum value and/or shows a signiﬁcant direc-

tion change.

• Reception: The ball shows negative acceleration

peak or direction change. Afterwards the ball

stays close to a speciﬁc player.

• Shot on Target: The ball is accelerated with a

medium to high value and a direction change. A

shot on the target occurs most likely within a short

distance to target and they are aiming for target.

• Clearance: The ball has a high positive accel-

eration peak and direction change. Clearances

mostly happen to prevent risky situations near to

the own goal line. Therefore, they have a high

distance to target. In addition, clearances tend to

change the target direction of the ball.

To determine an exact differentiation between

the events, we selected three supervised classiﬁca-

tion machine learning algorithms based on related

work and common approaches: Support Vector Ma-

chine, K-Nearest Neighbors and Random Forest. We

used the implementations of the Python package

Sklearn (Pedregosa et al., 2011) to deﬁne the follow-

ing output classes: no event, pass, reception, shot on

target and clearance.

5.1 Supervised Learning Algorithms

In the following section, we explain the three super-

vised learning algorithms we used and their conﬁgu-

ration.

The Support Vector Machine (Cortes and Vapnik,

1995) approach tries to divide the data points in a

space into categories based on the provided training

data. Thereby the dividing gap has to be as wide as

possible. SVMs are effective in a high dimensional

Recognizing Compound Events in Spatio-Temporal Football Data

space, which is provided by the values in the vector

of features. We used SVM with a linear kernel. This

results in a linear divider gap. Furthermore, we used

the provided option to determine the weight of each

output class automatically. This prevents an over-

weighting of classes with a high frequency (e.g. no

events). We did not limit the number of iterations.

The K-Nearest Neighbors algorithm (Altman, 1992)

determines the output class by a majority vote of the

closest neighbors of a data point. Our results showed

that a conﬁguration with k = 3 will return the best re-

sults. A Random Forest approach (Breiman, 2001)

consists of multiple decision trees. These decision

trees are a very similar implementation to the naive

event classiﬁcation we present earlier in this section.

The RForest increases the predictive accuracy and

controls over-ﬁtting. In order to add more over-ﬁtting

prevention, we limited the number of decision trees to

10. Each decision tree has a maximal depth of 4. As

describe for the SVM algorithm, we use the provided

option to automatically determine the weight of the

output class.

5.2 Event Candidate Aggregation

The three classiﬁer algorithms return their prediction

for the test data set. In order to increase the accuracy

we allow an aggregation of these results. A customiz-

able weight is assigned to each classiﬁer algorithm

for each event type. In addition every event type has

a minimum score. If the classiﬁer predicted an event,

the weight is added to the score of this event at that

time. As a ﬁnal result only events with a score equal

or greater the minimum score are considered as de-

tected events. This allows us to add more prediction

algorithms to our implementation and integrate their

results, depending on how precise or complete their

results are for certain event types.

6 EVALUATION

In the following section, we show an evaluation of

the three event detection approaches we described in

Section 5. We assessed their quality by precision and

recall, which are deﬁned as followed:

precision =

true positives

true positives + f alse positives

(14)

recall =

true positives

true positives + f alse negatives

(15)

For the detailed evaluation of the results we fo-

cused on the event types to passes and receptions. We

also introduced the overall type of ball touches. Our

assumption was, that all four event types have simi-

lar aspects for their feature characteristics. Therefore,

this event type represents the ability of an algorithm to

distinguish between an event occurrence and an event

absence. The input set is the union of all four initial

event types.

We evaluated different data sets, which are de-

scribed in Section 3. On the one hand, we trained and

tested on the same data set. We learned from 90% of

the data and tested on 10% of the data. We split the

data set randomly 100 times and calculated the arith-

metic mean for the precision and recall of all itera-

tions. The repetition of the random split proceeding

should ensure that our results are statistically compre-

hensible and deviations are mitigated. On the other

hand, we trained the events from one set and tested

it on another. We wanted to examine if the results

change when time has passed during a game or when

it is another game with different players. For this vari-

ant, one iteration was needed, since no random split

was processed.

Figure 5 shows the precision and recall for passes,

receptions and ball touches tested within the same

data sets. The dashed lines inside the diagrams also

indicate the value of the f -measure ( f 1 score), which

is the harmonic mean of precision and recall:

f = 2 ∗

precision ∗ recall

precision + recall

(16)

In Table 2 we compare the algorithms quality by

the arithmetic means of the precision and recall for

the prediction within the data sets. It turns out that

the KNN algorithm offers results with a quite low

quality. The recall never exceeds 10%. The preci-

sion for passes and ball touches is between 32% and

43.8%. Both precision and recall are 0% for recep-

tions. The SVM and RForest results are close to each

other and have a noticeable higher recall than KNN.

The recalls are between 59.9% and 76.7%, except for

RForest receptions where it is 43.6%. The precisions

are between 35.1% and 38.7%. An exception is again

receptions, where the precision is between 5.9% and

8.9%.

The precision and recall for the prediction across

data sets can be seen in Table 3, which is the summa-

rization of Figure 6. The magnitudes and ﬂuctuations

of the results are similar to the prediction within data

sets results, which were explained before. In sum-

mary, the quality is slightly lower than previously.

The overall precision is around 1% lower and the re-

call is 9% lower. This could be caused by the already

mentioned effect, that different players show different

IoTBD 2016 - International Conference on Internet of Things and Big Data

(a) Pass (b) Reception (c) Ball touch

Figure 5: Event recognition within data sets.

(a) Pass (b) Reception (c) Ball touch

Figure 6: Event recognition across data sets.

Table 2: Algorithms quality for prediction within data sets.

Algorithm Event Type Precision Recall F-Measure

KNN

pass 0.320 0.088 0.138

reception 0.000 0.000 0.000

ball touch 0.438 0.095 0.156

SVM

pass 0.387 0.641 0.483

reception 0.059 0.642 0.108

ball touch 0.305 0.716 0.428

RForest

pass 0.354 0.767 0.484

reception 0.089 0.436 0.148

ball touch 0.351 0.595 0.442

skills and players get exhausted over the period of a

match.

Finally, we aggregated the results for the predic-

tion across data sets, as explained in Section 5.2.

Since the precisions for the different event types are

nearly the same for the three algorithms (cf. Table

3), we used as a conﬁguration for the aggregation a

Table 3: Algorithms quality for prediction across data sets.

Algorithm Event Type Precision Recall F-Measure

KNN

pass 0.356 0.055 0.095

reception 0.000 0.000 0.000

ball touch 0.422 0.076 0.128

SVM

pass 0.398 0.666 0.498

reception 0.031 0.327 0.057

ball touch 0.272 0.651 0.383

RForest

pass 0.311 0.727 0.435

reception 0.057 0.124 0.078

ball touch 0.373 0.544 0.443

weight of 3

−1

for every algorithm and a minimum

score of 2 · 3

−1

. Therefore, two out of three algo-

rithms have to ﬁnd an event at a given timestamp, in

order to conﬁrm this event. Table 4 shows that we

increased the f -measure for passes and ball touches

compared to all three algorithms between 1.6% and

41.9% (17.1% average). We also increased the f -

Recognizing Compound Events in Spatio-Temporal Football Data

measure for receptions compared to the KNN (6.7%)

and SVM (1%) algorithm, but decreased it about 1%

for RForest.

Table 4: Aggregation quality for prediction across data sets.

Event Type Precision Recall F-Measure

pass 0.426 0.647 0.514

reception 0.047 0.119 0.067

ball touch 0.372 0.713 0.489

To summarize, our results show that it is possi-

ble to detect football events from positional data with

our approach based on supervised learning. Also the

choice of a speciﬁc algorithm can have an extensive

impact on the quality of the predicted results. The

SVM and RForest algorithms showed reasonable re-

sults, whereas the KNN algorithm failed to convince

us for this use case. With the aggregation of the differ-

ent algorithms the results could be further improved.

Therefore, a valuable conﬁguration of the weights and

the minimum score is needed. Nevertheless, there is

still potential for optimizations.

7 FUTURE WORK

In this section, we face issues discovered and suggest

proceedings to further improve and extend our work.

The missing z coordinate (height information) is nec-

essary to calculate features such as velocity, acceler-

ation and direction change correctly, since the ball is

moving in a three-dimensional space. When the ball

is out of bounds or the game has stopped the posi-

tional data should be ignored, because the data is in-

appropriate to draw conclusions about that. We also

discovered problems with the distinction of receptions

that occur directly before a pass, which happens when

a player receives the ball and immediately shoots it to

another player. A higher data resolution might help to

mitigate this issue.

To extend our approach it would be possible to

train the features for every player separately, since ev-

ery player has a different skill and shows its own be-

havior. It is also possible that a player changes its be-

havior between matches or even during a match, when

a player gets exhausted, or the teams adapt their tac-

tics to the game. This would make it even more chal-

lenging to classify the different event types. To val-

idate an event candidate it would be an extension to

integrate what is happening before and after the event

time and what the players around the event position

are doing. A good candidate to weight the features oc-

curing in a speciﬁc time window is the inter-quartile

range coverage algorithm (Schwarz et al., 2012a),

which is useful for detection of patterns with known

features in time-series data (Schwarz et al., 2012b).

We focused mainly on the moment of an event iden-

tiﬁed by the features of the ball, but probably more

information can be gathered this way.

8 CONCLUSION

In this paper, we proposed to detect events from po-

sitional data of football matches, in order to replace

the error-prone and time-consuming task of captur-

ing these events manually. We presented a super-

vised machine learning approach that classiﬁes com-

pound football on the base of different features, which

are computed from positional data. To calculate

those features efﬁciently for over one million tracking

events per match, we used matrix operations, particu-

larly convolution, and optional GPU features to paral-

lelize and speedup the calculation. We used the Sup-

port Vector Machine, K-Nearest Neighbors and Ran-

dom Forest classiﬁcation algorithms to recognize the

event classes of passes, receptions, shots on target and

clearances in our self-captured gold standard. Addi-

tionally, we enhanced the approach with a customiz-

able aggregation algorithm, to be able to weight the

outcome of the algorithms for different event types.

We evaluated the three algorithms by their qual-

ity of precision and recall. Compared to KNN algo-

rithm, the SVM and RForest algorithms showed rea-

sonable results with a precision of 31.1% up to 39.8%

and a recall of 66.6% up to 72.7% for passes. Recep-

tions seemed to be difﬁcult to distinguish from passes

for the classiﬁcation algorithms, which is why we re-

ceived lower results for them. We further improved

the results with the aggregation of the different algo-

rithms and increased the f -measures with an average

of 12.1% for all event types compared to the three

algorithms. To improve our approach for a practi-

cal use, we showed different ideas for future work.

In conclusion, our results showed that it is possible

to detect football events from positional data, but the

choice of a speciﬁc algorithm can have an extensive

impact on the quality of the predicted results.

REFERENCES

Altman, N. S. (1992). An introduction to kernel and nearest-

neighbor nonparametric regression. The American

Statistician, 46(3):175–185.

Anguita, D., Ghio, A., Oneto, L., Parra, X., and Reyes-

Ortiz, J. L. (2012). Human activity recognition

IoTBD 2016 - International Conference on Internet of Things and Big Data

on smartphones using a multiclass hardware-friendly

support vector machine. In Ambient assisted living

and home care, pages 216–223. Springer.

Barnard, M., Odobez, J.-M., and Bengio, S. (2003). Multi-

modal audio-visual event recognition for football

analysis. In Neural Networks for Signal Processing,

2003. NNSP’03. 2003 IEEE 13th Workshop on, pages

469–478. IEEE.

Beetz, M., von Hoyningen-Huene, N., Kirchlechner, B.,

Gedikli, S., Siles, F., Durus, M., and Lames, M.

(2009). Aspogamo: Automated sports game analysis

models. International Journal of Computer Science in

Sport, 8(1):1–21.

Bergstra, J., Breuleux, O., Bastien, F., Lamblin, P., Pascanu,

R., Desjardins, G., Turian, J., Warde-Farley, D., and

Bengio, Y. (2010). Theano: A CPU and GPU Math

Compiler in Python. In Proceedings of the Python for

Scientiﬁc Computing Conference (SciPy), 9.

Breiman, L. (2001). Random forests. Machine Learning,

45(1):5–32.

Carling, C., Williams, A. M., and Reilly, T. (2005). Hand-

book of soccer match analysis: A systematic approach

to improving performance. Psychology Press.

Christopher Mutschler, Holger Ziekow, and Zbigniew

Jerzak (2013). The DEBS 2013 Grand Challenge.

In ACM, editor, Proceedings of the 7th ACM Inter-

national Conference on Distributed Event-Based Sys-

tems, pages 289–294.

Connaghan, D., Kelly, P., O’Connor, N. E., Gaffney, M.,

Walsh, M., and O’Mathuna, C. (2011). Multi-sensor

classiﬁcation of tennis strokes. In Sensors, 2011

IEEE, pages 1437–1440. IEEE.

Cortes, C. and Vapnik, V. (1995). Support-vector networks.

Machine Learning, 20(3):273–297.

Gal, A., Keren, S., Sondak, M., Weidlich, M., Blom, H., and

Bockermann, C. (2013). Grand challenge: The tech-

niball system. In Proceedings of the 7th ACM interna-

tional conference on Distributed event-based systems,

pages 319–324. ACM.

Horton, M., Gudmundsson, J., Chawla, S., and Es-

tephan, J. (2014). Classiﬁcation of passes in foot-

ball matches using spatiotemporal data. arXiv preprint

arXiv:1407.5093.

Jiang, W. and Yin, Z. (2015). Human activity recognition

using wearable sensors by deep convolutional neural

networks. In Proceedings of the 23rd Annual ACM

Conference on Multimedia Conference, pages 1307–

1310. ACM.

Kautz, T., Groh, B. H., and Eskoﬁer, B. M. (2015). Sensor

fusion for multi-player activity recognition in game

sports.

Mackenzie, R. and Cushion, C. (2013). Performance

analysis in football: A critical review and implica-

tions for future research. Journal of Sports Sciences,

31(6):639–676.

Madsen, K. G. S., Su, L., and Zhou, Y. (2013). Grand chal-

lenge: Mapreduce-style processing of fast sensor data.

In Proceedings of the 7th ACM international confer-

ence on Distributed event-based systems, pages 313–

318. ACM.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,

Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P.,

Weiss, R., Dubourg, V., Vanderplas, J., Passos, A.,

Cournapeau, D., Brucher, M., Perrot, M., and Duch-

esnay, E. (2011). Scikit-learn: Machine Learning

in Python. Journal of Machine Learning Research,

12:2825–2830.

Peterek, T., Penhaker, M., Gajdo

s, P., and Dohn

alek, P.

(2014). Comparison of classiﬁcation algorithms for

physical activity recognition. In Innovations in Bio-

inspired Computing and Applications, pages 123–131.

Springer.

Schuldhaus, D., Zwick, C., K

orger, H., Dorschky, E., Kirk,

R., and Eskoﬁer, B. M. (2015). Inertial sensor-based

approach for shot/pass classiﬁcation during a soccer

match.

Schwarz, C., Leupold, F., and Schubotz, T. (2012a). Short-

Term Energy Pattern Detection of Manufacturing Ma-

chines with In-Memory Databases - A Case Study.

In ENERGY 2012: The Second International Confer-

ence on Smart Grids, Green Communications and IT

Energy-aware Technologies, pages 7–12.

Schwarz, C., Leupold, F., Schubotz, T., Januschowski, T.,

and Plattner, H. (2012b). Rapid Energy Consumption

Pattern Detection with In-Memory Technology. Inter-

national Journal on Advances in Intelligent Systems,

5:415–426.

von der Gr

un, T., Franke, N., Wolf, D., Witt, N., and Eid-

loth, A. (2011). A real-time tracking system for foot-

ball match and training analysis. In Microelectronic

systems, pages 199–212. Springer.

Recognizing Compound Events in Spatio-Temporal Football Data