
capitalization is £22.5 billion. For further discussion
of BetFair, see e.g. (Davies et al., 2005; Houghton,
2006; Cameron, 2009).
Creating an online exchange for matching backers with layers
was not the only innovation that BetFair introduced. The company
also led the development of in-play betting, which allowed bettors
to continue to place back and lay bets after a sports event had
started, and to keep betting as the event progressed, until some
pre-specified cut-off time or situation occurred, or the event
finished. This is in contrast to conventional human-operated
bookmakers, who ceased to take any further bets once the event of
interest was underway: because BetFair’s betting exchange system
was entirely automated, it could process large numbers of bets
while an event was underway, operating in real time with flows of
information that would overwhelm a human bookie.
Just as most stock exchanges publish real-time summary data of
all the bids and asks currently seeking a counterparty, often
showing the quantity available to be bought or sold at each
potential price for a particular stock, so a betting exchange
publishes real-time summary data for any one event E showing all
the currently unmatched backs and lays, the odds (or “price”) for
each of them, and the amount of money available to be wagered at
each price. In the terminology of betting exchanges, this
collection of data is the “market” for event E.
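As an illustration of the kind of data such a market holds, the following is a minimal sketch in Python. It is not the data structure of BetFair or of any exchange simulator; the class names `CompetitorBook` and `Market`, the use of decimal odds as dictionary keys, and the competitor identifiers are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CompetitorBook:
    """Unmatched bets on one competitor: odds -> total stake available (hypothetical layout)."""
    backs: Dict[float, float] = field(default_factory=dict)
    lays: Dict[float, float] = field(default_factory=dict)

    def add(self, side: str, odds: float, stake: float) -> None:
        if side not in ("back", "lay"):
            raise ValueError("side must be 'back' or 'lay'")
        book = self.backs if side == "back" else self.lays
        # Stakes offered at the same odds are pooled into one price level.
        book[odds] = book.get(odds, 0.0) + stake


@dataclass
class Market:
    """The market for one event E: one book of unmatched bets per competitor."""
    books: Dict[str, CompetitorBook] = field(default_factory=dict)

    def add(self, competitor: str, side: str, odds: float, stake: float) -> None:
        self.books.setdefault(competitor, CompetitorBook()).add(side, odds, stake)

    def summary(self, competitor: str) -> Dict[str, List[Tuple[float, float]]]:
        """Return (odds, stake) pairs sorted by odds, for the backs and the lays."""
        book = self.books[competitor]
        return {
            "backs": sorted(book.backs.items()),
            "lays": sorted(book.lays.items()),
        }


# Example usage with made-up bets on a hypothetical competitor "horse_1".
market = Market()
market.add("horse_1", "back", 3.0, 10.0)
market.add("horse_1", "back", 3.0, 5.0)
market.add("horse_1", "lay", 3.2, 20.0)
print(market.summary("horse_1"))
```

The summary returned for each competitor corresponds to the published view described above: every price that currently has unmatched money, and how much money is available at that price.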
During in-play betting, the prices in the market can shift
rapidly. Some types of event, such as tennis matches, might last
for hours, allowing hours of in-play betting on a single match,
whereas other types of event, such as horse races, may last only
a few minutes. The exploratory work that we describe in this paper
is motivated by the hypothesis that it may be possible to use
machine learning (ML) methods to process the rapidly changing data
on a betting-exchange market for short-duration events such as
horse races, and for the ML system thereby to produce novel
profitable betting strategies.
For the rest of this paper, without loss of generality, we limit
our discussion to betting on horse races, for three reasons: horse
racing is a very widely known sport on which much money is
wagered; most horse races last only a few minutes; and it is
reasonably easy to create an appropriately realistic agent-based
model (ABM) of a horse race, i.e. a simulation in which each agent
represents a horse/rider combination and, during the race, has a
particular position on the track, is travelling at a specific
velocity, and may or may not be blocked or otherwise influenced by
other horses in the race. Exactly such a simulation of a horse
race was introduced in (Cliff, 2021), as one component of the
Bristol Betting Exchange (BBE), an agent-based simulator not only
of horse races but also of a betting exchange and of a population
of bettor-agents, each of whom forms a private opinion of the
outcome of a race and then places back or lay bets accordingly.
Various implementations of BBE have been reported previously in
(Cliff et al., 2021) and at ICAART 2022 in (Guzelyte and Cliff,
2022), but to the best of our knowledge ours is the first study to
explore the use of XGBoost (Chen and Guestrin, 2016) on in-play
betting data to develop profitable betting strategies.
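To make the idea of a race ABM concrete, here is a deliberately simplified sketch in Python. It is not the BBE race model: the single-lane track, the `block_gap` parameter, the rule that a blocked runner is held to the speed of the runner just ahead, and the names `Runner` and `step` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Runner:
    """One horse/rider combination (hypothetical minimal state)."""
    name: str
    position: float = 0.0  # distance covered along the track (assumed metres)
    velocity: float = 0.0  # current speed (assumed metres per second)


def step(runners: List[Runner], dt: float, block_gap: float = 1.0) -> None:
    """Advance every runner by dt seconds.

    Assumed blocking rule: a runner within block_gap behind another runner
    cannot travel faster than that runner during this step.
    """
    speeds = []
    for r in runners:
        # Runners strictly ahead of r but close enough to block it.
        ahead = [o for o in runners
                 if o is not r and 0.0 < o.position - r.position <= block_gap]
        speeds.append(min([r.velocity] + [o.velocity for o in ahead]))
    # Apply all moves together so that every runner reacts to the same state.
    for r, speed in zip(runners, speeds):
        r.position += speed * dt


# Example usage with made-up numbers: A is blocked by the slower B just ahead.
field_of_runners = [Runner("A", 10.0, 15.0), Runner("B", 10.5, 12.0), Runner("C", 0.0, 14.0)]
step(field_of_runners, dt=1.0)
print([(r.name, r.position) for r in field_of_runners])
```

Computing all speeds before moving any runner keeps the update order from affecting the result, which matters once blocking makes the agents depend on one another.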
The bettor-agents in the BBE ABM each form their own private
opinion on the outcome of a race on the basis of their own
internal logic, i.e. their own individual betting strategy. The
original specification of BBE in (Cliff, 2021) included a number
of minimally simple strategies, described in more detail in
Section 2.3 below, and the BBE ABM usually operates with a bettor
population having a heterogeneous mix of such betting strategies.
As the dynamics of a simulated race unfold, the bettor-agents
react to changes in the competitors’ pace and relative positions
by making and/or cancelling in-play bets, altering the market for
that race. The BBE ABM records every change in the market over the
duration of a race, along with the rank-order positions of the
competitors at the time of each market change (i.e., which
competitor is in first place, which is in second, and so on): we
refer to this as a race record.
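One possible shape for such a race record is sketched below in Python. The field names, the reduction of the market to a single odds value per competitor, and the derivation of the rank order from distance covered are assumptions for illustration, not the format that BBE writes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Snapshot:
    """State captured at one market change (hypothetical fields)."""
    time: float               # time since the race started
    market: Dict[str, float]  # simplified market: one odds value per competitor
    ranking: List[str]        # competitor ids, first place first


@dataclass
class RaceRecord:
    """All market changes in one race, each paired with the rank order at that time."""
    race_id: int
    snapshots: List[Snapshot] = field(default_factory=list)

    def log_market_change(self, time: float, market: Dict[str, float],
                          distances: Dict[str, float]) -> None:
        # Rank competitors by distance covered, furthest first.
        ranking = sorted(distances, key=distances.get, reverse=True)
        # Copy the market so that later changes do not alter this snapshot.
        self.snapshots.append(Snapshot(time, dict(market), ranking))


# Example usage with made-up values.
record = RaceRecord(race_id=0)
record.log_market_change(1.5, {"A": 2.0, "B": 3.0}, {"A": 10.0, "B": 12.5})
print(record.snapshots[0].ranking)
```

Storing a copy of the market at each change means the record can be replayed after the race without reference to the live simulation.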
In the work described here, we typically run 1000 race
simulations, gathering a race record from each. The set of race
records then goes through an automated analysis process to
identify the actions of the most profitable bettors in each race.
That is, for each race, we look to see which bettors made the most
profit from in-play betting on that race, and we then work
backward in time to see what actions those bettors took during the
race, and what the state of the market and the state of the race
were at the time of each such action. This then forms the training
and/or test data for XGBoost: for any one item of such data, the
input to XGBoost is the state of the market and the state of the
race, and the desired output is the action that the bettor took.
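The selection step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis pipeline used in this work: the tuple layout of a logged action, the numeric feature vector standing for the market and race state, the string action labels, and the `top_k` parameter are all hypothetical.

```python
from typing import Dict, List, Tuple

# One logged action: (bettor id, features for market + race state, action label).
# The feature encoding and the label vocabulary are assumptions of this sketch.
Action = Tuple[str, List[float], str]


def build_training_rows(actions: List[Action], profits: Dict[str, float],
                        top_k: int = 1) -> Tuple[List[List[float]], List[str]]:
    """Keep only the actions taken by the top_k most profitable bettors in one race."""
    ranked = sorted(profits, key=profits.get, reverse=True)
    best = set(ranked[:top_k])
    features = [x for bettor, x, _ in actions if bettor in best]
    labels = [a for bettor, _, a in actions if bettor in best]
    return features, labels


# Example usage with made-up data for a single race.
race_actions = [
    ("b1", [2.0, 1.0], "back"),
    ("b2", [2.5, 2.0], "lay"),
    ("b1", [1.8, 1.0], "lay"),
]
race_profits = {"b1": 12.0, "b2": -3.0}
X, y = build_training_rows(race_actions, race_profits)
print(X, y)
```

Rows gathered in this way from many races could then be concatenated and, once the action labels are encoded as integers, passed to a gradient-boosted classifier such as XGBoost, with the features as input and the action as the target.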
To accomplish this, we modified the source code of the most
recent version of BBE to incorporate the XGBoost betting agent.
That version is the multi-threaded BBE integrated with the Opinion
Dynamics Platform used in Guzelyte’s research (Guzelyte, 2021b;
Guzelyte and Cliff, 2022), and is hosted on Guzelyte’s GitHub
(Guzelyte, 2021a). After the integration of XGBoost, we conduct
experiments to train and
ICAART 2024 - 16th International Conference on Agents and Artificial Intelligence