tournament were used to improve the proposed
models. Another application is the use of a maximum
score estimator to predict final scores (Caudill, 2003).
This improves on a probit model that relates a
team’s seed to its probability of winning (Boulier
and Stekler, 1999). Beyond basketball, the idea that
a team’s ability or strength is dynamic and can
fluctuate over the course of a season or a tournament
has been implemented through an extension of the
Bradley–Terry model for paired comparison data,
which models the outcomes of sporting events while
allowing for time-varying abilities through weighted
moving averages (Catellan et al., 2013). This was
applied to the 2009-2010 NBA regular season
(basketball) along with the 2008-2009 Italian Serie A
season (football). Player-tracking data recorded at
every moment of a team’s possession of the ball have
also been used to produce a quantity called expected
possession value (EPV) (Cervone et al., 2014).
EPV is the number of points the attacking team is
expected to score by the end of the possession. This
quantity was first introduced in football, where it
was considered a revolutionary metric because it
tells a team what would happen on average if the
team were to play an infinite number of matches.
It is now gradually being adopted in other sports,
including basketball.
One early use of Bayesian modelling in sports is a
Bayesian treatment of the bivariate Poisson
distribution (Tsionas, 2001), which was originally
applied in a frequentist setting to football matches
(Karlis and Ntzoufras, 2000; Karlis and Ntzoufras,
2003). The seminal paper on Bayesian hierarchical
modelling in sports, in which the number of goals
scored by each team is assumed to follow a Poisson
distribution, applies the model to the 1991/1992
Italian Serie A championship (Baio and Blangiardo,
2010). Bayesian hierarchical models have also been
used to predict the outcome of tennis matches
(Ingram, 2019) and women’s volleyball matches
(Gabrio, 2020). In the former, a Bayesian hierarchical
model based on the binomial distribution is used to
model the serve-match, and in the latter, a Bernoulli-
based Bayesian hierarchical model is used to model
the probability of playing five sets, and the
probability of winning a match. To our knowledge,
the Bayesian hierarchical modelling approach has not
been applied to the basketball context. The Bayesian
hierarchical Poisson model (Baio and Blangiardo,
2010) shall serve as the basis for modelling scoring
intensity, and this shall be extended to a negative
binomial model. Furthermore, the Bernoulli-based
Bayesian hierarchical modelling approach applied to
volleyball (Gabrio, 2020) shall serve as the
backbone for modelling the winning probability.
3 BAYESIAN HIERARCHICAL MODELLING OF SCORING INTENSITY
A noteworthy difference between scoring in football
and basketball is that in football there is only one
way to score, and each goal increases the tally by
exactly one, whereas in basketball there are three
ways for a team to add to its point tally: the free
throw (1 point), the two-point shot, and the
three-point shot. Because of this difference, each
scoring method is modelled separately, and the
resulting totals are then summed, weighted by their
respective point values, to obtain the predicted
final score. We start by defining the Bayesian
hierarchical Poisson model applied to basketball, and
then extend it to the negative binomial case.
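The weighting scheme described above can be illustrated with a minimal sketch. It is not the model fitted in this study: the intensity values are hypothetical, the three counts are assumed to be independent Poisson draws, and the function names (`sample_poisson`, `expected_score`, `simulate_score`) are introduced here purely for illustration. The sketch only shows how made-shot counts for the three scoring methods combine, with weights 1, 2 and 3, into a final score.

```python
import math
import random

# Points awarded per made shot for each scoring method.
POINTS = {"FT": 1, "TwoPT": 2, "ThreePT": 3}


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def expected_score(intensities):
    """Expected final score: point value times intensity, summed over methods."""
    return sum(POINTS[k] * intensities[k] for k in POINTS)


def simulate_score(intensities, rng):
    """Simulate one final score from independent Poisson counts (an assumption)."""
    return sum(POINTS[k] * sample_poisson(intensities[k], rng) for k in POINTS)


# Hypothetical scoring intensities for one team in one match (not from the data).
theta = {"FT": 15.0, "TwoPT": 20.0, "ThreePT": 8.0}

rng = random.Random(0)
draws = [simulate_score(theta, rng) for _ in range(5000)]
simulated_mean = sum(draws) / len(draws)

print("analytic expected score:", expected_score(theta))
print("simulated mean score:", simulated_mean)
```

Under these assumptions the simulated mean should lie close to the analytic value, since the expectation of a weighted sum of counts is the weighted sum of their intensities.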
3.1 The Poisson Model
In this study, three separate Poisson models are
considered (free throws made, two-point shots made
and three-point shots made):
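\[
\begin{aligned}
\mathit{FT}_{gj} \mid \theta^{\mathit{FT}}_{gj} &\sim \mathrm{Poisson}\bigl(\theta^{\mathit{FT}}_{gj}\bigr) \\
\mathit{TwoPT}_{gj} \mid \theta^{\mathit{TwoPT}}_{gj} &\sim \mathrm{Poisson}\bigl(\theta^{\mathit{TwoPT}}_{gj}\bigr) \\
\mathit{ThreePT}_{gj} \mid \theta^{\mathit{ThreePT}}_{gj} &\sim \mathrm{Poisson}\bigl(\theta^{\mathit{ThreePT}}_{gj}\bigr)
\end{aligned}
\tag{1}
\]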
where $g$ represents the match index (in order of the
date and time the matches were played) and $j$
represents whether the team played at home or away
(1 – home effect, 2 – away effect).
$\mathit{FT}_{gj}$, $\mathit{TwoPT}_{gj}$ and
$\mathit{ThreePT}_{gj}$ represent the observed counts
of free throws, two-point shots and three-point shots
made by team $j$ in the $g$-th match, respectively.
$\theta^{\mathit{FT}}_{gj}$, $\theta^{\mathit{TwoPT}}_{gj}$
and $\theta^{\mathit{ThreePT}}_{gj}$ represent the
scoring intensity with respect to free throws,
two-point shots and three-point shots by team $j$ in
the $g$-th match, respectively. The
scoring intensity of the home and away team shall be
estimated by considering the attack and defense
ability for each team along with the home effect. The
models must also include an intercept common for
both scoring intensities due to the fact that basketball