The Effect of Team Record on Fan Loyalty in the National Football League

Rahul Mani and Vinod Dubey

McLean High School, McLean, Virginia, U.S.A.

Department of Computer Science, George Mason University, Fairfax, Virginia, U.S.A.

Keywords: Fan Loyalty, Sports Data Mining, Predictive Modelling, National Football League.

Abstract: The paper explores the relationship between a team’s performance in the National Football League (NFL), in terms of win and loss records, and fan loyalty. It examines to what extent winning matters for sustaining fan loyalty, and therefore what incentive owners and players have to improve a team’s performance. This research uses data mining and predictive modelling techniques, implemented in Java, to answer this question. Linear and quadratic regression analyses are undertaken to see if the results differ for teams with a winning versus a losing record. The contribution of this paper is to establish that fan attendance at home games can be significantly improved by the winning record of the team.

1 INTRODUCTION

Football is an American passion. Every season, millions of fans rush to the stadiums to see their favourite teams play, and hundreds of millions watch the games on television. The recent explosion of fantasy football as a popular new pastime of millions also attests to the increasing popularity of the NFL amongst its keen followers. Scores of other fans now use social media such as Twitter and Facebook to express support for their teams.

While each NFL team and its performance play a huge part in its fans’ lives, we observe that at the end of every NFL season the usual elite teams, such as the New England Patriots, Green Bay Packers, Baltimore Ravens, Indianapolis Colts and Pittsburgh Steelers, are the most successful, ending up in the playoffs and even making a run at the Super Bowl (Table 1). Most other teams, on the other hand, seem to have a habit of experiencing moderate to poor seasons year in and year out. One is therefore left to wonder what keeps fans loyal to a team despite its persistently poor performance.

Table 1: Super Bowl Winners (2000-2013).

Team                    Year
Baltimore Ravens        2000
New England Patriots    2001
Tampa Bay Buccaneers    2002
New England Patriots    2003
New England Patriots    2004
Pittsburgh Steelers     2005
Indianapolis Colts      2006
New York Giants         2007
Pittsburgh Steelers     2008
New Orleans Saints      2009
Green Bay Packers       2010
New York Giants         2011
Baltimore Ravens        2012
Seattle Seahawks        2013

Maintaining a loyal fan base should be of utmost importance to every owner and manager, as it is most likely a key factor driving the team’s earnings. This would show up in, for example, ticket sales (attendance at home games), TV viewership (showing excitement surrounding the games) and purchases of team merchandise (jerseys, caps, etc.). Fan loyalty, one would think, would be driven to a large extent by how successful a team is. One could therefore assume that the winningest teams in the league would command stronger fan loyalty than lower-ranked teams, which would most likely see their fan loyalty, and subsequently their attendance, dwindle. From a revenue point of view, it is therefore in the interest of a team’s owner, manager and coach to improve its winning record in order to improve fan loyalty, or at least to maintain a high level of it.

Mani, R. and Dubey, V.
The Effect of Team Record on Fan Loyalty in the National Football League.
In Proceedings of the 3rd International Congress on Sport Sciences Research and Technology Support (icSPORTS 2015), pages 257-262
ISBN: 978-989-758-159-5

The objective of this paper is to test whether the fan loyalty of an NFL team is largely determined by its success on the field. Fan loyalty is defined here as the percentage of stadium capacity filled by home attendance for each team from 2005-2013. This definition is the most effective for the purpose of this paper, as attendance data is the most comprehensive and the easiest to interpret. Percentage of stadium capacity is used because different teams have different-sized markets and, in turn, different-sized stadiums, so using raw attendance numbers would give a significant advantage to teams with larger stadiums.
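The normalisation described above can be illustrated with a minimal Java sketch. The class name, method name and stadium figures are hypothetical and chosen only for illustration; they are not taken from the paper's data set.

```java
/** Hypothetical helper: expresses home attendance as a percentage of stadium capacity. */
final class CapacityShare {
    private CapacityShare() {}

    /** Returns average home attendance as a percentage of seated capacity. */
    static double percentOfCapacity(double averageAttendance, double seatedCapacity) {
        if (seatedCapacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        return 100.0 * averageAttendance / seatedCapacity;
    }

    public static void main(String[] args) {
        // Made-up figures: the larger stadium draws more people but fills fewer of its seats.
        System.out.println("Large stadium: " + percentOfCapacity(72_000, 90_000));
        System.out.println("Small stadium: " + percentOfCapacity(63_000, 65_000));
    }
}
```

On raw attendance the larger stadium in this made-up case would look more loyal, while the percentage measure ranks the smaller one higher, which is the distortion the definition is meant to avoid.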

The computer science techniques of data mining and predictive modeling are used for the analysis. The data mining process involves discovering interesting and useful patterns and relationships in large volumes of data (Marchi, 2011). By applying predictive modeling techniques to sports, this research contributes to a better understanding of the underlying factors that govern human behavior associated with sports followings. The programming language Java is used to create and run the regression models for the data.

This genre of scientific research falls under the

relatively new area of “Sports Data Mining”. This

area has experienced rapid growth in recent years

(Baker and McHale, 2013; Hamadani, 2006; Stekler,

2007). Sports organizations are keen to find more

practical methods to extract valuable knowledge

using data mining techniques (Lewis, 2003; Silver,

2012). By finding the right ways to make sense of

data and turning it into actionable knowledge, sports

organizations have the potential to secure a

competitive advantage over their peers. Professional

sports organizations are multi-million dollar

enterprises with millions of dollars spent on a single

decision. With this amount of capital at stake, just one

bad or misguided decision has the potential of setting

an organization back by several years. With such a

huge risk at stake and a critical need to make good

decisions, the sports industry is an attractive

environment for applications of data mining (Boulier

and Stekler, 2003; Sinha and others, 2013;

Schumaker and others, 2010).

2 HYPOTHESIS

IF a team has a greater winning record, THEN the

team will have stronger or greater fan loyalty (in this

case, percentage of stadium capacity filled during

home games) BECAUSE the team will be more

enjoyable to watch and will attract greater attention in

its local community, leading to new fans joining the

fan-base, and former fans coming back as well. More

people will tune in to the team on television, more

merchandise will be sold, and teams will have an

increased following on social media, in addition to

more people coming to the games.

3 DATA AND METHODOLOGY

Data for this research comes mainly from websites such as NFL.com and ESPN.com. Java is used for the analysis. Eclipse (an integrated development environment for Java) is used to write the program that outputs the results of the experiment. Alongside Eclipse, several Java libraries found online were used to help with the coding. These included Apache POI (used to read data from the Excel file), Apache Commons Mathematics (used to help create the linear/simple regression), and Princeton’s Algorithms and Clients (used to help create the quadratic regression).

Here are the specific steps:

1. The Java Development Kit was downloaded from Oracle’s website (http://www.oracle.com/) and installed.

2. The Eclipse IDE (integrated development environment) was downloaded and installed; it was used for the actual coding of the regression models.

3. The data was read from the Excel file using the Apache POI library and stored in a two-dimensional matrix.

4. The two-dimensional matrix was passed to a Simple Regression call, using the Simple Regression class in the Apache Commons Mathematics library (commons.apache.org/math/).

5. The results of the linear (or simple) regression model were output.

6. The matrix was split into two separate arrays, one for the x-values (wins in a season) and one for the y-values (average home attendance as a percentage of stadium capacity).

7. These arrays were passed to a Polynomial Regression call, using the Polynomial Regression class found in Princeton’s Algorithms and Clients (algs4.cs.princeton.edu/code/).

8. The results of the quadratic regression were output as well.

The appendix details the Java Programming used.
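As an illustration of what the simple regression step computes, the following is a dependency-free Java sketch of ordinary least squares for a line. It is a stand-in written for this purpose, not the Apache Commons Mathematics class used in the study, and the class name, method name and sample values are hypothetical.

```java
/** Dependency-free sketch of ordinary least squares for y = intercept + slope * x. */
final class SimpleOlsSketch {
    private SimpleOlsSketch() {}

    /** Returns {intercept, slope, rSquared} for the paired observations. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need at least two paired observations");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Centered sums of squares and cross-products.
        double sxx = 0.0;
        double sxy = 0.0;
        double syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        if (sxx == 0.0) {
            throw new ArithmeticException("x-values must not all be equal");
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        // R-squared of a simple regression equals the squared correlation.
        double rSquared = (syy == 0.0) ? 1.0 : (sxy * sxy) / (sxx * syy);
        return new double[] {intercept, slope, rSquared};
    }

    public static void main(String[] args) {
        // Made-up data: wins in a season and attendance as a % of capacity.
        double[] wins = {2, 4, 6, 8};
        double[] attendancePct = {91, 93, 95, 97};
        double[] result = fit(wins, attendancePct);
        System.out.println("intercept=" + result[0] + " slope=" + result[1] + " r2=" + result[2]);
    }
}
```

A library class would normally be preferred in practice; the hand-written version is shown only because it makes the slope, intercept and R-square calculations explicit.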

Two main things are measured: team success and fan loyalty. The correlation between success and fan loyalty is examined using a predictive modelling approach, controlling for various factors. The regression equation estimated to reflect this is:

$$Y_t = a + bX_{t-1} + e \qquad (1)$$

Y (Dependent Variable) is the attendance for a team

as a % of stadium capacity in Season ‘t’;

X (Independent Variable) is the number of wins for a

team in Season ‘t-1’;

b is the change in average attendance due to change

in previous season’s winning record;

a is the constant; and

e is the error in predicting the attendance, given the

team record.

For the winning record, the preceding season (t-1) is the one that matters, since the ongoing season would not be expected to have a significant impact on ticket sales: for the most part tickets are sold before the season begins, and most in-season ticket transactions are resales from people who already hold tickets (including season-ticket holders) to non-ticket holders and the general public.
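The one-season lag can be made concrete with a short Java sketch that pairs each season's attendance with the previous season's wins for a single team. The class name, method name and array layout (one entry per consecutive season) are assumptions made for illustration, and the sample values are made up.

```java
/** Hypothetical sketch: pairs wins in season t-1 with attendance in season t for one team. */
final class LaggedPairsSketch {
    private LaggedPairsSketch() {}

    /**
     * Each returned row is {wins in season t-1, attendance % in season t}.
     * Both input arrays must hold one entry per consecutive season, in order.
     */
    static double[][] pair(int[] winsBySeason, double[] attendancePctBySeason) {
        if (winsBySeason.length != attendancePctBySeason.length) {
            throw new IllegalArgumentException("both arrays need one entry per season");
        }
        int rows = Math.max(0, winsBySeason.length - 1);
        double[][] pairs = new double[rows][2];
        for (int t = 1; t < winsBySeason.length; t++) {
            pairs[t - 1][0] = winsBySeason[t - 1];      // X: previous season's wins
            pairs[t - 1][1] = attendancePctBySeason[t]; // Y: current season's attendance
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Made-up three-season history for one team.
        double[][] pairs = pair(new int[] {4, 9, 11}, new double[] {85.0, 86.5, 92.0});
        for (double[] row : pairs) {
            System.out.println("wins(t-1)=" + row[0] + " attendance(t)=" + row[1]);
        }
    }
}
```

The first season contributes only its wins and the last only its attendance, so a history of n seasons yields n-1 observations per team.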

The coefficient ‘b’ is of interest for the purpose of this paper, as it indicates the change in ticket sales as a percentage of stadium capacity that results from a change in the number of wins in the previous season. While equation (1) implies a proportionate change in attendance with respect to wins, it is quite plausible that attendance is more responsive to wins for teams with losing records than for teams with winning records. In such a case a quadratic formulation is needed to test for a possible non-linear relationship between the number of wins in the previous season and team attendance in a particular season. The regression equation estimated for the quadratic formulation is:

$$Y_t = a + bX_{t-1} + cX_{t-1}^2 + e \qquad (2)$$

where c is the coefficient on the squared number of wins in the previous season and the remaining symbols are as defined for equation (1).
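To show how a quadratic formulation of this kind can be estimated, the sketch below fits y = a + b*x + c*x^2 by least squares, solving the 3x3 normal equations with Gauss-Jordan elimination. This is a swapped-in technique for illustration, not the Princeton Polynomial Regression class used in the study; the class name, method name and test values are hypothetical.

```java
/** Dependency-free sketch: least-squares fit of y = a + b*x + c*x^2 via the normal equations. */
final class QuadraticFitSketch {
    private QuadraticFitSketch() {}

    /** Returns {a, b, c}. Requires at least three distinct x-values. */
    static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 3) {
            throw new IllegalArgumentException("need at least three paired observations");
        }
        // Power sums: s[k] = sum of x^k (k = 0..4), t[k] = sum of x^k * y (k = 0..2).
        double[] s = new double[5];
        double[] t = new double[3];
        for (int i = 0; i < x.length; i++) {
            double power = 1.0;
            for (int k = 0; k < 5; k++) {
                s[k] += power;
                if (k < 3) {
                    t[k] += power * y[i];
                }
                power *= x[i];
            }
        }
        // Augmented matrix of the normal equations.
        double[][] m = new double[3][4];
        for (int r = 0; r < 3; r++) {
            for (int j = 0; j < 3; j++) {
                m[r][j] = s[r + j];
            }
            m[r][3] = t[r];
        }
        // Gauss-Jordan elimination with partial pivoting.
        for (int col = 0; col < 3; col++) {
            int pivot = col;
            for (int r = col + 1; r < 3; r++) {
                if (Math.abs(m[r][col]) > Math.abs(m[pivot][col])) {
                    pivot = r;
                }
            }
            if (Math.abs(m[pivot][col]) < 1e-12) {
                throw new ArithmeticException("x-values do not determine a quadratic");
            }
            double[] swap = m[col];
            m[col] = m[pivot];
            m[pivot] = swap;
            for (int r = 0; r < 3; r++) {
                if (r == col) {
                    continue;
                }
                double factor = m[r][col] / m[col][col];
                for (int j = col; j < 4; j++) {
                    m[r][j] -= factor * m[col][j];
                }
            }
        }
        return new double[] {m[0][3] / m[0][0], m[1][3] / m[1][1], m[2][3] / m[2][2]};
    }

    public static void main(String[] args) {
        // Made-up points lying exactly on y = 87 + 2x - 0.1x^2.
        double[] wins = {0, 4, 8, 12, 16};
        double[] attendancePct = {87.0, 93.4, 96.6, 96.6, 93.4};
        double[] coef = fit(wins, attendancePct);
        System.out.println("a=" + coef[0] + " b=" + coef[1] + " c=" + coef[2]);
    }
}
```

The normal equations are adequate for a small, low-degree problem like this one; a QR-based solver is generally the more numerically robust choice for larger or higher-degree fits.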

4 RESULTS AND ANALYSIS

The analysis covers all 32 NFL teams over the 8 years for which data is available. Home attendance varies from a minimum of 70.3 percent of capacity to a maximum of 116.5 percent. It is important to clarify how attendance can exceed 100 percent of capacity. Stadium "capacity" is measured in terms of seats. This number is fixed and seldom changes if the stadium was specially built for the team in question. In practice, seats are added, usually in less desirable places, or standing-room-only tickets are sold. In addition, luxury boxes, which have stated occupancy numbers, could also contribute to stadium capacity but are not closely monitored during events. It therefore becomes possible for stadiums to sell tickets in excess of the official capacity.

Figure 1 depicts the fitted values compared to the scattered real observations, and the linear regression results are reported in Table 2. The results suggest that a team’s winning record in the previous season significantly impacts attendance at home games. A t-statistic of 5.93, significant at the 95 percent confidence level, attests to this. Interpreting the coefficient of the independent variable, one can say that each additional win in the previous season leads to an increase of 0.72 percentage points, or almost 1 percentage point, in current stadium attendance as a percent of capacity.

Figure 1: Graphing of Fitted Values using Linear Formulation.

[Figure 1 axes: average attendance (percentage of stadium capacity) against number of wins in a season.]

Table 2: Linear Regression Results (Equation 1).

For example, if on average the stadium attendance is 85 percent of its capacity, it will increase to approximately 86 percent for each additional win in the previous season. If the total capacity is 100,000, that would mean that for each additional win 1,000 more people will come to the stadium. If a team improves its record from 4 wins to 9 wins, then approximately 5,000 additional people will show up for the games. This is important from a business viewpoint for an owner or manager. If the average price of a ticket is $125, then each win brings additional revenue of $125,000. If one were to include the parking fee and the money spent on food and beverages by each additional person attending the game, that would add another $75 per person on average, or $75,000 in total. If a team improves its record by 5 games, then it could be looking at additional revenue of about $1 million.
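The revenue arithmetic in this example can be reproduced with a few lines of Java. The inputs are the illustrative figures from the text (a 100,000-seat stadium, roughly one percentage point of capacity per additional win, and $125 + $75 = $200 spent per additional fan); the class and method names are hypothetical.

```java
/** Hypothetical sketch of the illustrative revenue arithmetic. */
final class RevenueSketch {
    private RevenueSketch() {}

    /**
     * @param capacity      stadium capacity in seats
     * @param pointsPerWin  attendance gain per extra win, in percentage points of capacity
     * @param extraWins     additional wins in the previous season
     * @param spendPerFan   dollars spent by each additional fan (ticket plus extras)
     */
    static double extraRevenue(double capacity, double pointsPerWin, int extraWins, double spendPerFan) {
        double extraFans = capacity * (pointsPerWin / 100.0) * extraWins;
        return extraFans * spendPerFan;
    }

    public static void main(String[] args) {
        // One extra win, ticket revenue only.
        System.out.println(extraRevenue(100_000, 1.0, 1, 125.0));
        // Five extra wins, ticket plus parking, food and beverages.
        System.out.println(extraRevenue(100_000, 1.0, 5, 125.0 + 75.0));
    }
}
```

Passing the unrounded coefficient from Table 2 as `pointsPerWin` instead of the rounded 1.0 gives a proportionally smaller figure.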

The results of the quadratic regression are given in Table 3. They suggest that attendance as a percentage of stadium capacity increases with the number of wins, but at a decreasing rate. As can be seen in Figure 2, the impact of the winning record on attendance is much larger for losing teams than for winning teams. For a team with 12 or more wins attendance does not change much, and eventually it even starts to fall a little.

Figure 2: Graphing Fitted Values Using Quadratic Formulation.

In other words, winning makes a bigger difference to the attendance of a team with a losing record than to that of a team with a winning record, which is all the more reason for teams with a losing record to improve their performance.

Table 3: Quadratic Regression Results (Equation 2).

Independent variable                   Coefficient   R-Square   Constant
Wins in the Previous Season            1.875553      0.13       87.08
Wins in the Previous Season Squared    -0.0730872
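The shape implied by the Table 3 estimates can be checked directly. Writing the fitted curve as a + b*w + c*w^2, with b = 1.875553 and c = -0.0730872 from Table 3 and w the number of wins in the previous season, the marginal effect of one more win is b + 2*c*w, and the curve peaks at w = -b/(2c), which is roughly 12.8 wins. The Java sketch below evaluates both; the class and method names are hypothetical.

```java
/** Hypothetical sketch: marginal effect and turning point of a fitted quadratic. */
final class QuadraticShapeSketch {
    private QuadraticShapeSketch() {}

    /** Slope of a + b*w + c*w^2 at w wins, i.e. the effect of one more win. */
    static double marginalEffect(double b, double c, double wins) {
        return b + 2.0 * c * wins;
    }

    /** Number of wins at which the fitted curve stops rising (requires c != 0). */
    static double turningPoint(double b, double c) {
        if (c == 0.0) {
            throw new ArithmeticException("a straight line has no turning point");
        }
        return -b / (2.0 * c);
    }

    public static void main(String[] args) {
        double b = 1.875553;   // coefficient on wins (Table 3)
        double c = -0.0730872; // coefficient on wins squared (Table 3)
        System.out.println("effect at 4 wins:  " + marginalEffect(b, c, 4));
        System.out.println("effect at 12 wins: " + marginalEffect(b, c, 12));
        System.out.println("turning point:     " + turningPoint(b, c));
    }
}
```

The marginal effect is much larger at 4 wins than at 12 and turns negative beyond the turning point, in line with the pattern described for Figure 2.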

5 CONCLUSIONS AND WAY FORWARD

The aim of this paper is to examine how a team’s performance affects its fan loyalty, controlling for other factors. The major finding is that the number of wins in a season has a significant impact on the loyalty of fans. Also, one major trend found in the quadratic formulation is that the rate of growth of fan loyalty decreases as the number of wins increases, showing that the fan loyalty of a lower-level team is affected much more by a win or a loss than that of a higher-level team.

The significance of the constant term suggests that many other factors may also shape fan loyalty, such as family history (who one’s family supports), home-town support, home-town income, the price of tickets, favourite players, and so on.

Even if a team has a winning record there is

always a possibility that a team does not make it to

the playoffs. A way to extend the analysis would be

to see to what extent attendance will depend on

whether or not a team makes it to the playoffs or even

wins the Super Bowl.

Similarly, one can substitute television market

share for stadium attendance as the dependent

variable. Television market share is usually defined

as percentage of TV homes in that market with TV

“physically tuned” into the game. It also provides a

good barometer to gauge the fan intensity for the

home team. One can also potentially look at team

merchandise sales or social media popularity as

representations of fan loyalty once more data

becomes available.

[Figure 2 axes: average attendance (percentage of stadium capacity) against number of wins in a season.]

Attendance as a % of Stadium Capacity

Independent variable           Coefficient   Standard Error   R-Square   Constant   F(1, 254)
Wins in the Previous Season    0.7257376     0.126            0.114      90.88      32.80


When the initial analysis is extended to test the significance of playoffs for home game attendance, it suggests a much bigger impact on team attendance than a mere winning record. A playoff appearance would increase average attendance relative to stadium capacity by 3.2 percent! This could of course be studied more carefully in the future.

Similarly, when the analysis is extended to the share of the local television market, the results suggest that each win in the current season increases that share by 1.4 percent and each win in the previous season increases it by almost 1 percent. A playoff appearance in the previous season, on the other hand, increases the share of the local TV market by almost 7 percent! Again, this is an area which could be studied in detail in the future. The analysis can also be extended to look at jersey and other team merchandise sales and social media popularity.

Professional sports organizations are multi-million dollar enterprises with millions of dollars spent on a single decision. With this amount of capital at stake, just one bad or misguided decision has the potential to set an organization back by several years. The results therefore suggest that owners and management, along with coaches and players, would benefit significantly from putting together a winning combination! Nothing matters more to fans than seeing their team win.

REFERENCES

Apache POI - the Java API for Microsoft Documents. (n.d.). Retrieved January 2, 2015, from http://poi.apache.org/

Baker, R.D. and I.G. McHale, 2013. Forecasting exact scores in National Football League games. International Journal of Forecasting 29(1), 122–130.

Boulier, Bryan L. and H.O. Stekler, 2003. Predicting the outcomes of National Football League games. International Journal of Forecasting 19(2), 257–270.

Hamadani, B., 2006. Predicting the outcome of NFL games using machine learning. http://cs229.stanford.edu/proj2006/BabakHamadani-PredictingNFLGames.pdf

Java Algorithms and Clients. (n.d.). Retrieved January 29, 2015, from http://algs4.cs.princeton.edu/code/

Lewis, M., 2003. Moneyball: The Art of Winning an Unfair Game. New York: W.W. Norton & Company, page 75/77.

Marchi, Leonardo De, 2011. Data Mining of Sports Performance Data. Erasmus computing 2010/2011. http://www.comp.leeds.ac.uk/mscproj/reports/1011/de_marchi.pdf

Math - Commons Math: The Apache Commons Mathematics Library. (n.d.). Retrieved January 14, 2015, from http://commons.apache.org/proper/commons-math/

Schumaker, R.P., O.K. Solieman and H. Chen, 2010. Sports Data Mining. Integrated Series in Information Systems, Volume 26. Springer.

Silver, Nate, 2012. The Signal and the Noise: Why So Many Predictions Fail - but Some Don't. New York: The Penguin Press.

Sinha, S., C. Dyer, K. Gimpel, and N.A. Smith, 2013. Predicting the NFL Using Twitter. Carnegie Mellon University, Pittsburgh PA 15213, USA.

Stekler, H.O., 2007. Sports Forecasting. Working Papers 2007-001, The George Washington University, Department of Economics, Research Program on Forecasting, revised Jan 2007.


APPENDIX

Java Programming
