
2 RELATED RESEARCH
Most past research on predicting donor giving behavior
relies on linear techniques. Connolly and Blanchette
(Michael S. Connolly, 1986) used discriminant analysis,
and Melchiori (Melchiori, 1988) used classification
analysis to predict donor behavior; both are linear
methods. These techniques are inappropriate when the
objective is to predict rare events (such as giving over
$10,000) or when the dependent variable has an upper or
lower bound and a large number of individuals sit at that
bound (as with giving, where numerous individuals have
given nothing).
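The boundary problem can be illustrated with a minimal sketch. The giving amounts and the single `engagement` predictor below are hypothetical toy values chosen for illustration, not data from the cited studies. The sketch fits an ordinary least-squares line by hand and counts how many predicted gifts fall below the zero bound.

```python
# Toy illustration (hypothetical data): a least-squares line fitted to
# giving amounts where most constituents sit at the lower bound of zero.

def fit_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

# Hypothetical predictor (e.g., an engagement score) and giving in dollars.
engagement = list(range(1, 11))
giving = [0, 0, 0, 0, 0, 0, 0, 0, 500, 20000]

intercept, slope = fit_ols(engagement, giving)
predicted = [intercept + slope * x for x in engagement]

# A linear fit is unaware of the bound, so it can predict negative gifts.
negative = [p for p in predicted if p < 0]
print(f"{len(negative)} of {len(predicted)} predicted gifts are below zero")
```

Because the fitted line is pulled upward by the one large gift, it has to dip below zero for low-engagement constituents, which no real gift can do.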
Brittingham and Pezzullo note that certain current
characteristics of alumni were found to be predictors
for major gift giving in some studies, but not oth-
ers (Brittingham and Pezzullo, 1990). Income, age,
number of degrees from the institution, emotional at-
tachment to the school, participation in alumni events,
and participation in and donation to other voluntary
and religious groups were found to be predictors.
Lindahl and Winship used logit analysis to predict which
individuals would give higher (e.g., $100,000) or lower
(e.g., $1,000) donations, based on data from the alumni
database together with geo-demographic information
(Winship and Lindahl, 1992). Their results showed that,
in the annual fund model, 92% of the dollars could be
collected by selecting 36.5% of the prospects. Their
later, upgraded model (Lindahl and Winship, 1994)
achieved slightly better performance for major gift
prediction. In this research, deep learning models
produced accurate test results on large data sets for
certain fundraising institutions, compared to some
shallow learning models described in empirical studies.
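A result of the form "92% of the dollars from 36.5% of the prospects" can be read as a cumulative-gains statistic: rank prospects by model score, select the top fraction, and measure the share of total dollars they account for. The sketch below shows one way to compute such a figure. The function name `dollars_captured` and the scores and gift amounts are hypothetical and are not taken from the cited studies.

```python
def dollars_captured(scores, gifts, fraction):
    """Share of total dollars given by the top `fraction` of prospects.

    Prospects are ranked by descending model score; `gifts` holds the
    dollars each prospect actually gave.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(zip(scores, gifts), key=lambda pair: pair[0], reverse=True)
    n_selected = max(1, round(fraction * len(ranked)))
    total = sum(gifts)
    if total == 0:
        return 0.0
    captured = sum(gift for _, gift in ranked[:n_selected])
    return captured / total

# Hypothetical scores and realized gifts for ten prospects.
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.05, 0.7, 0.6, 0.5]
gifts = [5000, 0, 100, 0, 0, 50, 0, 2000, 0, 250]

share = dollars_captured(scores, gifts, 0.4)
print(f"Top 40% of prospects account for {share:.1%} of dollars")
```

Sweeping `fraction` from 0 to 1 traces the full gains curve, which makes it straightforward to compare two models at the same selection depth.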
3 PROBLEM FORMULATION
The business problem at hand is to generate a ranked
list of constituents who have never given a major gift
as prospects, so that MGOs¹ can focus their time and
effort on them. To do so, we solve the problem of
determining which machine learning algorithm can best
learn to distinguish between major donors and non-major
donors and then use that algorithm to predict future
major donors.

¹ Fundraising institutions employ major gift officers
(MGOs) to seek out and ‘convert’ major donor prospects.
These MGOs can spend years developing a relationship
with potential major donors and thus the decision
concerning with whom to begin a relationship is an
important one (Gift’s, 2021).
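The ranking step described above can be sketched as follows, assuming some already-trained model is available as a scoring function. The `Constituent` record, the `rank_prospects` helper, and the stand-in `toy_score` function are all hypothetical names introduced for illustration; they are not the models or data structures used in this paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Constituent:
    constituent_id: str
    features: List[float]
    has_major_gift: bool  # True if the constituent has already given a major gift


def rank_prospects(
    constituents: Sequence[Constituent],
    score_fn: Callable[[List[float]], float],
    top_k: Optional[int] = None,
) -> List[Constituent]:
    """Rank constituents with no major gift by descending model score."""
    candidates = [c for c in constituents if not c.has_major_gift]
    ranked = sorted(candidates, key=lambda c: score_fn(c.features), reverse=True)
    return ranked if top_k is None else ranked[:top_k]


def toy_score(features: List[float]) -> float:
    """Stand-in for a trained classifier's major-donor probability (assumption)."""
    return sum(features)


people = [
    Constituent("A", [0.2, 0.1], False),
    Constituent("B", [0.9, 0.8], True),   # existing major donor, excluded
    Constituent("C", [0.7, 0.6], False),
    Constituent("D", [0.4, 0.5], False),
]

prospects = rank_prospects(people, toy_score, top_k=2)
prospect_ids = [c.constituent_id for c in prospects]
print(prospect_ids)
```

Keeping the scoring function as a parameter lets different learning algorithms be compared under the same ranking procedure.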
The process of securing a major gift generally takes
over a year and involves several touch points from the
MGO. Typically, an MGO meets in person with a major gift
candidate on several occasions before a gift can be
secured. This differs from non-major gifts, where there
is generally just one touch point: an email, phone call,
or direct mail solicitation. Fundraising institutions
must be aware of their cost per dollar raised, so when an
MGO spends the institution's money and time on a
prospect, the prospect must have the potential to give a
large gift. Thus, it is imperative that the model
ordering the major donor prospects be accurate, since so
much time (and money) will be spent on each prospect.
The data used in the experiments is provided
anonymously by Anonymous, an Anonymous-based
company whose objective is to help non-profit orga-
nizations raise more money by focusing on turning
one-time donors into lifetime supporters. Anonymous
works with organizations such as universities and
disease-related fundraising institutions. It creates
personalized emails and develops donor profiles based on
constituents' interactions with the software. This
approach generates a large amount of data, which is
provided to machine learning algorithms to help achieve
the objective of this research.
The major donor data generated by Anonymous
is based on constituent interaction with fundraising
institutions. For our experiments, we collected data
from 8 fundraising institutions, as shown in Table 1.
Table 1: Data sets from 8 fundraising institutions (FIs)
from 3 verticals (disease, education, religious).

Representation    Type of FI
AlzF              Alzheimer’s FI
CF                Cancer FI
EF-1              Educational FI 1
EF-2              Educational FI 2
EF-3              Educational FI 3
EF-4              Educational FI 4
RF-1              Religious FI 1
RF-2              Religious FI 2
These data sets have far fewer major donors than
non-major donors as seen in Table 2. This means
the major donor data is heavily skewed towards non-
major donors and must be balanced before training a
model (Lee et al., ). Note that fundraising institutions
EF-1, EF-2, EF-4 and RF-1 had significantly more
major donors than the other 4 fundraising institutions
and we focus our attention on these. We examine the