Adding Time and Subject Line Features to the Donor Journey
Greg Lee¹, Ajith Kumar Raghavan² and Mark Hobbs²
¹Jodrey School of Computer Science, Acadia University, 27 University Avenue, Wolfville, Canada
²Fundmetric, 1526 Dresden Row, Halifax, Canada
Keywords:
Machine Learning, Time Series Data, Charitable Giving.
Abstract:
The donor journey is the path a charitable constituent takes on their way to making a donation. Charities
are moving towards more electronic communication and most appeals are now sent via email. The donor
journey can be followed electronically, monitoring constituent and charity actions. Previous research has
shown that it is possible to use past actions of a donor to predict their next gift within $25. We build on this
research by adding new features that capture the time between actions, as well as new email features, including
subject line features, in such a way as to isolate their effect on model accuracy. These additions yield a small
improvement in the accuracy of recurrent neural network models for most charities, indicating that these features
do indeed help deep learning methods understand the donor journey.
1 INTRODUCTION
On a day-to-day basis, the question most often asked within charities is “What do we do next?”, and this
question is asked about individual constituents (peo-
ple in the charity’s database). The goal of a charity is
to maximize how much money it raises in the long
term, so simply answering this question with “ask
them for money” is an oversimplification, since con-
stituents are not going to give gifts to a charity every
day. Instead, the charity might want to send a thank
you letter (an example of stewardship) or wait and let
the constituent take an action on their own, such as
visiting the charity’s website. While sending a solic-
itation email is likely the right action at some point
along the donor journey, it is not necessarily the right
action at any given time.
The donor journey is the set of chronological ac-
tions a constituent and a charity take while the con-
stituent interacts with a charity. These include the
charity sending emails and the constituent opening
these emails, clicking links, and of course, donating
to the charity. Charities seek to optimize the donor
journey by performing the right actions at the right
times to maximize a donor’s lifetime value (i.e., max-
imize how much money the donor gives to the char-
ity). This is best achieved not only by maximizing donations, but also by reducing costs and donor fatigue. All ap-
peals have associated costs which must be subtracted
from the revenue in order to calculate the actual gain
for the charity. In addition, when donors are asked
for money too often, they can become less likely to
donate in the future (Canals-Cerda, 2014).
Previous research on the donor journey (Lee et al.,
2022; Lee et al., 2020b) has shown that a constituent’s
donation amount can be estimated within a $25 mean
absolute error (MAE) using deep learning methods on
a chronological set of actions described by features
of their associated email. Sample data could be the action “opened” with associated email features of 311 words, 3 paragraphs, and 2 variables. Window sizes
varied between 1 and 25 for these experiments. Here
the window size is how many past actions the deep
learning algorithm is allowed to consider when trying
to predict the donation amount.
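For reference, the MAE used throughout is the standard mean absolute error over the test set (the notation below is ours, not the original authors'):

\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|

where y_i is the actual donation amount for test example i (0 for a non-donor), \hat{y}_i is the model's prediction, and N is the number of test examples.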
We extend the work of (Lee et al., 2022; Lee et al., 2020b) by adding time features, email subject line features, and other email features to the data, in order to see whether these features help deep learning algorithms improve their accuracy in predicting which sequence of actions will generate the greatest lifetime value across a database of constituents.
As the authors did in (Lee et al., 2022; Lee et al.,
2020b), we focus on email appeals; a sample charitable email is shown in Figure 1. Here, a university
foundation makes an appeal to members of the univer-
sity community (alumni, faculty, staff, and friends) in
an effort to help students adversely affected by the
COVID-19 pandemic. This email was one of 27 sent
over an 8-month period and statistics were gathered
concerning how often the email got to constituents,
how often they opened it, and various other actions
we describe later. Note that many of the 27 emails
were sent to very specific groups of constituents (e.g.,
foundation board members or those who did not open
a previous email) so no constituent received more than
a few emails concerning this cause.
Since the only action a charity can take within an
email campaign is to send an email, we also query
the most accurate models found with a range of email
parameters and observe which parameter values are
most commonly regarded as those that will lead to
higher donations. Given that some of the features we
use were not used in previous work, charities now will
have suggestions for email parameters they could not
access previously, such as how many words to put in
a subject line and how many font colours to use.
The rest of the paper is organized as follows. We next describe related research and then formulate the problem. Following this, we describe our approach and our experimental setup. The paper concludes with empirical results, conclusions, and future work.
2 RELATED RESEARCH
In this research, we learn to model the donor jour-
ney by predicting donation amounts based on past
actions and query that model to select best actions
and email parameters. This is a relatively new field and little work has been done prior to this research.
Most donor journey advice amounts to bullet points
on websites (McLellan, 2022).
A few articles have investigated direct mail con-
tent through trials with Red Cross mailings. The authors found that enrollment cards led to repeat donations, while providing donors with gifts hurt retention (Ryzhov et al., 2015). Email campaigns are generally evaluated based on constituent demographics, interests, and social network influence, as well as external time-related factors, using shared best practices.
While for-profit organizations have been using ma-
chine learning models for predicting the customer
journey (Lemon and Verhoef, 2016), charities are
having trouble adapting to these techniques.
Machine learning has been applied to predicting
donations to charities. In (Lee et al., 2022; Lee et al.,
2020b) the donor journey is studied extensively in
terms of adding constituent features, experimenting with multiple deep learning algorithms, and combining data across charities. We build on this research in this paper. Machine learning has also been applied to
Figure 1: A sample email solicitation, redacted to maintain
anonymity.
individual predictions, such as “who is lapsed, but most likely to return”. These predictions provide charities with lists of constituents whom they can act on in a uniform manner, since machine learning algorithms predict they will all take a given action. These are point-in-time predictions that ignore the time aspect
of the donor journey (Lee et al., 2020a).
Recurrent neural networks (RNNs) are special-
ized artificial neural networks that can use sequential
data to learn based on chronological events. They
can make use of long short-term memory (LSTM)
in order to keep track of relevant events from the
past while discarding less important events, through
learning. RNNs have been used to predict customer
churn, which is related to lapsed donors in charitable
giving (Sudharsan and Ganesh, 2022). Bidirectional
RNNs with LSTMs (BDLSTMs) can consider time
series data in either direction, essentially looking both
forward and backward with two separate RNNs, using
backpropagation (Moolayil, 2019).
Convolutional neural networks (CNNs) are widely used for image and video classification, but can be
used on sequential data as well (Xia and Kiguchi,
2021). CNNs can extract features from sequential
data and map the internal features of the sequence
to the previous layer for each convolutional layer in
the model. This network is effective for deriving
features from a fixed length segment of the overall
dataset (Kim, 2014). CNN LSTMs were developed
for predicting visual time series problems, as well as to generate text from a sequence of images (Wang et al.,
2018). They have been used in various time series
prediction tasks, such as predicting air quality (Yan
et al., 2021).
In our work, we experiment with each of RNNs,
BDLSTMs, CNNs, and CNN LSTMs in order to see
which algorithm can produce the best models for pre-
dicting the next best action to take in the donor jour-
ney, in terms of maximizing donations, given the ad-
dition of time and new email features.
3 PROBLEM FORMULATION
In general, the issue at hand is to help charities raise
more money. This can be done on a point-in-time
scale, by asking questions such as “who is likely to
upgrade to a $500 gift?” and creating training data
for machine learning algorithms based on constituent features at that given time. This would include fea-
tures such as “maximum donation” and “number of
emails opened”. We can make use of these features in
our work, but focus more on the order of constituent
and charity actions, and on what action to take next
rather than which behaviour a constituent is likely to
exhibit at some arbitrary point in the future.
The donor journey can be modeled by considering
the past n actions of the constituent and the charity
with respect to that constituent, and using that infor-
mation to try to predict the next best action for the
charity or constituent to take. Charities generally do
this by hand with “common sense” rules, such as “do
not send another solicitation email until 3 months af-
ter receiving a donation”. While many of the rules
charities use likely work in many situations, we seek
to eliminate bias and error from the process of under-
standing the donor journey, and use machine learning
to arrive at data-driven rules for charities to follow, on
an individual basis.
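As a rough formalization (our notation, not drawn from the cited papers), the learning task is to approximate a function f over the window of the n most recent encoded actions:

\hat{d} = f\left(\mathbf{a}_{t-n+1}, \ldots, \mathbf{a}_{t}\right)

where each \mathbf{a}_i concatenates a one-hot encoding of an action with the features of its associated email (and, in this work, time and subject line features), and \hat{d} is the predicted donation amount.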
The actions considered in this and previous work
in this area are shown in Table 1. For one of the chari-
Table 1: The list of all actions used in our experiments for
every charity.
Action: Description
No Action: Filler action when none happened
Delivered: Successfully delivered email
Opened: The constituent opened an appeal email
Pageview: The constituent viewed the donation portal
Donated: The constituent made a donation
Clicked: The constituent clicked on an email link
Complained: The constituent reported an appeal email as spam
Dropped: The appeal email did not reach the constituent
Bounced: The appeal email was blocked by the constituent
Unsubscribed: The constituent unsubscribed from a mailing list
Table 2: The list of all actions used in our experiments exclusively for a university foundation (C5).
Action: Description
Virtual Response: Made a social media comment
Attended: Attended a university event
Prospect Visit: A major gift officer visited the constituent
Volunteer Member: Volunteered for a foundation committee
Purchased: Purchased an event ticket
Recurring signup: Signed up for the same activity 2+ times
Volunteer: General volunteering
Participant: Participated in an advisory circle
Staff: Foundation staff action
Current: Continued volunteering
Mentors: Participated in accelerator mentoring
Ex Officio: Historic trustee action
Trustee Term: Alumnus trustee action
Participating Host: Off-campus event
Suppressed: Unknown email error
Opt out: Opted out of some email options
Failed: Email did not reach constituent
Trustee Liaison: Relevant board participation
Former: Former board member action
ties used extensively in our experiments, extra actions
were available in the data, shown in Table 2. Note
that all actions are taken by the constituent except for
the “delivered” action, which is taken by the char-
ity. Since we are working with email campaigns, all
actions in this experiment have an associated email,
except for the ’no action’ action, for which all email
features are set to 0. ’No action’ is necessary to pad
donor journeys that are not n actions long (i.e., a con-
stituent who is newer to the charity than most con-
stituents), and since no action was taken, there cannot
be an associated email. The email features used both
in this work and previous work in this area are given
in Table 3.
Table 3: Variable email parameters.
Parameter: Description
Words: Number of words
Paragraphs: Number of paragraphs
Images: Number of images
Links: Number of HTML links
Blocks: Number of sections
Divs: Number of HTML content division elements
Editable Content Divs: Number of editable HTML divs
4 OUR APPROACH
To model the donor journey, we use deep learning
algorithms that are sensitive to time steps, in order
to take advantage of the chronological aspect of the
donor journey. Opening an email and donating is not
the same as donating and then opening an email, and
thus we sought algorithms that can recognize this dif-
ference. An example of a sequence of actions lead-
ing to a donation would be a constituent receiving
an email, opening it, opening it again, clicking a
link in that email, clicking another link in that email,
and viewing the donation portal, which would be
registered as {Delivered, Opened, Opened, Clicked,
Clicked, Pageview}. To each of these actions, we add
the corresponding email features.
To build training data, each action is one-hot encoded and its corresponding email features are appended. In the experiments, this data is augmented
with the features that we describe in Table 4 in order
to observe the effect of the new features on model ac-
curacy. An example of a set of actions is shown in
Figure 2.
In Figure 2, a window of 6 actions is used, and only select actions and email features are shown for space reasons. The top of the figure shows the sequence of
actions with the donation amount ($150), while the
bottom shows the one-hot encoded actions with ap-
pended email features. The first action (A1) is the
“no action” action and thus has no associated email
features, which is why they are all 0 in the figure.
The second action is a true action (A2), and has as-
sociated email features. Here, E1 could be 220 words
and E4 could be 11 images. The third action is the
same action as the second action (A2) and has the
same associated email features. This could be a case
where the constituent opened an email twice in a row.
The fourth action (A3) is a different action from the
second and third action and has a different associated
email. The fifth action has the same associated email
as the fourth action, but is a different action (A10).
Finally, the sixth action is the same as the second and
third actions (A2), but has different email parameter
values, so this could be the constituent opening up a
different email.
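To make this encoding concrete, the following is a minimal sketch (our own illustration, not the authors' code) of how a window of actions with appended email features could be assembled; the action vocabulary and email feature names follow Tables 1 and 3, but the function and variable names are assumptions.

```python
import numpy as np

# Hypothetical action vocabulary and email feature order (for illustration only;
# see Tables 1 and 3 for the full lists used in the paper).
ACTIONS = ["No Action", "Delivered", "Opened", "Pageview", "Donated",
           "Clicked", "Complained", "Dropped", "Bounced", "Unsubscribed"]
EMAIL_FEATURES = ["words", "paragraphs", "images", "links"]

def encode_action(action, email):
    """One-hot encode an action and append its email features.

    'No Action' rows carry all-zero email features, matching the padding
    convention described above."""
    one_hot = np.zeros(len(ACTIONS))
    one_hot[ACTIONS.index(action)] = 1.0
    feats = np.array([email.get(f, 0) for f in EMAIL_FEATURES], dtype=float)
    return np.concatenate([one_hot, feats])

def encode_window(journey, window_size):
    """Encode the last `window_size` actions, left-padding with 'No Action'."""
    padding = [("No Action", {})] * max(0, window_size - len(journey))
    padded = padding + journey[-window_size:]
    return np.stack([encode_action(a, e) for a, e in padded])

# Example: {Delivered, Opened, Opened} padded to a window of six actions.
journey = [("Delivered", {"words": 220, "images": 11}),
           ("Opened",    {"words": 220, "images": 11}),
           ("Opened",    {"words": 220, "images": 11})]
X = encode_window(journey, window_size=6)   # shape (6, 14)
```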
For all experiments, the deep learning algorithms
are provided with training data in the form of Fig-
ure 2. Window sizes vary between 1 and 25. While
CNN LSTMs were the best performing algorithm in
previous research (Lee et al., 2022; Lee et al., 2020b),
we experiment with RNNs, BDLSTMs, and CNNs as
well to see the effect of the new features on these al-
gorithms, and to see whether the new features can actually improve their accuracies to the point of
Figure 2: A sample set of six actions used as training data
for deep learning models. The actions are one-hot encoded
and have corresponding email features appended.
Table 4: The new features added to training data. These are
divided into time features, subject line features, and email
features for clarity.
Feature: Description
Time Features
Time Since Last Action: Time in seconds since the last action
Time Since Last Same Action: Time in seconds since the last action of the same type
Subject Line Features
Subject Line Characters: Number of characters in the subject line
Subject Line Words: Number of words in the subject line
Subject Line Variables: Number of variables in the subject line
Email Features
Special Characters: Number of special characters
Font Colours: Number of font colours
Background Colours: Number of background colours
matching or surpassing that of the CNN LSTMs. In particular, since RNNs and BDLSTMs are more suited to sequential data, we consider it important to continue to evaluate their performance on this task.
Donor actions are given in sequence, but the ele-
ment of time is missing. In previous work, if action
B followed action A, there could have been years be-
tween these actions, or seconds, and the data did not
provide any information to allow the machine learn-
ing algorithms to distinguish between these situations.
We introduce two new features - time since last ac-
tion, and time since last same action. Time since last
action measures the time between the current action
and the previous action, while time since last same ac-
tion measures how much time has elapsed since an ac-
tion of the same type was taken (e.g., if the last action
was ‘delivered’, how long it has been since the previous ‘delivered’ action). We hypothesize that these
features could provide crucial information about the
meaning of actions following each other in the donor
journey. Figure 3 shows the data from Figure 2 aug-
mented with these two features.
In Figure 3 we give the times in a readable format,
but they are given as seconds in the training data. The
deep learning algorithms are now given information
about how long it has been since actions were taken.
While the times given are fabricated, we can see how
the sixth action happened 2 days after the last same
action (the third action) by summing 1 day, 6 hours,
Figure 3: The action set from Figure 2 with TA (Time since
last action) and TSA (Time since last same action) added.
and 18 hours between them in terms of time since last
action. We can also see that the third action happened
20 minutes after the second action, but it was an en-
tire day later that the fourth action happened. Also,
the constituent has not taken action A3 in a year until they do so on their fourth action. Without time features, none of this information is available to machine learning algorithms.
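To illustrate how these two features could be computed (a minimal sketch under our own assumptions about the event layout, not the authors' pipeline):

```python
from datetime import datetime

def add_time_features(events):
    """Add 'time since last action' and 'time since last same action' (seconds).

    `events` is a chronological list of dicts with 'action' and 'timestamp'
    keys; this layout is an assumption made for the example."""
    last_time = None
    last_time_by_action = {}
    for ev in events:
        t = ev["timestamp"]
        ev["time_since_last_action"] = (
            (t - last_time).total_seconds() if last_time else 0.0)
        prev_same = last_time_by_action.get(ev["action"])
        ev["time_since_last_same_action"] = (
            (t - prev_same).total_seconds() if prev_same else 0.0)
        last_time = t
        last_time_by_action[ev["action"]] = t
    return events

events = add_time_features([
    {"action": "Delivered", "timestamp": datetime(2021, 3, 1, 9, 0)},
    {"action": "Opened",    "timestamp": datetime(2021, 3, 1, 9, 20)},
    {"action": "Opened",    "timestamp": datetime(2021, 3, 2, 9, 20)},
])
# The third event is 86400 s after the previous action and 86400 s after
# the previous 'Opened' action.
```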
With emails, before the reader sees the email mes-
sage, they typically see the subject line. Subject
lines have been shown to affect response rates to sur-
veys (Sappleton and Lourenço, 2016). We add 3 subject line features to the set of features describing an email associated with an action (Table 4). Subject line
characters and subject line words are related, but it is
possible to have many short words, or few long words
and have the same number of characters. Subject line
characters gives a measure of the overall length of the
subject line, while subject line words help the algorithms know how these characters are split up. Subject
line variables count how many variables there are in
the subject line, where a variable is a placeholder for a
value that can be personalized (e.g., the constituent’s
first name in “Hi (first name), we want to thank you
for your donation!”).
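A simple extraction of these three subject line features might look as follows (a sketch; following the example above, we treat any parenthesized span as a personalization variable, which may not match the actual templating syntax):

```python
import re

def subject_line_features(subject):
    """Return the character, word, and variable counts for a subject line.

    Treats any parenthesized span as a personalization variable, following the
    example above; the real placeholder syntax may differ."""
    return {
        "subject_line_characters": len(subject),
        "subject_line_words": len(subject.split()),
        "subject_line_variables": len(re.findall(r"\([^)]*\)", subject)),
    }

print(subject_line_features("Hi (first name), we want to thank you for your donation!"))
# {'subject_line_characters': 56, 'subject_line_words': 11, 'subject_line_variables': 1}
```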
We also add the following email features: special character count, font colours, and background colours, to investigate their effect on the donor journey.
Special character count counts how many “$” and
“&” characters are in the email. This can be thought
of primarily as how often money is mentioned, but
also how often the constituent is seeing symbols in
the email. The font colours feature counts the number of font colours used in the email. Some emails have only
one font colour (black) but many fundraising emails
have several, in order to try to make the email ap-
pear more inviting and exciting. A similar story holds
for background colour, where often the only back-
ground colour is white, but some emails have many
backgrounds throughout, again to try to appear more
enticing to the reader.
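These appearance features could be derived from the raw email HTML roughly as below (a sketch using simple regular expressions over inline styles; real emails may define colours in CSS classes or external stylesheets, so this is only illustrative):

```python
import re

def email_appearance_features(html):
    """Count '$'/'&' special characters and distinct font/background colours.

    Assumes colours appear as inline 'color:' / 'background-color:' styles;
    note that '&' counts will also pick up HTML entities."""
    special_chars = html.count("$") + html.count("&")
    font_colours = set(re.findall(r"(?<!-)color\s*:\s*([#\w]+)", html, re.I))
    background_colours = set(
        re.findall(r"background(?:-color)?\s*:\s*([#\w]+)", html, re.I))
    return {
        "special_characters": special_chars,
        "font_colours": len(font_colours),
        "background_colours": len(background_colours),
    }
```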
Architectures for the deep learning algorithms
were kept the same as the best architectures in (Lee
et al., 2022; Lee et al., 2020b) for consistency in terms
of comparison, and because the authors stated that
Figure 4: The best deep learning architectures discovered
empirically and used in our experiments. From top to bot-
tom they are: CNN, CNN LSTM, RNN, and BDLSTM.
those architectures had been optimized for the data at
hand. The best deep learning architectures are shown
in Figure 4.
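For orientation, a windowed regressor of the BDLSTM variety could be defined in Keras roughly as below; the layer sizes here are placeholders and do not reproduce the exact architectures of Figure 4, which were taken from the earlier work.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_bdlstm(window_size, n_features, units=64):
    """Illustrative bidirectional LSTM regressor over a window of encoded actions.

    Layer sizes are placeholders; the paper reuses the architectures of
    Figure 4 rather than this exact configuration."""
    model = keras.Sequential([
        keras.Input(shape=(window_size, n_features)),
        layers.Bidirectional(layers.LSTM(units)),
        layers.Dense(32, activation="relu"),
        layers.Dense(1),                          # predicted donation amount
    ])
    model.compile(optimizer="adam", loss="mae")   # train directly on mean absolute error
    return model

# Example: window of 20 actions, each encoded as a one-hot action plus email
# and time features (the feature count here is arbitrary).
model = build_bdlstm(window_size=20, n_features=22)
```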
5 EMPIRICAL EVALUATION
Our experiments involve comparing the performance
of four deep learning algorithms (CNNs, CNN
LSTMs, RNNs, and BDLSTMs) using the features in
previous work to their performance with those fea-
tures + the features described in Table 4. We seek to
understand if these new features can help any charities
better model the donor journey for their constituents.
In order to try to isolate the effect of the new features
we are adding, we omit constituent features from the
data. In previous researchers work on this data (Lee
et al., 2022; Lee et al., 2020b) constituent features
generally helped lower MAE, but did not have a large
effect on it, so omitting them should not have a large
effect on the overall measurements. We also do not combine data across charities, since for most charities this is not an option, as they do not have access to other charities’ data. We believe this compar-
ison gives a more accurate assessment of whether the
features we are adding help deep learning algorithms
learn more accurate donor journey models.
All experimental results are averaged over 25 cross-validation runs, where at each run we balance donors and non-donors. This is done by selecting all of the smaller set (donors) and an equal-sized random sample of the larger set (non-donors) for the training set, choosing a different subset of the larger set at each subsequent iteration. While this is a regression problem, it is as important for charities to distinguish between donors and non-donors as it is to determine how much a donor may give. Data is split into 75% training and 25% testing.
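A minimal sketch of one such balanced run (our own illustration; the function and variable names are assumptions):

```python
import numpy as np
from sklearn.model_selection import train_test_split

def balanced_run(X_donors, y_donors, X_non_donors, y_non_donors, seed):
    """One balanced run: keep all donors and draw an equal-sized random sample
    of non-donors, then split 75% / 25% into training and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X_non_donors), size=len(X_donors), replace=False)
    X = np.concatenate([X_donors, X_non_donors[idx]])
    y = np.concatenate([y_donors, y_non_donors[idx]])
    return train_test_split(X, y, test_size=0.25, random_state=seed)

# Results are then averaged over 25 such runs, each drawing a different
# non-donor subset:
#   for seed in range(25):
#       X_train, X_test, y_train, y_test = balanced_run(Xd, yd, Xn, yn, seed)
#       ...train a model and record its test MAE...
```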
Table 5: Summary of training data from five charities.
C1 C2 C3 C4 C5
Donors 640 229 316 27 1258
Non-Donors 195669 195688 50811 60 173363
Total Raised $54,387 $55,952 $130,034 $6,285 $364,133
Mean Don. $85 $245 $409 $233 $290
Median Don. $50 $100 $100 $100 $100
Standard Dev. $105 $514 $1520 $585 $1,035
Min Don. $5 $1 $1 $10 $5
Max Don. $1,000 $5,000 $10,000 $3,000 $25,000
5.1 Preliminary Experiments
The charitable data used in our preliminary experi-
ments is the same data used in previous research (Lee
et al., 2022; Lee et al., 2020b). This data is described
in Table 5. C1 is a wildlife charity, C2 is a disease
charity, C3 is a youth charity, C4 is a disease charity
and C5 is a university foundation. This data comes
from Fundmetric (www.fundmetric.com), a machine learning platform that provides anonymized data that mirrors the real-world completeness of most data sets for nonprofits.
Table 6: Preliminary experiment using all 5 charities, with
window size 20 and a CNN.
C1 C2 C3 C4 C5
Without Time Features $26 $490 $195 $59 $46
With Time Features $35 $290 $138 $57 $52
Table 6 compares the MAE for the five charities
when time features are added to the action data to
the MAE without time features, using window size 20
for each charity, and the CNN algorithm. The MAE
for C2 and C3 saw a 41% and a 29% drop respec-
tively when adding time features to the data. This still
constitutes a $290 and $138 MAE for these charities, which is too high an error to use the model for suggesting email parameters for those charities. For C1
and C5, where the MAEs were lower in previous ex-
periments, there was a slight increase in MAE with
the addition of time features.
In subsequent experiments, C1 and C5 were the
focus, in an attempt to further lower the MAE for
their donor journeys, since it was these charities for
which MAE was at more acceptable levels in terms
of a charity making decisions based on the results of
donor journey experiments, and since C5 has extra ac-
tions making it a different dataset on which to train.
We present the C2 and C3 results here to show that
when MAE is high, time features can be added in or-
der to move towards a more acceptable error. C4 has a small data set (only 27 donors and 60 non-donors) and we did not include it in further experiments as a result.
The following experiments show results for C1
and C5 when adding time features (Section 5.2), and
then adding the new email features (Section 5.3), us-
ing window sizes from 1 to 25. We experiment with
each of RNNs, BDLSTMs, CNNs, and CNN LSTMs
for each experiment, to see the effect of the newly
added features on their performance with respect to
MAE.
5.2 Experiment 1: Adding Time
Features
Tables 7, 8, 9, and 10 show the change in MAE when
the two time features are added to the data compared
to the MAE without these features for four deep learn-
ing algorithms. In all results tables, bold values show
the lower MAE in a comparison of data, and bold
italic values show the lowest value in the table for a
given charity.
For CNNs and CNN LSTMs, there is an increase
in MAE with almost every window size for both C1
and C5. On the contrary, for RNNs and BDLSTMs,
there is a decrease in MAE for most window sizes for
both C1 and C5. Thus, the extra features seem to help
RNN-based deep learning algorithms. In addition, the
MAEs are generally lower for RNNs and BDLSTMs
than they are for CNNs and CNN LSTMs, suggesting
that adding time features and using RNN-based deep
learning algorithms is a charity’s best choice to ob-
tain the most accurate donor journey model. We next
experiment with adding subject line and other email
features to the time features data.
Table 7: Comparing the performance of CNNs with newly added time features to data without these features. “TF” stands for time features being added to the data.
Window Size
1 3 6 10 15 20 25
C1 $32 $33 $26 $34 $29 $27 $27
C1TF $43 $42 $44 $43 $39 $41 $40
C5 $49 $49 $44 $49 $49 $50 $57
C5TF $52 $52 $46 $41 $52 $52 $50
Table 8: Comparing the performance of CNN LSTMs with newly added time features to data without these features. “TF” stands for time features being added to the data.
Window Size
1 3 6 10 15 20 25
C1 $41 $41 $36 $35 $36 $42 $33
C1TF $41 $42 $43 $41 $40 $39 $39
C5 $49 $46 $47 $48 $47 $49 $46
C5TF $53 $52 $52 $53 $53 $52 $53
Table 9: Comparing the performance of RNNs with newly added time features to data without these features. “TF” stands for time features being added to the data.
Window Size
1 3 6 10 15 20 25
C1 $43 $31 $30 $28 $27 $26 $27
C1TF $25 $24 $24 $24 $25 $24 $24
C5 $55 $55 $55 $54 $55 $50 $48
C5TF $44 $38 $40 $41 $40 $40 $41
Table 10: Comparing the performance of BDLSTMs with newly added time features to data without these features. “TF” stands for time features being added to the data.
Window Size
1 3 6 10 15 20 25
C1 $43 $31 $29 $28 $27 $28 $25
C1TF $28 $25 $28 $24 $25 $26 $24
C5 $53 $52 $42 $41 $43 $40 $50
C5TF $39 $38 $48 $40 $40 $40 $38
5.3 Experiment 2: Adding Subject Line
Features and More Email Features
We next added new email features, including those
describing the subject line for the email associated
with each action. Tables 11, 12, 13, and 14 show
the change in MAE when these features are added to
the data compared to the MAE without these features
for four deep learning algorithms.
As with adding just time features, for CNNs and
CNN LSTMs, there is an increase in MAE with
almost every window size for both C1 and C5. On
the contrary, for RNNs and BDLSTMs, there is a de-
crease in MAE for most window sizes for both C1
and C5. Thus, the extra features seem to help RNN-
based deep learning algorithms.
For CNNs the MAEs are lowered when adding
new email and subject line features to time features,
while CNN LSTMs are unaffected. For RNNs the ad-
dition of new email and subject line features actually
caused an increase in MAE compared to only adding
time features, although these MAEs were still lower
than not adding new features at all. For BDLSTMs,
adding the new email and subject features had mixed
results compared to only adding time features, but for
C5 it was generally better to have all of the new fea-
tures in terms of MAE.
Overall, the lowest MAE for C1 data was $24
which was achieved by both RNNs and BDLSTMs
with time features and with time features and new
email features. For C5, the lowest MAE was $36
which was achieved using a BDLSTM with both time
features and new email features. These MAEs are
lower than any achieved in previous work (Lee et al.,
Table 11: Comparing the performance of CNNs with new features (time and new email features) to data without new features. “NewF” stands for new email and subject line features being added to the data.
Window Size
1 3 6 10 15 20 25
C1 $32 $33 $26 $34 $29 $27 $27
C1NewF $40 $42 $42 $39 $40 $39 $38
C5 $49 $49 $44 $49 $49 $50 $57
C5NewF $52 $58 $52 $53 $52 $52 $51
Table 12: Comparing the performance of CNN LSTMs with new features (time and new email features) to data without new features. “NewF” stands for new email and subject line features being added to the data.
Window Size
1 3 6 10 15 20 25
C1 $41 $41 $36 $35 $36 $42 $33
C1NewF $41 $42 $42 $42 $42 $42 $41
C5 $49 $46 $47 $48 $47 $49 $46
C5NewF $51 $50 $52 $53 $54 $53 $53
2022; Lee et al., 2020b) when constituent features
and data combination were not used and thus show
that the addition of time and new email features helps
create deep learning models better capable of captur-
ing the donor journey.
5.4 Experiment 3: Querying the Most
Accurate Models
In order to ensure the models learned were reason-
able, we queried them to see which actions they
thought would lead to the highest donation amount
across donor journeys. This involved taking the last
n actions associated with a constituent and shift-
ing them back a position, and querying the model
with each of the possible actions, as shown in Fig-
ure 5. Doing this resulted in the highest predicted gift
amount being associated with pageview, donated, and
delivered in 80% of cases and actions such as com-
plained never being predicted to produce the high-
est donation amount. We interpreted this to mean the
model had learned positive actions help increase do-
Table 13: Comparing the performance of RNNs with new features (time and new email features) to data without new features. “NewF” stands for new email and subject line features being added to the data.
Window Size
1 3 6 10 15 20 25
C1 $43 $31 $30 $28 $27 $26 $27
C1NewF $24 $24 $32 $24 $24 $24 $24
C5 $55 $55 $55 $54 $55 $50 $48
C5NewF $44 $39 $41 $40 $44 $46 $48
Table 14: Comparing the performance of BDLSTMs with new features (time and new email features) to data without new features. “NewF” stands for new email and subject line features being added to the data.
Window Size
1 3 6 10 15 20 25
C1 $43 $31 $29 $28 $27 $28 $25
C1NewF $25 $25 $30 $29 $24 $25 $24
C5 $53 $52 $42 $41 $43 $40 $50
C5NewF $36 $41 $43 $41 $41 $42 $41
Figure 5: When querying the models, all actions are shifted
back by 1 and a new action is inserted into the last posi-
tion. The model is then queried for a predicted donation
amount with this set of actions. Email and time features are
included, but are not shown here.
We interpreted this to mean that the model had learned positive actions help increase donations, which serves as a further sanity check on its reasoning.
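The shifting-and-querying step of Figure 5 could be sketched as follows (our illustration; the helper name and array layout are assumptions): the window is shifted back one position, each candidate action is placed in the last slot, and the model's predicted donation amounts are compared. The same loop can also sweep a range of values for a single email parameter on the delivered candidate, which is how the parameter query described below is carried out in spirit.

```python
import numpy as np

def best_next_action(model, window, candidate_vectors):
    """Shift the window back one position, place each candidate action in the
    last slot, and return the candidate index with the highest predicted
    donation. `window` has shape (window_size, n_features) and each candidate
    is an encoded action vector; both layouts are assumptions for this sketch."""
    shifted = np.roll(window, -1, axis=0)    # drop the oldest action, free the last slot
    predictions = []
    for candidate in candidate_vectors:
        shifted[-1] = candidate
        batch = shifted[np.newaxis, ...]     # batch of one constituent
        predictions.append(float(model.predict(batch, verbose=0)[0, 0]))
    return int(np.argmax(predictions)), predictions
```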
Ultimately, since we cannot choose the actions of
the constituent, we want to be able to optimize the ac-
tion the charity can take, which is sending an email
(delivered). We queried the most accurate models
from Experiments 1–3 with the delivered action with
a range of values for each email parameter, and ob-
served which values led to the model suggesting the
highest donation amount for each constituent. These
results are shown in Tables 15 and 16, which give the mode, median, mean, and standard deviation of the email parameter values deemed by BDLSTMs to be the best for maximizing donation values for C1 and C5 constituents, respectively. We used BDLSTMs with
window size 20 in this experiment since this setup
generally had the lowest MAEs in our experiments.
For C1, BDLSTMs suggest a larger number of
paragraphs, while for C5 they suggest just 1 para-
graph, perhaps picking up on a difference between the
two charities. C1 is a wildlife charity and C5 is a university foundation, and their donors may have differ-
ent email preferences. Another difference is shorter
subject line words for C1 vs longer ones for C5. For
both charities, the model chose 5 words, but averag-
ing their lengths using the number of characters se-
lected gives us 4 letter words for C1 and almost 8 let-
ter words for C5. Donors to C5 may need more infor-
mation in their subject lines in order to be sufficiently
interested to open an email.
In terms of special characters and fonts, BDL-
STMs suggested 5 each for both C1 and C5, but sug-
gested 15 background colours for C1 vs 5 for C5.
Similarly to the choice of larger words for C5, this
is perhaps indicative of the BDLSTM learning that
university foundation donors prefer larger words with
fewer colours. For both C1 and C5, BDLSTMs chose
Table 15: Summary of email parameter values chosen by
BDLSTMs for C1.
Mode Median Mean St. Dev
Words 150 150 134.43 132.47
Paragraphs 18 18 15.03 6.43
Images 15 15 13.54 3.32
Links 35 35 34.14 2.1
Blocks 9 9 9 0
Special chars 5 5 6.76 3.8
Font colours 5 5 5.14 1.18
Background colour 15 15 12.52 4.32
Divs 41 41 44.64 6.4
Editable Content 9 9 10.17 2.3
Subject line words 5 5 5.69 2.7
Subject line characters 20 20 22.53 5.74
Subject line variables 0 0 0.4 0.49
Table 16: Summary of email parameter values chosen by
BDLSTMs for C5.
Mode Median Mean St. Dev
Words 150 150 124.8 56.23
Paragraphs 1 1 1 0
Images 15 15 13.4 3.3
Links 35 35 33.9 2.2
Blocks 9 9 9 0
Special chars 5 5 8.23 4.7
Font colours 5 5 9.9 5.03
Background colour 5 5 9.9 5.03
Divs 56 56 53.4 5.6
Editable Content 15 15 13.05 2.8
Subject line words 5 5 5 0
Subject line characters 38 38 29.12 9.06
Subject line variables 0 0 0.05 0.23
0 subject line variables except in a few cases, indi-
cating that having a constituent’s name in the subject
line is not advisable, according to the models.
6 CONCLUSIONS AND FUTURE
WORK
This research builds on the work in (Lee et al., 2022;
Lee et al., 2020b) by augmenting donor journey data
with time features and more email features, includ-
ing subject line features. The former provides needed
context for the passage of time between actions, and
the passage of time between similar actions, while the
latter provides more information about the initial sec-
tion of an email the reader sees (the subject line) and
about the appearance of the email.
The addition of these features had a strong effect
on two charities (C2 and C3), reducing the MAE by
41% for one charity and 29% for another. But these
charities had high MAEs to begin with, so we focused
on charities that had lower MAEs, including one that
had extra actions compared to the other charities with
data available, since these models are accurate enough
to be used in practice.
Adding time features increased MAE for CNNs
and CNN LSTMs in most cases, as those algorithms were perhaps less well-equipped to handle the time
features and did not seem to learn from the new
email features. In contrast, when time features were
added to the data for RNNs and BDLSTMs, the MAE
dropped, and was below that of the CNN and CNN
LSTMs on data without time features. This showed
adding time features can help deep learning achieve
lower MAEs on the donor journey.
We next added subject line features and new email
features to the data with time features and again com-
pared to data without any new features. The results were similar to when time features were added, al-
though CNN and CNN LSTM MAEs improved with
data containing these new features compared to their
MAEs when only having time features added. For
RNNs and BDLSTMs, there was not a significant
change from only adding time features, but the lowest
MAEs were achieved with data having all of time fea-
tures, subject line features, and other new email fea-
tures. This shows that the new features added in these
experiments help create more accurate deep learning
models for the donor journey.
When querying the most accurate model to se-
lect email parameters, the BDLSTM model suggested
short emails for both C1 and C5, but many more para-
graphs for C1, which is a wildlife charity. It also
chose fewer background colours for C5, a university foundation, and larger words for its subject lines compared to C1. This may reflect the level of language sophistication of university donors, since many of them are alumni of the university, and thus have a post-secondary education.
In the future, we will add in constituent features to
(hopefully) further lower MAE and see if deep learn-
ing algorithms can benefit from having all of the actions,
email features, constituent features, and all the fea-
tures we added in this paper. We will also combine
data across charities to see the effect, even though this
combination of data is not realistic for most charities.
We will continue to add new features to the data as
they become available and as we create them.
In addition to understanding which features matter
for machine learning the donor journey, understand-
ing why such features matter is a possible avenue of
research. For instance, background colours may have
different effects on constituents from different cul-
tures. We can also survey constituents to obtain di-
rect answers concerning which email features actually made a difference in their decision to donate or not, and
in their decision concerning how much to donate.
Also in the future, the type of email sent will be
a feature, which would be in the set of {acquisition,
solicitation, stewardship, cultivation}. Acquisition
emails seek donations from non-donors, while solic-
itation emails seek donations from previous donors.
Cultivation emails seek to increase a donor’s dona-
tion amount, while stewardship emails thank donors
for their donations and keep them informed on the
activities of the charity. While adding these features
may seem straightforward, many emails fit more than
one category. Charities always try to say thank you
even when asking for money, so these features will
likely need to be scaled in the [0,1] range and we will
experiment to see which system works best for incor-
porating this information into the data.
REFERENCES
Canals-Cerda, J. J. (2014). Charity art auctions. Oxford
Bulletin of Economics and Statistics, 76(6):924–938.
Kim, Y. (2014). Convolutional neural networks for sentence
classification. In Proceedings of the 2014 Conference
on Empirical Methods in Natural Language Process-
ing (EMNLP), pages 1746–1751, Doha, Qatar. Asso-
ciation for Computational Linguistics.
Lee, G., Adunoor, S., and Hobbs, M. (2020a). Machine
Learning across Charities. Proceedings of the 17th
Modeling Decision in Artificial Intelligence Confer-
ence, page in press.
Lee, G., Raghavan, A., and Hobbs, M. (2022). Deep learn-
ing the donor journey with convolutional and recurrent
neural networks. In Wani, M. A., Raj, B., Luo, F., and
Dou, D., editors, Deep Learning Applications, Volume
3, pages 295–320. Springer.
Lee, G., Raghavan, A. K. V., and Hobbs, M. (2020b). Im-
proving the donor journey with convolutional and re-
current neural networks. In Wani, M. A., Luo, F., Li,
X. A., Dou, D., and Bonchi, F., editors, 19th IEEE In-
ternational Conference on Machine Learning and Ap-
plications, ICMLA 2020, Miami, FL, USA, December
14-17, 2020, pages 913–920. IEEE.
Lemon, K. N. and Verhoef, P. C. (2016). Understanding
Customer Experience Throughout the Customer Jour-
ney. Journal of Marketing, 80(6):69–96.
McLellan, T. (2022). Mapping the donor jour-
ney - part one: Five reasons to consider it.
http://www.finelinesolutions.com/academy/blogs/18-
non-profits-learn-here/129-mapping-the-donor-
journey-part-one-why-is-mapping-a-good-idea.html.
Moolayil, J. (2019). Keras in Action. In Learn Keras for
Deep Neural Networks, pages 17–52. Apress, Berke-
ley, CA.
Ryzhov, I., Han, B., and Bradic, J. (2015). Cultivating disas-
ter donors using data analytics. Management Science,
62:849–866.
Sappleton, N. and Lourenço, F. (2016). Email subject lines
and response rates to invitations to participate in a
web survey and a face-to-face interview: the sound
of silence. International Journal of Social Research
Methodology, 19(5):611–622.
Sudharsan, R. and Ganesh, E. N. (2022). A swish rnn based
customer churn prediction for the telecom industry
with a novel feature selection strategy. Connection
Science, 34(1):1855–1876.
Wang, H., Yang, Z., Yu, Q., Hong, T., and Lin, X. (2018).
Online reliability time series prediction via convolu-
tional neural network and long short term memory for
service-oriented systems. Knowledge-Based Systems,
159.
Xia, J. and Kiguchi, K. (2021). Sensorless real-time force
estimation in microsurgery robots using a time se-
ries convolutional neural network. IEEE Access,
9:149447–149455.
Yan, R., Liao, J., Yang, J., Sun, W., Nong, M., and Li,
F. (2021). Multi-hour and multi-site air quality in-
dex forecasting in beijing using cnn, lstm, cnn-lstm,
and spatiotemporal clustering. Expert Syst. Appl.,
169:114513.