TURNING ARTIFICIAL NEURAL NETWORKS

INTO A MARKETING SCIENCE TOOL

Modelling and Forecasting the Impact of Sales Promotions

Ibrahim Zafar Qureshi

Institute Risk Analyst, Allied Bank Limited, Karachi, Pakistan

Marwan Khammash, Konstantinos Nikolopoulos

Division of Business Studies, Bangor Business School, Bangor, Gwynedd, U.K.

Keywords: Forecasting, Promotions, Marketing, Artificial neural networks.

Abstract: In this study we model the effect of promotions in time-series data and subsequently forecast that extraordinary effect via Artificial Neural Networks (ANN), as implemented in the Expert Method of a popular Artificial Intelligence software package. We simulate data considering five factors in order to determine the actual impact of each individual promotion. We consider additive and multiplicative models, with the latter presenting both linear and non-linear relationships between those five factors; in addition, we superimpose either low or high levels of noise. Our empirical findings suggest that, for nonlinear models with a high level of noise, ANNs outperform all benchmarks. Standard ANN topologies work well for models with up to two factors, while the Expert Method provided by the software works well for a higher number of factors.

1 INTRODUCTION

Over the years, marketing modelling has used many techniques, methods and applications derived from management science and psychology, among other sciences. However, the complexity and richness of marketing science data make them an ideal candidate for analysis by Artificial Intelligence techniques, especially in the 21st century, where computational power has made such an exercise feasible.

In this preliminary study we first model the additive effect of promotions in time-series data considering five factors: Budget, Duration, Media, Perceived Benefit and Price Change; the order of the factors does not indicate any hierarchy of importance. Subsequently we forecast that extraordinary effect via Artificial Neural Networks (ANN), as implemented in the Standard and Expert Methods of a popular Artificial Intelligence software package. We simulate data considering both additive and multiplicative models, with the latter presenting both linear and non-linear relationships between those five factors; in addition, we superimpose either low or high levels of noise.

We believe this is a realistic representation of

field-data for promotional activity and aspire in

further research to replicate and corroborate the

findings of this study via empirical evidence on real

data.

The rest of the study is organised as follows: a short

description of the problem under consideration is

provided, followed by the presentation of the

simulated data. Section four provides the empirical

findings while the last section communicates the

main findings of this research.

2 THE PROBLEM

Sales promotions are short-term incentives used to

increase sales of products. Spending on promotions

represents a major share of marketing budgets for

most consumer goods. Many products are sold today
with a percentage of sales volume focused on ‘deals’.

Moreover, the use of promotions is spreading to

other marketing situations. Pharmaceutical

companies often offer drug stores discounts and free

goods; durable manufacturers (automobiles, TV sets,

phones, etc.) use discounts and industrial


Qureshi I., Khammash M. and Nikolopoulos K..

TURNING ARTIFICIAL NEURAL NETWORKS INTO A MARKETING SCIENCE TOOL - Modelling and Forecasting the Impact of Sales Promotions.

DOI: 10.5220/0003292306980702

In Proceedings of the 3rd International Conference on Agents and Artificial Intelligence (ICAART-2011), pages 698-702

ISBN: 978-989-8425-40-9

 2011 SCITEPRESS (Science and Technology Publications, Lda.)

manufacturers offer temporary price reductions to

their distributors.

A wide body of literature has focused on understanding consumer response to retailers’ promotions. Some researchers developed individual choice models to measure the impact of promotions on consumer choice (Kuehn and Rohloff 1967, Ehrenberg 1972, Guadagni and Little 1983). Other research has dealt with consumer stockpiling and purchase acceleration to explain promotion sales patterns (Shoemaker 1979, Blattberg, Eppen and Lieberman 1981).

Other researchers have considered brand switching and the impact of promotions on repeat purchases (Shoemaker and Shoaf 1977, Dodson, Tybout and Sternthal 1978). A number of studies looked at promotion response as a consumer segmentation variable (Blattberg and Sen 1976, Blattberg, Buesing, Peacock and Sen 1978).

3 THE DATA

In this study we used simulated data, constructed so as to be as close as possible to real-life data on promotions of durable products. We have followed the promotion profiles described by Blattberg (1995) and Blattberg et al. (1995).

3.1 Factors

To produce the simulated data in this paper, five factors that influence the promotional impact on sales are considered: Budget, Duration, Media, Perceived Benefit and Price Change. The order of the variables does not indicate any level of importance. The ranges used for each of the factors are the following:

- Budget (B) ranges from 50 to 150, with each unit equivalent to 1000€.

- Duration (D) ranges from 1 to 14 days.

- Media (M) is a categorical variable ranging from 1 to 4, as listed in Table 1.

Table 1: Media factor.

Value  Media Used
1      Newspaper
2      Newspaper + Radio
3      Newspaper + Radio + Internet
4      Newspaper + Radio + Internet + TV

- Perceived Benefit (PB): it is assumed that one of the factors on which the success of the marketing campaign depends is the customer's perception of the product. This factor gauges the level of benefit that customers think they will get from buying the product. Perceived Benefit is a categorical variable ranging from 0 to 5, where 0 indicates that the customer does not perceive any benefit from buying the product, while 5 represents a strong perceived benefit.

- Price Change (PC): one of the main incentives

given to customers in a marketing campaign is a

reduction of the product price. This increases its

demand and subsequently its related sales. Price

Change varies from -20 to 15. This is a percentage

change. A negative value represents a decrease in

price and a positive value represents an increase in

price. It is assumed that the price will decrease by up

to 20% giving a value of -20 and increase by up to

15% giving a value of 15. A decrease in price will

increase demand and an increase in price may

decrease demand. The negative effect of increasing

the price can be countered by an advertising effort.
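The factor ranges above can be illustrated with a minimal sampling sketch. It assumes uniform draws, whole days for Duration, integer levels for Media and Perceived Benefit, and continuous values for Budget and Price Change; none of these sampling choices is stated in the text, and the function name draw_promotion is hypothetical.

```python
import random

# Factor ranges as stated in Section 3.1.
BUDGET_RANGE = (50, 150)                        # each unit is 1000 EUR
DURATION_RANGE = (1, 14)                        # days
MEDIA_LEVELS = (1, 2, 3, 4)                     # see Table 1
PERCEIVED_BENEFIT_LEVELS = (0, 1, 2, 3, 4, 5)   # 0 = no benefit, 5 = strong
PRICE_CHANGE_RANGE = (-20, 15)                  # percent, negative = price cut


def draw_promotion(rng):
    """Draw one promotion with every factor inside its stated range.

    Uniform sampling is an assumption made for this sketch only.
    """
    return {
        "B": rng.uniform(*BUDGET_RANGE),
        "D": rng.randint(*DURATION_RANGE),
        "M": rng.choice(MEDIA_LEVELS),
        "PB": rng.choice(PERCEIVED_BENEFIT_LEVELS),
        "PC": rng.uniform(*PRICE_CHANGE_RANGE),
    }


rng = random.Random(42)  # fixed seed so the draw is repeatable
promotions = [draw_promotion(rng) for _ in range(5)]
for promotion in promotions:
    print(promotion)
```

Passing an explicit random.Random instance, rather than using the module-level generator, keeps each simulated series tied to its own seed.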

3.2 The Models

Different models have been developed according to the following criteria:

• A model must use all 5 variables, with the ranges defined above for each variable.

• A model must give a final output for the impact of the promotion in the range of -20 to 120 for all possible values of the (explanatory) variables.

• A complete model is composed of two sub-models, one linear and one non-linear.

• The linear model can only use the “+” and “-” operators, while the non-linear model can use any combination of the “+”, “-”, “*” and “/” operators. Where a variable is raised to a power of s, this is considered equivalent to a product of s copies of that variable.

• The importance rating for each of the 5 variables must be the same for both the linear and nonlinear model, i.e. when variables are ranked in order of importance, both models must have the same order, allowing for a meaningful comparison of the sub-models when the number of factors increases.
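The first two criteria can be checked mechanically for a candidate linear model. The sketch below uses hypothetical coefficients chosen only for illustration; they are not the coefficients used in this study. Because a linear model reaches its extremes at the ends of each factor's range, summing the smallest and largest contribution of each term gives the exact output range.

```python
# Factor ranges from Section 3.1.
RANGES = {"B": (50, 150), "D": (1, 14), "M": (1, 4), "PB": (0, 5), "PC": (-20, 15)}

# Hypothetical coefficients, for illustration only (not the study's model).
HYPOTHETICAL_COEFFS = {"B": 0.4, "D": 1.0, "M": 2.0, "PB": 3.0, "PC": -1.0}


def linear_output_range(coeffs, ranges):
    """Return (min, max) of sum(coeff * factor) over the factor ranges."""
    lowest = highest = 0.0
    for name, coeff in coeffs.items():
        low, high = ranges[name]
        lowest += min(coeff * low, coeff * high)
        highest += max(coeff * low, coeff * high)
    return lowest, highest


def satisfies_criteria(coeffs, ranges, allowed=(-20, 120)):
    """Check that all factors are used and the output stays within bounds."""
    uses_all = all(coeffs.get(name, 0) != 0 for name in ranges)
    lowest, highest = linear_output_range(coeffs, ranges)
    return uses_all and allowed[0] <= lowest and highest <= allowed[1]


print(linear_output_range(HYPOTHETICAL_COEFFS, RANGES))
print(satisfies_criteria(HYPOTHETICAL_COEFFS, RANGES))
```

The same corner argument does not hold for the non-linear sub-model in general, where the range would have to be checked by a search over the factor space.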

The final models were used to run 200 instances (combinations of factors_to_be_included x Level_of_Noise x Level_of_Linearity), each simulated 15 times with a different seed via random number generation within each factor’s range, resulting in 3000 simulations. The models are as follows:

- Linear

PBDBMPC 65.13.116.072.065.1 

with a range from -8.25 to 112.25

- Nonlinear

4.75

( (0.18 ) *(0.0118 0.035 ) 1.7 ) * 0.7 4

PC B D PB  

with a range from -8.25 to 112.25
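As a rough illustration of how noise might be superimposed on a model's output and replicated over 15 seeds, consider the sketch below. The noise distribution and the two noise fractions are assumptions made here for illustration, since the text does not specify them, and the names add_noise and simulate_replications are hypothetical.

```python
import random

# Width of the stated model output range, from -8.25 to 112.25.
OUTPUT_RANGE = 112.25 - (-8.25)

# Hypothetical noise levels, expressed as a fraction of the output range.
NOISE_FRACTION = {"low": 0.05, "high": 0.20}


def add_noise(clean_values, level, rng):
    """Superimpose zero-mean Gaussian noise (an assumed distribution)."""
    sigma = NOISE_FRACTION[level] * OUTPUT_RANGE
    return [value + rng.gauss(0.0, sigma) for value in clean_values]


def simulate_replications(clean_values, level, n_seeds=15):
    """Return one noisy copy of the series per seed, 15 by default."""
    return [
        add_noise(clean_values, level, random.Random(seed))
        for seed in range(n_seeds)
    ]


clean = [10.0, 50.0, 100.0]  # illustrative noise-free model outputs
runs = simulate_replications(clean, "high")
print(len(runs), runs[0])
```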

An example of the simulated data is given in Figure 1.

Figure 1: Linear model - low noise.

4 RESULTS

Alyuda NeuroIntelligence software (www.alyuda.com) was used for the experiments. We used the free evaluation version of the software, which was user-friendly and provided full automation and fast processing times. The software was used as described in the flow chart in Figure 2.

We used the Mean Absolute Percentage Error (MAPE; Makridakis et al., 1998) and the Mean Absolute Relative Percentage Error (MARPE) to evaluate the forecasts. The advantage MARPE has over MAPE is that it is not affected by small actual values, as the error is measured relative to the maximum range of the output function for a given number of parameters. Both the range of the output function and the noise level therefore depend on the number of parameters in the equation.
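The contrast between the two measures can be sketched in a few lines. MAPE follows its standard definition; the MARPE formula below is an assumed reading of the description above (mean absolute error divided by the output range), not a definition taken from the paper.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error; undefined when an actual value is 0."""
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


def marpe(actual, forecast, output_range):
    """Mean absolute error relative to the output range (assumed formula)."""
    errors = [abs(a - f) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / (len(errors) * output_range)


# Every forecast is off by 2 units, but one actual value is close to zero.
actual = [0.5, 50.0, 100.0]
forecast = [2.5, 52.0, 102.0]

print(mape(actual, forecast))           # dominated by the 0.5 observation
print(marpe(actual, forecast, 120.5))   # same scale for all three errors
```

With identical absolute errors, the small actual value inflates MAPE, while the range-based measure treats all three observations alike.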

As a benchmark forecasting method we used a Multiple Linear Regression model (Makridakis et al., 1998, chapter 6) including all the explanatory variables.
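A multiple linear regression benchmark of this kind can be sketched with ordinary least squares. The version below solves the normal equations in plain Python and is only a minimal illustration; the function names and the toy data are hypothetical, and the study does not say which tool fitted its regression.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares with an intercept, via the normal equations."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    # Augmented system [X'X | X'y].
    system = [
        [sum(x[i] * x[j] for x in design) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(design, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(system[r][col]))
        if abs(system[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        system[col], system[pivot] = system[pivot], system[col]
        for r in range(k):
            if r != col:
                factor = system[r][col] / system[col][col]
                system[r] = [a - factor * b for a, b in zip(system[r], system[col])]
    return [system[i][k] / system[i][i] for i in range(k)]


def predict(coeffs, row):
    """Apply fitted coefficients (intercept first) to one observation."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], row))


# Toy noise-free data generated from y = 3 + 2*x1 - 0.5*x2.
rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 4)]
targets = [3 + 2 * a - 0.5 * b for a, b in rows]
coeffs = fit_linear_regression(rows, targets)
print(coeffs)
```

On noise-free linear data the fit recovers the generating coefficients up to rounding, which is why regression is a natural benchmark for the linear sub-model.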

Figure 2: Forecasting and analysing Data with the ANN

software.

4.1 Linear Function - Low Noise

Table 2: Results for Linear Function + Low Noise.

ANN performs worse than the benchmark.

4.2 Linear Function - High Noise

Table 3: Results for Linear Function + High Noise.

Increasing the level of noise for a linear function increases the uncertainty in the values and adds an unexplained component. For 1 to 3 parameters there is no clearly better method. For larger numbers of parameters, i.e. 4 and 5, the Alyuda Expert method performs best.

4.3 Non-linear Function - Low Noise

Table 4: Results for Non-Linear Function + Low Noise.

The benchmark performs better than all other methods for up to 4 parameters, and the Alyuda Expert method performs best for equations with 5 parameters.

4.4 Non-linear Function - High Noise

Table 5: Results for Non-Linear Function + High Noise.

By increasing the noise for nonlinear functions, the error increases for all methods due to the increased uncertainty in the original data. For non-linear models, ANN methods outperform all other methods. For models with up to 2 parameters, predefined ANNs perform best, and for equations with 3 to 5 parameters, the Alyuda Expert method outperforms all other methods.

4.5 Linear Function

Figure 3: Effect of Linearity.

For linear models with 2 and 3 parameters, there is no obvious best-performing method. For linear models with one parameter, regression is the best method, and for 4 and 5 parameters, ANNs are the best set of forecasting tools.

4.6 Non-linear Function

[Figure 4 plot: NonLinear Combined MARPE. x-axis: Parameters (1 to 5); y-axis: MARPE (10.00 to 24.00); series: Regression, ANN.]

Figure 4: Effect of Non-Linearity.

For nonlinear models, regression outperforms ANNs only for 1 parameter. For nonlinear models with 2 to 5 parameters, ANNs are clearly the best forecasting tool.

5 CONCLUSIONS

Our empirical findings suggest strongly that:

For linear models, regression approaches perform better for smaller numbers of parameters, i.e. between 1 and 3, while the Alyuda Expert method performs better for larger numbers of parameters, between 2 and 5. This indicates a grey zone between 2 and 3 parameters, where either method could outperform the other.

For nonlinear models, which are the most difficult and complex problem, ANN approaches outperform all other methods. Predefined ANNs work well for up to two parameters, and the Alyuda Expert method works well for higher numbers of parameters.

REFERENCES

Blattberg, R. C., Briesch, R. and Fox, E. J., 1995. How

Promotions Work, Marketing Science 14(3): 122-132.

Blattberg, R. C., 1995. Sales Promotion: Concepts,

Methods, and Strategies, Prentice Hall.

Makridakis, S., Wheelwright, S. C. and Hyndman, R. J., 1998. Forecasting Methods and Applications, Wiley: New York, 3rd Edition.

Kuehn, A. A. and A. C. Rohloff (1967), “Evaluating

Promotions Using A Brand Switching Model,” in

Promotional Decisions Using Mathematical Models.


Patrick Robinson (Ed.), Boston: Allyn & Bacon, Inc.,

50-85.

Ehrenberg, A. S. C. (1972), Repeat Buying: Theory and

Applications. Amsterdam: North-Holland.

Guadagni, P. and J. Little (1983), “A Logit Model of Brand Choice Calibrated on Scanner Data,” Marketing Science, (Summer), 203-238.

Shoemaker, R. (1979), “An Analysis of Consumer Reactions to Product Promotions,” Proceedings of the American Marketing Association, 246-248.

Blattberg, R. C., G. Eppen and J. Lieberman (1981), “A Theoretical and Empirical Evaluation of Price Deals for Consumer Nondurables,” Journal of Marketing, 45 (Winter), 116-129.

Shoemaker, R. and F. R. Shoaf (1977), “Repeat Rates for Deal Purchases,” Journal of Advertising Research, 17 (April), 47-53.

Dodson, J. A., A. M. Tybout and B. Sternthal (1978), “Impact of Deals and Deal Retractions on Brand Switching,” Journal of Marketing Research, 15 (February), 72-81.

Blattberg, R. and S. K. Sen (1976), “Market Segments and

Stochastic Brand Choice Models,” Journal of

Marketing Research, 13 (February), 34-45.

Blattberg, R., T. Buesing, P. Peacock and S. Sen (1978), “Identifying the Deal Prone Segment,” Journal of Marketing Research, 15 (August), 369-377.
