The User-journey in Online Search
An Empirical Study of the Generic-to-Branded Spillover Effect
based on User-level Data
Florian Nottorf, Andreas Mastel and Burkhardt Funk
Institut für Elektronische Geschäftsprozesse, Leuphana University, Lüneburg, Germany
Keywords:
Online Search, Online Advertising, Consumer Behavior, Query Log, Spillover.
Abstract:
Traditional metrics in online advertising, such as the click-through rate, often treat users’ search activities separately and do not consider any interactions between them. In understanding online search behavior, this may favor a certain type of search and may therefore mislead managers in allocating their financial spending efficiently. We analyzed a large query log for the occurrence of user-specific interaction patterns within and across three different industries (clothing, healthcare, hotel) and were able to show that users’ online search behavior is indeed a multi-stage process: a product search for sneakers, for example, typically begins with general keywords, often referred to as generic, and is narrowed as it proceeds by including more specific, e.g. brand-related (“sneakers adidas”), keywords. Our method for analyzing the development of users’ search processes within query logs helps managers to identify the role of specific activities within a respective industry and to allocate their financial spending in paid search advertising accordingly.
1 INTRODUCTION
Selling advertising linked to user-generated queries, the so-called sponsored search, has become a critical component of companies’ marketing campaigns (Ghose and Yang, 2008). Although more than half of all search processes by individual users consist of only one query (Jansen and Mullen, 2008), consumers who have a transactional intention often do not reach their goals by conducting only a single search (Search Engine Watch, 2006), an aspect also confirmed by (Rutz and Bucklin, 2011). They demonstrate the so-called “spillover effect” from generic to brand-related searches for a company in the hospitality industry: the generic search (“hotel”) and the corresponding advertisements by the hotel chain in question significantly contributed to the fact that users later turned to brand-related searches for this hotel chain (e.g. “hotel hilton”) and finally to corresponding reservations. Their initial work clarifies that traditional metrics in online advertising, such as the click-through rate (CTR), are not on their own adequate tools for controlling paid search advertising campaigns: they only take into account the users’ search activities in isolation and do not consider any interactions between them.
Our analysis is based on a complete query log
published by AOL in 2006 (Pass et al., 2006) and ex-
plains users’ search activities in more behavioral de-
tail. Unlike (Rutz and Bucklin, 2011), who used
keyword-level data aggregated on a daily basis of a
paid search advertising campaign of a single com-
pany, we analyzed users’ individual queries within
and across entire industries in order to determine
whether the resulting user-journey showed behavior
indicating spillover effects. As evidence of such a
spillover we regarded a user-journey that, for exam-
ple, started with a generic and was followed by a
brand-related search. We investigated and confirmed
the spillover effect and its occurrence in users’ online
search behavior for several industries and, by doing
so, highlighted the role of generic activities as gate-
keepers for companies’ online advertising. To the
best of our knowledge, our work is the first to ana-
lyze a complete query log for the occurrence of the
spillover effect and thus makes a contribution to re-
search on consumers’ online behavior. In addition, an
improved understanding of the role of generic activ-
ities and of how consumers actually search for prod-
ucts and brands to satisfy their needs will help ad-
vertisers to allocate their budget on online advertising
more efficiently.
The paper is structured as follows: first, we will
review existing work on consumer online search be-
havior in general as well as in the specific context of
Nottorf F., Mastel A. and Funk B..
The User-journey in Online Search - An Empirical Study of the Generic-to-Branded Spillover Effect based on User-level Data.
DOI: 10.5220/0004052101450154
In Proceedings of the International Conference on Data Communication Networking, e-Business and Optical Communication Systems (ICE-B-2012),
pages 145-154
ISBN: 978-989-8565-23-5
Copyright © 2012 SCITEPRESS (Science and Technology Publications, Lda.)
paid search advertising. In the following sections we describe our method of analyzing spillover effects in query logs and introduce our dataset together with the filtering procedures applied. Next, we focus on the results of spillover effects across several industries. The last sections discuss our findings and close the paper by mentioning the limitations of our study and giving suggestions for further investigations.
2 RELATED WORK
The detailed records of users’ Internet activities
opened up the possibility of analyzing a variety of
topics, such as consumers’ online search behavior
(see, for example, (Bucklin and Sismeiro, 2009) for a
review and discussion of strengths and limitations of
clickstream data for marketing research). General re-
search classifies consumers’ online searches into nav-
igational, transactional, and informational purposes
(Broder, 2002; Jansen and Spink, 2007). According to (Moe, 2003), searches heavily depend on individual purchase intent, such as involvement: while directed-buying sessions present very narrowly aimed shopping behavior, consumers with low purchase intention exhibit much broader search patterns for unspecific products. (Johnson et al., 2004) confirm that the depth of consumer search is generally low and shows no increase with a consumer’s growing experience. This aspect is confirmed by analyses showing that the CTR on a search engine’s (sponsored) link decreases with its position (Agarwal et al., 2011; Animesh et al., 2011; Ghose and Yang, 2009; Rutz et al., 2011). Despite the amount of literature focusing on users’ search behavior, no work has yet considered possible differences in this behavior across industries, an aspect analyzed in more detail in the present paper.
The empirical analysis of sponsored search has
only recently begun to become the focus of the sci-
entific community. (Ghose and Yang, 2009) exam-
ine the general impact of paid search advertising on
measures such as the relationship between the type
and length of keywords and different variables on
consumers’ click and conversion behavior. Although the authors uncover important differences in the click and purchasing intensity regarding a specific group of keywords, they fail to account for users’ interactions between them. (Chan et al., 2011) build an integrated model of customer lifetime, transaction rate, and gross margin accounting for spillovers from sponsored search on customer acquisition and behavior in offline channels. Their results indicate that customers who were originally acquired through paid search advertising on Google have a significantly higher lifetime value (about 20%) than customers acquired from other channels. In a recent study, (Rutz et al., 2011)
criticize the common assumption that users respond
homogeneously to keywords. The authors formulate
a consumer-level approach to especially evaluate tex-
tual properties of paid search ads on consumers’ re-
sponses and account in this way for heterogeneity.
Besides several findings referring to the consumer-
integrated focus, the authors confirm that keyword-
specific factors, like the distinction between broad
and narrow purposes, are important when linking
searches to a CTR (see also (Ghose and Yang, 2009)).
This stream of research typically uses aggregated-
level data. Although the majority of papers discussed
above claim a more behavioral focus on how users re-
spond and act in the context of search engines, they
only scratch the surface of an analysis of consumers’
actual online search behavior (see (Abhishek et al.,
2011) for discussion of aggregation bias in sponsored
search data).
This paper is most closely related to (Rutz and
Bucklin, 2011), who are able to demonstrate the
spillover effect from generic to brand-related searches
for a company in the hospitality industry. The generic
search (“hotel”) and the corresponding ads of the ad-
vertising hotel chain significantly contribute to the
fact that users later turn to brand-related searches for
this hotel chain (e.g. “hotel hilton”) and finally to
corresponding reservations. We explore the differences between Rutz and Bucklin’s method and our approach more closely in the following section.
3 DATA AND METHODOLOGY
Our analysis is based on a log published by AOL which consists of over 35 million queries from about 650,000 users over a three-month period (March to May) (Pass et al., 2006).¹ Although this dataset dates back to 2006, it is still a unique and comprehensive query log containing extraordinary information about users’ search and click behavior of a kind that may also be found in today’s search engines like Google or Bing.²
In terms of analyzing user-journeys for specific
behavioral aspects, such as the spillover effect from
generic to brand-related searches, we, first, needed
to define industries and, second, companies within
¹ Since the users who represent the queries are mostly located in the United States of America, our work is mainly based on the US region.
² We would like to thank an anonymous reviewer for pointing this out more clearly.
ICE-B2012-InternationalConferenceone-Business
146
these. It was only then that we were able to cat-
egorize user queries into generic and brand-related
types of searches respectively and to make a definite
statement about both the existence and the extent of
industry-specific interaction effects between generic
and brand-related search activities.
We explain this process in more detail in the following sections. All additional data and information, such as the list of selected companies, keywords, and our final filtered query log, can be found on the author’s website: http://www.nottorf.org.
3.1 The Initial Query Log
The log includes 36,389,567 records structured in five
columns (see Table 1 for a short excerpt of the data):
AnonID: A unique identification number of an
anonymized user.
Query: The user’s query.
QueryTime: The point of time at which the query
was submitted for search.
ItemRank: In case a user clicked on one of the
result pages, the “ItemRank” shows the site’s po-
sition in the result pages. If no page was selected,
this field remained empty.
ClickURL: Shows the URL of the clicked result
(Pass et al., 2006).
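As a minimal sketch of how one record with these five columns could be read, the following Python snippet parses a single line. It assumes (this is not stated in the paper) that the fields are tab-separated and that timestamps use the day.month.year format displayed in Table 1; the names `LogRecord` and `parse_record` are purely illustrative.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class LogRecord:
    anon_id: int
    query: str
    query_time: datetime
    item_rank: Optional[int]   # None when no result was clicked
    click_url: Optional[str]   # None when no result was clicked


def parse_record(line: str) -> LogRecord:
    """Parse one tab-separated log line (assumed layout) into a LogRecord."""
    fields = line.rstrip("\n").split("\t")
    # Pad missing trailing fields so that records without a click still parse.
    fields += [""] * (5 - len(fields))
    anon_id, query, query_time, item_rank, click_url = fields[:5]
    return LogRecord(
        anon_id=int(anon_id),
        query=query.strip().lower(),
        # Assumed timestamp format, taken from the display in Table 1.
        query_time=datetime.strptime(query_time, "%d.%m.%Y %H:%M"),
        item_rank=int(item_rank) if item_rank else None,
        click_url=click_url or None,
    )


clicked = parse_record("7117\twww.anthem.com\t10.04.2006 06:57\t1\thttp://www.myuhc.com")
no_click = parse_record("1927\tdoes bcbs cover ci\t03.05.2006 00:24")
```

Representing empty ItemRank and ClickURL fields as `None` keeps searches without a click distinguishable from searches with a click in later steps.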
We focused on three industries: the hotel and
hospitality industry to make our findings compara-
ble with those of (Rutz and Bucklin, 2011); the
clothing industry as a representative of nondurable
goods with the influence of brand strength being as-
sumed to be strong; and the healthcare industry as
a representative of the insurance sector. We further restricted our analysis to the top ten companies in 2006 within each industry, ranked by revenue and brand strength (Interbrand & Business Week, 2006). We did so on the assumption that these companies accounted for the majority of possible brand-related search activities within a specific industry. The consequences of this approach, for example the fact that we did not consider the total number of brand-related or generic search activities recorded in the query log, will be discussed later.
On the basis of the industries selected and each of
the ten companies we filtered out generic and brand-
related queries from the initial query log.
3.2 The Filtering Process for Queries
For each of the ten companies within each industry we defined brand-related keywords, checked each query for their occurrence and, where one was found, marked the query as a branded search. Because many of the selected companies have subsidiaries, we had to define a set of keywords that were related to the parent company.³ We organized the brand names and keywords in hierarchical order as shown in Figure 1. Based on this structure we applied tools (e.g. the online toolset provided by Google AdWords) to identify keywords that are often searched for in the context of the brand names mentioned.
Figure 1: Structure of the filtering process of brand-related keywords.
We started with the company names (e.g. “Nike Inc.”), which were placed in the root section of this structure. Next to them we put brand and product names, which we derived from the companies’ own information (e.g. “NIKEiD” or “Air Jordan”) and with the help of the keyword tools mentioned above. On the basis of these subcategories we defined the final set of keywords (e.g. “nikeid”, “nikid”, or “jordans”) for our analysis (see the right column of Figure 1). Variant names and typing errors were also taken into account at this level. In total, we defined 120 brand-related keywords for the healthcare, 228 for the hotel, and 720 for the clothing industry.
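The three-level hierarchy (company, brand or product name, keyword) can be pictured as a nested mapping. The sketch below is only an illustration built from the examples named in the text; the assignment of each keyword to a brand and the names `KEYWORD_TREE` and `flatten` are assumptions, not the authors’ actual keyword list.

```python
# Illustrative excerpt: company -> brand/product name -> search keywords.
KEYWORD_TREE = {
    "Nike Inc.": {
        "NIKEiD": ["nikeid", "nikid"],   # includes a typing-error variant
        "Air Jordan": ["jordans"],
    },
}


def flatten(tree):
    """Map every keyword to the (company, brand) pair it belongs to."""
    lookup = {}
    for company, brands in tree.items():
        for brand, keywords in brands.items():
            for keyword in keywords:
                lookup[keyword.lower()] = (company, brand)
    return lookup


LOOKUP = flatten(KEYWORD_TREE)
print(LOOKUP["nikid"])
```

Flattening the tree into a keyword lookup keeps the link to the parent company, which matters when subsidiaries and product names have to be attributed to one of the ten companies.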
Defining the set of generic keywords required us to delimit which queries could be directly related to the industries mentioned. We may also have had to consider queries that had no direct relationship to a specific industry and its companies or products, but were still able to generate clicks on links to these companies. For example, the analysis of the hotel industry raised the question of whether the search for a country or a flight could already be regarded as a generic query. A final dataset that is too comprehensive may also cover users who do not have the intention to purchase anything at all. This is a problem every paid search advertising campaign has to face to some extent, since advertisers want to be displayed and ranked when a possible consumer searches for products or services that the respective company offers. The consequences of a potentially incorrect or incomplete selection will be discussed later.
³ Harrah’s Entertainment Inc., for example, had about 21 hotels and hotel chains in 2006.
Table 1: Extract from the AOL dataset.
AnonID  Query                  QueryTime         ItemRank  ClickURL
1927    does bcbs cover ci     03.05.2006 00:24
1927    does bcbs fl cover ci  03.05.2006 00:25
7117    www.anthem.com         09.05.2006 09:33  3         http://www.maine.nea.org
7117    www.anthem.com         09.05.2006 09:33  4         http://hr.nd.edu
7117    www.anthem.com         10.04.2006 06:57  1         http://www.myuhc.com
We proceeded from the advertiser’s point of view
and defined generic keywords that reflected the in-
tention to gather information about the product or to
purchase it. Hence, we restricted our set of generic
keywords to directly product-related terms, their syn-
onyms and variations. To achieve this, we also made
use of the keyword tools referred to above but mostly
we derived the keywords manually by gathering gen-
eral information from the companies’ websites. Thus,
we defined 196 generic keywords for the healthcare
(e.g. “health care”, “dental insurance”, “medicare”),
335 for the hotel (e.g. “hotel”, “motel”, “suites”), and
197 for the clothing industry (e.g. “shoes”, “shorts”,
“underwear”).
On the basis of our brand-related and generic key-
words we filtered and categorized the query log
records (see Figure 2). First, all log records whose
“Query”-columns contained brand-related keywords
were moved into an industry-specific table and were
marked as brand-related queries. This ensured that a
record with both brand-related and generic keywords
appearing in the same query was not duplicated. The second step was to filter the reduced dataset (without brand-related queries) on the basis of generic keywords; matching records were moved to the three industry-specific tables containing only generic queries. Ideally, the remaining dataset should no longer have contained any relevant queries. It is important to mention that our procedure also captured log records that were not related to the industries selected, because some keywords also occur in an irrelevant context. We handled this problem by manually scanning the filtered data and deleting records that belonged to an irrelevant context (e.g. queries in a pornographic context).
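The two-step order of the procedure (brand-related keywords first, generic keywords on the remainder) can be sketched as follows. This is a hedged illustration, not the authors’ implementation: whole-word matching via regular expressions is an assumption, and the keyword sets are tiny hypothetical excerpts.

```python
import re


def contains_keyword(query, keywords):
    """True if any keyword occurs in the query as a whole word or phrase."""
    text = query.lower()
    return any(re.search(r"\b" + re.escape(kw) + r"\b", text) for kw in keywords)


def categorize(query, brand_keywords, generic_keywords):
    """Return 'branded', 'generic', or None for an unrelated query."""
    # Step 1: brand-related keywords take precedence, so a query containing
    # both kinds of keywords is counted once, as a branded search.
    if contains_keyword(query, brand_keywords):
        return "branded"
    # Step 2: only the remaining queries are checked for generic keywords.
    if contains_keyword(query, generic_keywords):
        return "generic"
    return None


# Hypothetical excerpts of the hotel keyword sets.
BRAND = {"hilton", "sheraton", "hyatt"}
GENERIC = {"hotel", "motel", "suites"}

print(categorize("hotel hilton", BRAND, GENERIC))
print(categorize("cheap motel", BRAND, GENERIC))
print(categorize("weather forecast", BRAND, GENERIC))
```

With whole-word matching, variants and misspellings are only recognized if they are listed explicitly as keywords; the manual scan for irrelevant contexts described above is not covered by this sketch.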
The records not only contain information about
a user’s query itself but also the URL of a clicked
result. We distinguished between URLs on pages
that were related to the ten companies defined
above and between URLs that were not, such as
websites of retailers (e.g. “www.ebay.com” or
“www.amazon.com”). Thus, we were able to sepa-
rate relevant clicks (from the perspective of the ten
companies) from irrelevant ones and were in a posi-
tion to make more precise statements about the effects
of generic and/or brand-related searches. To deter-
mine those (relevant) clicks we analyzed the informa-
tion contained in the “ClickURL”-column of our fi-
nal dataset and manually checked whether the clicked
pages were related to the companies examined. Fol-
lowing this procedure, we identified 244 websites for
the clothing, 298 for the healthcare, and 838 for the
hotel industry.
Table 2 shows the total number of unique users, impressions, and (relevant) clicks in our final dataset for each industry after our filtering procedure. The resulting descriptive statistics confirm strong differences in efficiency between generic and brand-related keywords. Although we filtered out
Table 2: Descriptive statistics on the final dataset.
Industry    Type     Users   Imp.     Clicks  CTR
Hotel       Generic  71,405  327,563  3,801   1.16%
Hotel       Branded  32,039  97,070   21,154  21.79%
Hotel       Total    84,408  424,633  24,955  5.88%
Clothing    Generic  51,105  293,645  350     0.12%
Clothing    Branded  14,225  46,586   5,695   12.22%
Clothing    Total    58,166  340,231  6,045   1.78%
Healthcare  Generic  10,876  33,115   719     2.17%
Healthcare  Branded  9,304   20,516   7,896   38.49%
Healthcare  Total    18,255  53,631   8,615   16.01%
far more generic than brand-related queries from the initial query log, the number of clicks in response to branded searches was much higher. This results in quite different CTRs, as already shown by e.g. (Ghose and Yang, 2009; Ghose and Yang, 2010), (Rutz and Bucklin, 2011), and (Yang and Ghose, 2010). Companies focusing only on these statistics might conclude that concentrating on brand-related keywords and neglecting generic terms would increase profitability, since the metrics for keywords containing brand-specific information seem to be more favorable than for keywords describing generic purposes. The major problem of these types of keyword-based analyses is, however, that they only link users’ actions to one specific keyword at a time and do not consider any interactions between several searches and clicks. We, on the other hand, aimed at a process-oriented analysis of the data and considered the development of users’ search activities over time.
Figure 2: Schematic illustration of the filtering procedure for a given industry.
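For reference, the CTR reported in Table 2 is simply the number of (relevant) clicks divided by the number of impressions. The short sketch below recomputes the hotel rows from the counts in the table; the function and variable names are illustrative.

```python
def ctr(clicks, impressions):
    """Click-through rate: share of impressions that led to a relevant click."""
    return clicks / impressions


# Hotel industry counts from Table 2: (impressions, relevant clicks).
HOTEL = {
    "generic": (327_563, 3_801),
    "branded": (97_070, 21_154),
    "total": (424_633, 24_955),
}

for group, (impressions, clicks) in HOTEL.items():
    print(f"{group}: {ctr(clicks, impressions):.2%}")
```

Because the ratio is computed per keyword group, it carries no information about the order in which a single user issued generic and brand-related searches, which is exactly the limitation discussed above.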
3.3 Indicating Spillover Behavior in
Query Logs
The development of the search process from generic
to brand-related searches may be attributed to the fact
that individual brands gain the users’ attention dur-
ing the search process. We defined two levels of the
spillover effect in user-journeys:
Spillover Behavior Level 1. A user-journey shows a
generic-to-branded spillover effect, when a user first
searches for generic and next, at any further time, for
brand-related terms.
Spillover Behavior Level 2. A user-journey shows a
generic-to-branded spillover effect, when a user first
searches for a generic term and her last search that
leads to a (relevant) click is a brand-related one.
For each of the two definitions we assigned all considered user-journeys to the four fields of a 2x2 matrix (generic→branded, generic→generic, branded→branded, branded→generic). For the level 2 spillover effect we necessarily had fewer user-journeys than for level 1, since we required those users to have at least one (relevant) click in their journeys.
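One possible reading of the two definitions, expressed as code, is sketched below. It assumes that a user-journey is a chronologically ordered, non-empty list of searches that have already been categorized; the `Search` type and the function names are hypothetical, and the handling of homogeneous journeys (e.g. generic→generic) follows the 2x2 matrix described above.

```python
from typing import List, NamedTuple, Optional, Tuple


class Search(NamedTuple):
    kind: str             # "generic" or "branded"
    relevant_click: bool  # True if the search led to a click on a company's site


def level1(journey: List[Search]) -> Tuple[str, str]:
    """(type of first search, type switched to later, or same type if no switch)."""
    first = journey[0].kind
    other = "branded" if first == "generic" else "generic"
    later_kinds = {s.kind for s in journey[1:]}
    return (first, other if other in later_kinds else first)


def level2(journey: List[Search]) -> Optional[Tuple[str, str]]:
    """(type of first search, type of last search with a relevant click)."""
    clicked = [s for s in journey if s.relevant_click]
    if not clicked:
        return None  # journeys without a relevant click are not considered
    return (journey[0].kind, clicked[-1].kind)


journey = [Search("generic", False), Search("branded", True), Search("generic", False)]
print(level1(journey), level2(journey))
```

Counting the returned pairs over all users of an industry yields the four cells of the matrix for each definition.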
Our analysis is different from the one of (Rutz
and Bucklin, 2011) as we utilize user-level instead of
keyword-level data. Rutz and Bucklin use the daily
number of generic searches and clicks as indepen-
dent variables modeling a latent construct of aware-
ness which in turn affects the number of brand-related
searches. In addition, they use data from a paid search
advertising campaign of one company thus having a
clear boundary of the study.
The fact that we built our analysis on a complete
query log further enabled us to consider user-specific
behavior within a whole industry instead of analyz-
ing keyword-level data aggregated on a daily basis for
just one company. We captured all search activities for three different industries and determined the number of users who searched for either generic or brand-related terms only or, alternatively, switched from generic to brand-related searches or vice versa. In addition, we were able to establish the point in time at which a user performed an action and analyzed the exact time span in which possible spillover effects might occur.
4 RESULTS
4.1 Spillover Results
The results of the spillover analysis for each industry
following the level 1 spillover definition are shown
in Table 3. It shows the proportion of all unique users
in relation to each group of user-journeys, resulting in
a 2x2 matrix for each industry.
The results indicate that the interaction effects be-
tween generic and brand-related searches differ across
industries. Take, for instance, the healthcare indus-
try. Here, 41.1% of all users conducted only brand-
related and 48.3% only generic searches. Note that
these homogeneous groups also contain user-journeys
with only a single search. The remaining 10.6% of all
users who switched either from generic to branded or
from branded to generic searches during their user-
journeys divide nearly equally into the two hybrid
groups. There is no clear sign of users’ favoring one
specific interaction direction within the healthcare in-
dustry.
The results for the clothing industry, on the other
hand, indicate at least a small spillover behavior from
generic to brand-related search activities. We found
TheUser-journeyinOnlineSearch-AnEmpiricalStudyoftheGeneric-to-BrandedSpilloverEffectbasedonUser-level
Data
149
Table 3: Proportions of first search activities within user-journeys per industry (spillover behavior level 1).
Industry    From     To generic      To branded
Hotel       branded  8.6% (7,258)    20.8% (17,523)
Hotel       generic  56.7% (47,849)  13.9% (11,778)
Clothing    branded  4.7% (2,720)    15.0% (8,784)
Clothing    generic  72.6% (42,218)  7.7% (4,444)
Healthcare  branded  4.9% (897)      41.1% (7,509)
Healthcare  generic  48.3% (8,821)   5.7% (1,028)
Note: Both the percentage and the total number (in brackets) of all unique users within one of the journey-groups are represented for each industry.
a large number of users searching only for generic terms (72.6%), while a much smaller group searched only for brand-related ones (15.0%). Although no more than 12.4% of users in total conducted hybrid searches, the share of user-journeys that started with a generic search followed by a brand-related one (7.7%) was 3.0 percentage points higher than that of the opposite group (4.7%). Users looking for articles in the clothing industry, such as shirts, shoes, or underwear, seemed more likely to switch their type of search from generic to brand-related terms than from brand-related to generic ones.
An obvious sign of spillover behavior was shown by the results for the hotel industry. Here, nearly 14% of all users switched to brand-related searches after they initially searched for generic terms. Although the opposite group, starting with branded and switching to generic terms, was also relatively large (8.6%), it was clearly smaller than the actual spillover group. This can be seen as evidence that users’ search behavior within the hotel industry initially started with broad and general search terms (e.g. “hotel”, “bed and breakfast”, “suite”) and became more (brand-)specific as the search proceeded (e.g. “harrah”, “sheraton”, “hyatt”).
Table 4 shows the results for the level 2 spillover effect. The focus in this spillover definition on users’ clicks on (relevant) links led to a shift in favor of brand-related keywords. This is not surprising, since brand-related queries received far more clicks than generic ones (see Table 2). However, similar to the differentiation between an exploratory and a goal-directed searching mode (e.g. (Janiszewski, 1998) and (Moe, 2003)), this alternative spillover analysis is able to differentiate between users intending to purchase something (which results in a click) and users behaving in a less goal-directed manner (resulting in no click on a company’s website). We acknowledge that there might be a purchase intention even if a user did not click on a link to a company’s website (e.g. when a user clicks on a third party’s link, such as Amazon or eBay).
Table 4: Percentages of first search activity and last search activity before a click within user-journeys per industry (spillover behavior level 2).
Industry    From     To generic     To branded
Hotel       branded  2.7% (338)     50.9% (6,428)
Hotel       generic  12.7% (1,602)  33.7% (4,206)
Clothing    branded  0.9% (36)      63.9% (2,532)
Clothing    generic  6.2% (244)     29.0% (1,150)
Healthcare  branded  1.0% (58)      80.0% (4,206)
Healthcare  generic  9.0% (471)     10.0% (523)
Note: Both the percentages and the total number (in brackets) of all unique users within one of the journey-groups are represented for each industry.
This analysis shows very strong evidence for
spillover behavior. See, for example, the results
for the hotel industry. Here, 33.7% of all users
that clicked on a company’s link started their user-
journeys with a generic search and switched to a
brand-related one. This means that more than one
third of these users first looked for generic terms
before they specified their searches using brand-
related keywords and finally clicked and ended their
search-to-click-processes. The magnitude of the ef-
fect for the hotel industry becomes even more pro-
nounced, since the reverse-spillover effect (users that
first searched for brand-related keywords before they
conducted a generic search and clicked on a com-
pany’s link) was the smallest of all four groups by
far.
Similar findings to those for the hotel industry can be found for the clothing sector. Here, about 29% of all users who clicked on a company’s link looked for brand-related terms after they conducted generic searches. The reverse spillover is negligibly small, since less than 1% of the users first searched for brand-related and afterwards for generic terms before they clicked.
Focusing on the healthcare industry, the findings
of the level 1 spillover analysis can be partly con-
firmed. Again, a large number of users seemed to
search only for brand-related or generic terms be-
fore clicking on a respective link. Nearly 90% of
all users who clicked on a company’s link did not
switch their type of search, that is they searched only
for either brand-related or generic terms. This fact
had an immediate influence on the strength of the
ICE-B2012-InternationalConferenceone-Business
150
Table 5: Time spans for each industry in which the spillovers occurred.
Quantiles of time difference in days
Industry    Level    10%   25%   50%    75%    90%
Hotel       Level 1  0.01  0.10  8.95   30.13  55.47
Hotel       Level 2  0.03  4.85  22.87  51.17  70.15
Clothing    Level 1  0.01  1.73  14.91  37.54  59.07
Clothing    Level 2  0.04  6.86  23.12  47.86  66.70
Healthcare  Level 1  0.00  0.02  4.95   22.01  49.91
Healthcare  Level 2  0.01  0.72  12.07  37.26  60.90
Note: “Level 1” denotes the time differences in “Spillover Behavior Level 1”. “Level 2” denotes the time differences in “Spillover Behavior Level 2” accordingly.
spillover effect from generic to brand-related search: only a relatively small number of users (nearly 10%) first searched for generic terms and then conducted a brand-related search before they clicked on a corresponding link.
We are not able to fully explain this relatively
small number of interactions between generic and
branded terms, especially when compared to our find-
ings for the hotel and clothing industry. Still, there
are some differences in the data of the three industries
(see Table 2) that may at least give a clue to this rela-
tive lack of interactions. For example, the total num-
ber of impressions within the healthcare industry is
considerably smaller than in the two others. In partic-
ular, there are about ten times fewer generic impres-
sions. Also, the CTR of the brand-related keywords is
by far the highest. This can already be seen as an in-
dicator for a relatively small interaction-rate between
generic and brand-related searches, since these two
metrics indicate that searches of a single type alone
seem to be able to lead to a possible solution for the
user.
4.2 Time Differences between Actions
Table 5 shows the time spans and the quantiles for
each industry within which the generic-to-branded
spillover effects occurred. “Level 1” denotes the time
differences of spillover behavior from (first) generic
to (first) brand-related searches (see the above def-
inition of “Spillover Behavior Level 1”). The time
that elapsed between the users’ first generic search
and last brand-related one that resulted in a click on
a company’s link is denoted by “Level 2” (“Spillover
Behavior Level 2”).
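As a small worked example of how such a time difference can be expressed in days, the sketch below subtracts two timestamps and converts the result. The timestamps and the list of differences are hypothetical; the paper does not state which quantile estimator was used, so only the median is computed here.

```python
from datetime import datetime
from statistics import median

SECONDS_PER_DAY = 86_400


def days_between(start, end):
    """Elapsed time between two searches, expressed in days."""
    return (end - start).total_seconds() / SECONDS_PER_DAY


# Hypothetical journey: first generic search, then first brand-related search.
first_generic = datetime(2006, 3, 1, 9, 0)
first_branded = datetime(2006, 3, 1, 11, 24)

level1_days = days_between(first_generic, first_branded)
print(round(level1_days, 2), "days =", round(level1_days * 24, 1), "hours")

# Hypothetical per-user differences (in days) for one industry.
differences = [0.02, 0.10, 8.95, 30.13, 55.47]
print("median:", median(differences))
```

For “Level 2”, the end point would be the last brand-related search that led to a relevant click rather than the first brand-related search.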
Let us first focus on the hotel industry. The results of the analysis of the time differences for the first spillover definition suggest that 25% of all users switched from generic to brand-related searches within only 2.4 hours (0.10 days). These users changed their search behavior very quickly compared with the median time span of about 9 days (8.95) in which the generic-to-branded spillover occurred. As expected, the “Level 2” results indicate a longer time difference, since we focused on users’ last brand-related search that led to a click on a company’s link. Overall, our findings are consistent with the results of (Rutz and Bucklin, 2011), who find that the search process for lodging seems to be short and to occur mainly within a couple of days to two weeks. Our results point to an even shorter time difference, since we found many users who switched their type of search within only a few hours and up to one day. That the rest of the users spread their search process over the remaining investigation period (see Figure 3) seems likely to be a coincidence.
The results for the clothing industry show some deviations from those for the hotel industry. Although we expected a search process for garments to change from generic to brand-related terms in a shorter period of time than a search for the right hotel, this was not borne out by the findings for this industry. Indeed, the median time span of the spillover behavior was about 2 to 3 weeks. This can be explained by the possibility that users are confronted with far more choices than in the hotel industry, since competition in the clothing industry is not restricted to a certain location. In other words, a user searching for a hotel in a specific location (e.g. “hotel barcelona”) might not have as many alternatives when searching for an appropriate room as a user looking for a specific garment (e.g. “hoodie jacket”). This circumstance could reduce the time users need to spend on finding the right hotel. Another explanation of the relatively long time differences compared to the hotel industry is that searching for garments might not be as intensive and target-oriented as looking for the right hotel.
In the healthcare industry, the time spans in which
the spillover occurred were the shortest by far. Focusing
on the “Level 1”-results, 25% of all users whose
journey showed spillover behavior switched the type
of search within half an hour (0.02 days). The results of
“Level 2” confirm this short interaction period, since
more than one out of four spillovers took no
more than one day even when clicks are considered.
Following our previous chain of reasoning, this short
period between switching from one search to the other
appears reasonable: the (health) insurance sector is
easily the least fragmented in comparison with the
other two. This is demonstrated, for instance, by the
total number of brand-related keywords that we defined
for this industry (120) as opposed to the hotel (190)
and clothing (720) industries. This
limited number of keywords increases the probability
of users switching to a brand-related search in a
shorter period of time. Either they found the “right”
company more easily or they simply were in a more
goal-directed mode compared to users searching within the
clothing industry.
Figure 3: Time spans within which “Spillover Behavior Level 1” (left) and “Spillover Behavior Level 2” (right) occurred, shown for the hotel, clothing, and healthcare industries (axis ticks from 0 to 80 days).
All results are additionally illustrated in Fig-
ure 3, which displays the respective number of user-
journeys for each industry that showed a very short
time span in which the spillover effects occurred (see
the cumulative density at point “0 Days”). Further,
it can be seen that each curve rises only slowly over
the remainder of the total investigation period of 90 days. The
analysis of the time difference provides evidence for
both a substantial percentage of users who switched
from generic to brand-related searches in a very short
period of time (several hours to less than one day)
and the remaining users, who changed their
type of search over several days or weeks.
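A cumulative view of the kind described for Figure 3 can be sketched as follows. The time spans are invented for illustration, and evaluating the share of journeys at a fixed grid of days is an assumption about how such a curve could be computed, not the procedure used for the figure.

```python
from bisect import bisect_right


def cumulative_share(spans_days, grid_days):
    """Share of journeys whose spillover time span is at most `day` days."""
    ordered = sorted(spans_days)
    total = len(ordered)
    # bisect_right counts how many sorted spans are <= day.
    return [(day, bisect_right(ordered, day) / total) for day in grid_days]


# Hypothetical time spans in days (not taken from the query log).
SPANS = [0.02, 0.1, 0.5, 3, 9, 21, 45, 88]

curve = cumulative_share(SPANS, [1, 20, 40, 60, 80, 90])
for day, share in curve:
    print(f"within {day:>2} days: {share:.3f}")
```

In this toy data a sizeable share of journeys falls within the first day, and the curve then rises slowly until the end of the 90-day period, which mirrors the qualitative pattern described in the text.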
5 DISCUSSION
Our results emphasize the role of generic search ac-
tivity in the users’ search and decision process since
it is crucially important for companies and individual
brands to gain the users’ attention during the search
process and thus probably to be considered a poten-
tial solution.
By investigating online search behavior we were
able to show that traditional consumer research holds up
and can be applied in the online setting. Following this
literature, consumers’ decision processes can be di-
vided into at least two stages: the first is a set of
products or brands a consumer is aware of at any
given point of time. This is referred to as the “aware-
ness set”. In the second, the so-called “evoked set”,
a consumer is likely to reduce his or her set of prod-
ucts that the final decision is based upon (Howard and
Sheth, 1968). Therefore, the chance of a brand being
considered for purchase does not exist if that brand
is not part of a consumer’s awareness set (Narayana
and Markin, 1975). When a consumer does not know
which brand may satisfy his or her initial need (i.e.,
has no distinctive evoked or awareness set), a search
process will more likely begin with general searches
and narrow by becoming (brand-)specific
as it proceeds. We can confirm this assumption within
the setting of online search, since it is indeed crucial
for companies to gain the users’ attention during the
early stage of their search and decision processes.
With respect to understanding users’ search behavior
in general and paid search advertising
in particular, we agree with the work of (Rutz
and Bucklin, 2011), since we found the generic search
activities to be of primary importance for the users’ search
and decision process: although generic searches receive
far fewer clicks than brand-related ones, they are
indeed indispensable in managing online advertising
campaigns. Since there are also strong differences
in the extent of the spillover effects (e.g. hotel vs.
healthcare industry), it is recommended for each com-
pany to identify the degree of this interaction effect
between generic and brand-related activities and ad-
just their spending accordingly.
The differences in the extent of the spillover behavior
across industries might not only result from
different market and competition set-ups, but might
also be due to different factors influencing users’
search behavior, such as involvement.
Searching for the right hotel room or for the best
health insurance may be more functionally driven
ICE-B2012-InternationalConferenceone-Business
152
than choosing the right sneakers or underwear. A
further differentiation into users with a goal-directed
searching mode and those with a low purchase inten-
tion may reveal some more insights into the extent of
the spillover effect. We focused on that circumstance
by considering clicks on companies’ websites (def-
inition Spillover Behavior Level 2) and have found
a much stronger spillover behavior in the first group
compared to users that search only. Further research
should investigate this differentiation in more detail.
The opportunity to investigate users’ search behavior
on the basis of a very large query log comes with a variety
of drawbacks. Turning to our filtering and categorization
procedures, there may be several sources that
could skew our findings. Although we have been very
thorough in the keyword creation process, we still
might not have captured all of the generic and brand-related
search activities. For instance, we may have
defined too few brand-related keywords, which would
result in too few brand-related queries being filtered
out of the initial query log. If, on the other hand,
we had filtered out more brand-related queries, this
would have, under constant conditions and at the expense
of the generic homogeneous group, increased both
the hybrid and the homogeneous (“from branded
to branded”) search groups proportionally. We also
could have picked out more (unknown) companies
instead of focusing on the top ten for each industry.
Unknown brands might not receive as many homogeneous
searches as the top ten companies. Additionally,
users would be more likely to become aware
of these unknown brands through generic searches,
which probably would have increased the spillover ef-
fect from generic to branded searches. Restricting our
study to the top companies therefore can be seen as
resulting in conservative figures since it more likely
leads to an underestimation than an overestimation of
the spillover effect.
6 CONCLUSIONS
To better understand the search behavior of users we
analyzed the query log published by AOL in 2006
(Pass et al., 2006) for the occurrence of the spillover
effect first described by (Rutz and Bucklin, 2011).
Our results confirm the occurrence of spillover effects
in online search behavior, but also show that these ef-
fects differ between industries.
Starting from the initial AOL query log we (1)
filtered out every brand-related log record that could
refer to one of the selected companies or their prod-
ucts, and (2) filtered out log records of the remain-
ing queries that contained generic terms for a specific
industry. Within our analysis we found more user-
journeys starting with generic and switching to brand-
related searches than the other way round. Among the
three industries selected, the effect is most noticeable
in the hotel industry, closely followed by the clothing industry.
The weakest degree of a spillover was detected in the
healthcare industry.
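The two filtering steps and the direction of a user-journey can be sketched as follows. The keyword sets are tiny hypothetical stand-ins for the manually defined lists, the labels are illustrative rather than the group names used in the analysis, and whole-word matching is a simplifying assumption.

```python
# Hypothetical keyword lists; the study used far larger, manually curated ones.
BRAND_TERMS = {"adidas"}
GENERIC_TERMS = {"sneakers", "hoodie"}


def classify_query(query):
    """Step (1): brand-related terms take precedence; step (2): generic terms."""
    tokens = set(query.lower().split())
    if tokens & BRAND_TERMS:
        return "brand"
    if tokens & GENERIC_TERMS:
        return "generic"
    return None  # query is unrelated to the industry and is dropped


def classify_journey(queries):
    """Label a chronologically ordered list of one user's queries."""
    kinds = [kind for kind in map(classify_query, queries) if kind is not None]
    if not kinds:
        return "none"
    if len(set(kinds)) == 1:
        return kinds[0] + " only"
    # Mixed journeys are labeled by the type of the first relevant query.
    return "generic-to-branded" if kinds[0] == "generic" else "branded-to-generic"


journeys = {
    "u1": ["sneakers", "cheap sneakers", "sneakers adidas"],
    "u2": ["adidas", "hoodie jacket"],
    "u3": ["sneakers"],
    "u4": ["weather forecast"],
}
labels = {user: classify_journey(queries) for user, queries in journeys.items()}
print(labels)
```

Checking brand terms first means that a query such as “sneakers adidas” counts as brand-related even though it also contains a generic term, in line with the order of the two filtering steps.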
We were able to show the key role of generic activities
and the early anchoring of companies and brands
within user-journeys. Both our understanding of
users’ search activity as a process and our findings
of the spillover effect within user-journeys allowed
us to transfer the theoretical concepts of the evoked
and (un-)awareness set of (Howard and Sheth, 1968)
and (Narayana and Markin, 1975) to the user’s search
and decision process in the context of search en-
gines: the development of the search process from
generic to brand-related searches may at least par-
tially be attributed to the fact that individual brands
gain the users’ attention during the search process,
who thus “become aware” of these companies or
brands. The focus on generic activities becomes in-
dispensable since they seem to play the role of gate-
keeper for companies in online advertising.
This paper has several limitations. Although we
have been very thorough in the data filtering process,
there might be effects that blurred our results, since
the manually selected and revisited generic keywords
for each industry as well as the brand-related ones for
each company might not capture all users and their
search intentions. Another limitation is the missing
information on conversions, which would have enabled
us to specify the occurrence of the spillover effect
more precisely. Our approach, therefore, neglects
the fact that one user might have made several purchases
during the investigation period within one industry.
An alternative dataset might overcome these
limitations in a further investigation.
Although we analyzed several industries in this
paper, we neither considered possible interactions be-
tween them nor between companies within these (e.g.
spillovers from company A to company B). Since, for
example, (Ghose and Yang, 2010) found that there
are cross-category purchases within one company for
several keywords in sponsored search, a complete
query log to some extent contains information for
such a further analysis. Our work can be seen as a
first step in directing attention more to the behavioral
aspects of users’ online search activities.
TheUser-journeyinOnlineSearch-AnEmpiricalStudyoftheGeneric-to-BrandedSpilloverEffectbasedonUser-level
Data
153
REFERENCES
Abhishek, V., Hosanagar, K., and Fader, P. S. (2011). On
aggregation bias in sponsored search data: Existence
and implications.
Agarwal, A., Hosanagar, K., and Smith, M. D. (2011). Location,
location, location: An analysis of profitability
of position in online advertising markets. Journal of
Marketing Research.
Animesh, A., Viswanathan, S., and Agarwal, R. (2011).
Competing “creatively” in sponsored search markets:
The effect of rank, differentiation strategy, and com-
petition on performance. Information Systems Re-
search, 22(1):153–169.
Broder, A. (2002). A taxonomy of web search. SIGIR Fo-
rum, 36(2):3–10.
Bucklin, R. E. and Sismeiro, C. (2009). Click here for
internet insight: Advances in clickstream data anal-
ysis in marketing. Journal of Interactive Marketing,
23(1):35–48.
Chan, T. Y., Xie, Y., and Wu, C. (2011). Measuring the life-
time value of customers acquired from google search
advertising. Marketing Science.
Ghose, A. and Yang, S. (2008). Comparing performance
metrics in organic search with sponsored search ad-
vertising.
Ghose, A. and Yang, S. (2009). An empirical analysis of
search engine advertising: Sponsored search in elec-
tronic markets. Management Science, 55(10):1605–
1622.
Ghose, A. and Yang, S. (2010). Modeling cross-category
purchases in sponsored search advertising.
Howard, J. A. and Sheth, J. N. (1968). A theory of buyer be-
havior. Rivista internazionale di scienze economiche
e commerciali, 15(6):589–618.
Interbrand & Business Week (2006). Interbrand’s best
global brands 2006.
Janiszewski, C. (1998). The influence of display character-
istics on visual exploratory search behavior. Journal
of Consumer Research, 25(3):290–301.
Jansen, B. J. and Mullen, T. (2008). Sponsored search: an
overview of the concept, history, and technology. In-
ternational Journal of Electronic Business, 6(2):114–
131.
Jansen, B. J. and Spink, A. (2007). The effect on click-
through of combining sponsored and non-sponsored
search engine results in a single listing. Proc. 2007
Workshop on Sponsored Search Auctions, Banff, AB,
Canada.
Johnson, E. J., Moe, W. W., Fader, P. S., Bellman, S.,
and Lohse, G. L. (2004). On the depth and dynam-
ics of online search behavior. Management Science,
50(3):299–308.
Moe, W. W. (2003). Buying, searching, or browsing: Differ-
entiating between online shoppers using in-store nav-
igational clickstream. Journal of Consumer Psychol-
ogy, 13(1-2):29–39.
Narayana, C. L. and Markin, R. J. (1975). Consumer be-
havior and product performance: An alternative con-
ceptualization. The Journal of Marketing, 39(4):1–6.
Pass, G., Chowdhury, A., and Torgeson, C. (2006). A pic-
ture of search. In Proceedings of the 1st international
conference on Scalable information systems, InfoS-
cale ’06, New York, NY, USA. ACM.
Rutz, O. J. and Bucklin, R. E. (2011). From generic to
branded: A model of spillover in paid search adver-
tising. Journal of Marketing Research, 48(1):87–102.
Rutz, O. J., Trusov, M., and Bucklin, R. E. (2011). Model-
ing indirect effects of paid search advertising: Which
keywords lead to more future visits? Marketing Sci-
ence, 30:646–665.
Search Engine Watch (2006). Delving deep inside the
searcher’s mind. http://searchenginewatch.com/3406911.
Yang, S. and Ghose, A. (2010). Analyzing the relationship
between organic and sponsored search advertising:
Positive, negative, or zero interdependence? Market-
ing Science, 29(4):602–623.
ICE-B2012-InternationalConferenceone-Business
154