potential clients. In this context, a web workload
model should reproduce the HTTP requests that a real
web server typically receives in daily working ses-
sions.
Nevertheless, the implicit dynamism in users’ behavior
and in the generation of web content makes it
hard to design accurate web workload models that
represent users’ navigations. On the one hand, users
require customized information and advanced ser-
vices from web applications. This requirement im-
plies that the contents are dynamically generated from
different data sources. On the other hand, during a
typical navigation, users tend to select pages to visit
depending on the previously visited ones, and particu-
larly on their contents or characteristics such as the re-
sponse time. Thus, the dynamism in the content gen-
eration implies that the users’ navigation may differ
depending on the contents generated by the web ap-
plication during a given period of time. In summary,
the dynamic content introduces dynamism in users’
navigation, that is, it produces dynamic user behavior.
For example, a high percentage of users’ navigation
sessions begin by searching for a dynamic resource in
a specialized site and then continue by visiting one or
more sites, depending on the results of the previous
searches.
In general, the main challenges when modeling
web workloads are: i) how to model users’ dynamism,
ii) how to represent the different roles that users play
in the web, and iii) how to model continuous changes
in users’ behavior (Weinreich et al., 2006).
In a previous work (Peña-Ortiz et al., 2005) we
proposed the GUERNICA approach to model the dynamism
of the web workload based on users’ behavior
models. This paper extends that work in several
ways. First, we review the state of the art on web
workload generators, software products that are designed
and implemented to generate web workload,
and classify them into three main groups according to
their capability to generate web workload and/or to
model the users’ dynamic behavior. Second, we com-
pare GUERNICA to other web workload generators.
The study reveals that five web workload generators
present some capabilities to reproduce this dynamism,
but only our approach improves the dynamic work-
load generation by using users’ behavior models.
The remainder of this paper is organized as follows.
Section 2 reviews, analyzes and classifies a representative
subset of workload generators. Section 3
highlights some GUERNICA characteristics.
Finally, Section 4 presents some concluding
remarks.
2 WEB WORKLOAD GENERATORS
Dynamic web applications and services have induced
continuous changes in users’ navigation patterns, thus
making it difficult to characterize the web workload.
Currently, we can reproduce the client behavior by
using either traces or workload models. Traces log the
sequence of HTTP requests and commands received
by a web application or a web server during a certain
period of time and under specific conditions. There-
fore, traces are obtained in a particular environment
(e.g., server process speed or network bandwidth) for
a specific application. This means that if any system
parameter changes (e.g., new contents in a news portal),
the resulting trace might differ. Therefore, the
main concern in trace-based studies is the representativeness
that can be achieved, especially when the
requests received by a given web server exhibit high
variability, because a trace only reproduces a subset of
the server’s clients. Consequently, traces are not
appropriate to model changes in the client behavior.
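To make the limitation of traces concrete, the following minimal sketch turns a trace into a fixed replay schedule. It assumes the trace is stored in the Common Log Format; the regular expression, the function name `replay_schedule` and the sample entries are illustrative assumptions, not part of any tool discussed here.

```python
import re
from datetime import datetime

# Assumption: the trace uses the Common Log Format.
LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" \d{3} \S+'
)

def replay_schedule(lines):
    """Return (offset_seconds, method, path) for every parsable entry.

    Offsets are relative to the first parsable request in the trace.
    """
    parsed = []
    for line in lines:
        match = LINE.match(line)
        if match is None:
            continue  # skip malformed entries
        ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        parsed.append((ts, match.group("method"), match.group("path")))
    if not parsed:
        return []
    start = parsed[0][0]
    return [((ts - start).total_seconds(), method, path)
            for ts, method, path in parsed]

# Hypothetical two-request trace, invented for illustration only.
SAMPLE_TRACE = [
    '10.0.0.1 - - [10/Oct/2010:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2010:13:55:41 +0000] "GET /search?q=news HTTP/1.1" 200 5120',
]
```

The schedule is entirely determined by the recorded entries: replaying it against a server whose contents or response times have changed still issues the same requests at the same offsets, which is why a trace cannot reflect changes in client behavior.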
On the other hand, parameterizable workload
models are abstractions of the real workload that
hide those characteristics not relevant for a particular
study. Unlike traces, this kind of model
is accurate enough when reproducing the client
behavior or when evaluating the performance of web
applications. Parameterizable workload models gen-
erate sequences of HTTP requests similar to the real
sequences and different scenarios can be configured
by properly setting the corresponding parameters.
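As an illustration of what "properly setting the corresponding parameters" can mean, the sketch below generates a synthetic request sequence from a handful of parameters. The parameter names (`pages`, `mean_think_time`, `requests_per_session`, `seed`), their default values and the exponential think-time assumption are all hypothetical choices made for this example; they do not describe any particular generator.

```python
import random
from dataclasses import dataclass, field

@dataclass
class WorkloadParams:
    # Assumed page popularity weights (hypothetical site layout).
    pages: dict = field(
        default_factory=lambda: {"/": 0.5, "/search": 0.3, "/item": 0.2}
    )
    mean_think_time: float = 5.0    # seconds between requests (assumed)
    requests_per_session: int = 10  # fixed session length (assumed)
    seed: int = 0                   # makes a scenario reproducible

def generate_session(params):
    """Return a list of (time_offset, method, path) tuples."""
    rng = random.Random(params.seed)
    paths = list(params.pages)
    weights = list(params.pages.values())
    clock = 0.0
    session = []
    for _ in range(params.requests_per_session):
        # Each page is drawn independently of the pages visited before.
        path = rng.choices(paths, weights=weights, k=1)[0]
        session.append((clock, "GET", path))
        # Exponentially distributed think time with the configured mean.
        clock += rng.expovariate(1.0 / params.mean_think_time)
    return session
```

Changing a parameter, for example `WorkloadParams(mean_think_time=1.0)`, yields a different scenario without collecting a new trace. Note that this simple model draws every request independently, so it does not yet capture navigation that depends on previously visited pages.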
The representation of the dynamic users’ behavior
has been addressed in some web workload characteri-
zation studies when defining dynamic web site bench-
marks (Amza et al., 2002) or when studying the per-
formance and scalability of the technology used to de-
velop dynamic sites (Cecchet et al., 2002). However,
user dynamism is complex by nature, and current re-
sults are still far from being precise and satisfactory.
Therefore, more research effort is needed in this
direction in order to provide more accurate tools to
model and evaluate web performance. In this sense,
the Customer Behavior Model Graph (Menascé and
Almeida, 2000) was introduced to be used as input to
a workload model of an e-business site.
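A Customer Behavior Model Graph can be read as a state-transition graph whose edges carry transition probabilities, so a session is a random walk from an entry state to an exit state. The sketch below follows that reading; the state names and every probability are invented for illustration and are not taken from the cited work.

```python
import random

# Hypothetical e-business graph: state -> {next_state: probability}.
CBMG = {
    "Entry":  {"Home": 1.0},
    "Home":   {"Browse": 0.5, "Search": 0.4, "Exit": 0.1},
    "Browse": {"Browse": 0.3, "Search": 0.2, "Select": 0.3, "Exit": 0.2},
    "Search": {"Search": 0.2, "Browse": 0.2, "Select": 0.4, "Exit": 0.2},
    "Select": {"Browse": 0.3, "Search": 0.2, "Pay": 0.2, "Exit": 0.3},
    "Pay":    {"Exit": 1.0},
}

def generate_session(graph, rng, max_steps=1000):
    """Walk the graph from "Entry" until "Exit"; return the visited states."""
    state = "Entry"
    visited = []
    for _ in range(max_steps):  # guard against very long walks
        successors = graph[state]
        state = rng.choices(
            list(successors), weights=list(successors.values()), k=1
        )[0]
        if state == "Exit":
            break
        visited.append(state)
    return visited
```

Calling `generate_session(CBMG, random.Random(1))` yields one synthetic session as a list of state names. Unlike an independent draw per request, the next state here depends on the current one, although the probabilities themselves stay fixed during the walk.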
Workload models are the basis of workload generators,
which are software products designed and
implemented to generate HTTP requests. Generators are
flexible tools useful to address tuning or capacity
planning studies but, unfortunately, current genera-
tors only represent the users’ dynamic behavior in a
partial way; for instance, they cannot represent user
navigation changes as a response to the quality of the
content, the quality of its generation, or the character-
WEBIST 2011 - 7th International Conference on Web Information Systems and Technologies