KEEPING TRACK OF HOW USERS USE CLIENT DEVICES
An Asynchronous Client-Side Event Logger Model
Vagner Figuerêdo de Santana and Maria Cecilia Calani Baranauskas
Institute of Computing, State University of Campinas, Albert Einstein Street, Campinas, Brazil
Keywords: Event logger, Usage logger, Client-side logger.
Abstract: Web Usage Mining usually considers server logs as a data source for collecting patterns of usage data. This
solution presents limitations when the goal is to represent how users interact with specific user interface
elements, since this approach may not have detailed information about users’ actions. This paper presents a
model for logging client-side events and an implementation of it as a website evaluation tool. By using the
model presented here, mining systems can capture detailed Web usage data, enabling a fine-grained
examination of Web page usage. In addition, the model can help Human-Computer Interaction
practitioners to log client-side events of mobile devices, set-top boxes, and Web pages, among other artefacts.
1 INTRODUCTION
Several studies have addressed Web usage, ranging
from Accessibility and Usability (A&U) guidelines
to tools that analyze code, content, or logs of
websites. Additionally, Data Mining is the “analysis
of (often large) observational data sets to find
unsuspected relationships and to summarize the data
in novel ways that are both understandable and
useful to the data owner” (Hand et al., 2001). The
use of Data Mining techniques over Web usage data
is called Web Usage Mining (WUM).
WUM algorithms and tools focus mainly on Web
server logs. On the one hand, server-side data make
it possible to identify a user's route through a website
with little effort, since Web server logs are a
natural by-product of its use; however, they
do not contain representative data about the
interactions between the user and the Web page
(Etgen and Cantor, 1999). On the other hand,
client-side data have more detailed information about user
actions in a Web page, but require more effort to
capture and transfer.
User Interface (UI) events are natural results of
using window-based interfaces and their
components (e.g., mouse movements, keystrokes,
mouse clicks, list selections) (Hilbert and
Redmiles, 2000). Additionally, event logs yield
results such as the frequency of use of certain functions,
the places where users spend more time, and the
sequence in which users complete their tasks (Woo and
Mori, 2004). Since UI events can be recorded and
indicate the user’s behaviour, they represent an
important source of information regarding usability.
Nowadays, the HCI (Human-Computer Interaction)
community counts on tools that keep track of users'
behaviour using mouse tracks (e.g., MouseTrack
(Arroyo et al., 2006)) and eye tracks (e.g., eyebox2
(Skeen, 2007)), but the literature lacks studies focusing
on data captured automatically from the whole
diversity of users. In particular, accessibility
evaluation tools need to address some aspects
usually not covered by evaluation tools based on
mouse events or visual display. How would tools that
rely on mouse or eye movements keep track of
screen reader users?
In this context, we present a model to log
client-side data and capture as many different events as
possible, since with a large vocabulary of events,
researchers can perform a wider range of analyses.
Then a case study implementation is presented as
part of WELFIT (Web Event Logger and Flow
Identification Tool), a tool to identify barriers that
assistive technology users face when using websites.
This work is organized as follows: the next
section presents work related to client-side event
capture; section 3 details the proposed model;
section 4 discusses an implementation in the Web
context; finally, section 5 presents conclusions.
de Santana V. and Calani Baranauskas M. (2009).
KEEPING TRACK OF HOW USERS USE CLIENT DEVICES - An Asynchronous Client-Side Event Logger Model.
In Proceedings of the 11th International Conference on Enterprise Information Systems - Human-Computer Interaction, pages 165-168
DOI: 10.5220/0001951001650168
Copyright © SciTePress
2 CLIENT-SIDE LOGGERS
In this section we discuss WET (Etgen and
Cantor, 1999) and WebRemUSINE (Paganelli and
Paternò, 2002), two website evaluation tools based on
client-side event logs.
WET focuses on the automatic capture
of events that occur on the client side, avoiding the high
costs in time and money of manual data
capture methods (Etgen and Cantor, 1999). The log
is recorded in text format on the client side (i.e.,
in cookies), and capturing events depends on user
actions. Points that deserve
further work include: a way to record more data,
a larger event vocabulary, and independence from
user actions (Etgen and Cantor, 1999).
WebRemUSINE (Paganelli and Paternò, 2002)
automatically captures and analyses website
interactions in order to detect usability problems. The
analysis is based on a comparison between the
paths followed by users and an optimum task model
configured beforehand (Paganelli and Paternò, 2002).
The storage and transmission of the logs are handled
by a Java applet, which allows the tool to
avoid the storage limits of cookies (Paganelli and
Paternò, 2002). For the user, using this tool involves
splitting his/her screen into two regions, one for the
list of tasks, from which the participant must choose before
starting each task, and the other containing the website being
evaluated (Paganelli and Paternò, 2002).
The common characteristics of these tools point to
the main goals of client-side data loggers: to
capture events on the client side and to transmit the
logged data to a server, where all the analysis is
performed.
3 THE PROPOSED MODEL
The model was designed so that its setup and use
require just one change in the applications to be
evaluated: a call to the client-side event logger code.
Thus, as soon as a participant starts the test session
and agrees to participate in the test, the tool starts
to record events that occur on the client side until the
participant cancels his/her participation.
Analysis based on the requirements presented in
Santana and Baranauskas (2008) for evaluation tools
based on event logs indicated two main components
of the model: the DataLogger, responsible for
capturing event data, and the Communicator,
responsible for transmitting logs to the server. The
DataLogger is the component attached to the
high-level subject to be observed (e.g., the Window object),
so it can be notified to record all the events that occur.
It is inspired by the GoF (Gang of Four) Observer
Pattern (Gamma et al., 1995). The Communicator
component controls the transmission of the logged
data to the server. It also keeps the information sent
regarding the identification of logs and keeps the
server responses. Moreover, other components were
added in order to modularize the model and fulfil all
the requirements considered.
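The division of labour between the two main components can be illustrated with a minimal JavaScript sketch. It is not the WELFIT code: the class layout, the method names (attach, record, drain, transmit), and the injected send function are assumptions made only for illustration. The DataLogger registers itself as an observer of a subject exposing addEventListener, and the Communicator numbers each batch so that server responses can be matched to the logs sent.

```javascript
// Hypothetical sketch: class and method names are illustrative
// assumptions, not taken from the WELFIT source code.

// Observer: registers itself on a subject (e.g. the Window object)
// and records every event it is notified about.
class DataLogger {
  constructor(eventTypes) {
    this.eventTypes = eventTypes; // event vocabulary to capture
    this.log = [];                // in-memory log entries
    this.handler = (event) => this.record(event);
  }

  // Attach to any subject exposing addEventListener.
  attach(subject) {
    for (const type of this.eventTypes) {
      subject.addEventListener(type, this.handler, true);
    }
  }

  record(event) {
    this.log.push({ type: event.type, time: Date.now() });
  }

  // Hand the pending entries over and start a fresh log.
  drain() {
    const pending = this.log;
    this.log = [];
    return pending;
  }
}

// Communicator: numbers each batch and keeps the server responses,
// indexed by the identifier of the batch they answer.
class Communicator {
  constructor(send) {
    this.send = send; // transport function (assumed, injected)
    this.nextId = 1;
    this.responses = new Map();
  }

  transmit(entries) {
    const id = this.nextId++;
    this.send({ id, entries }, (response) => this.responses.set(id, response));
    return id;
  }
}
```

In a browser, a call such as new DataLogger(["click", "keypress"]).attach(window) would be the single hook required in the evaluated page, under the assumptions above.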
Figure 1: The client-side event logger model overview.
Due to the frequency with which UI events are
triggered, the number of events that can occur within
a few minutes can be huge. Therefore, any data logger that
must transmit logged data to a server should
compact the data. This brought the need for a
component to compact the data, the LogCompactor.
Its role is to avoid the heavy consumption of the
client’s bandwidth that may occur if the
raw log is transferred to the server. Also, to deal
with the amount of data recorded, we used
asynchronous communication with the server as a
strategy to interfere as little as possible with the use
of the UI.
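The paper does not state which compaction scheme the LogCompactor applies, so the following JavaScript sketch only illustrates the idea with an assumed one: event types are replaced by indexes into a small dictionary, and absolute timestamps are replaced by the difference from the previous event. The function names compactLog and expandLog and the textual format are hypothetical, and event type names are assumed not to contain the separator characters.

```javascript
// Hypothetical compaction scheme (an assumption; the paper does not
// specify one): event types become short dictionary indexes and
// absolute timestamps become deltas from the previous event.
function compactLog(entries) {
  const types = [];          // dictionary: index -> event type
  const indexOf = new Map(); // event type -> index
  const rows = [];
  let previousTime = 0;

  for (const { type, time } of entries) {
    if (!indexOf.has(type)) {
      indexOf.set(type, types.length);
      types.push(type);
    }
    rows.push(indexOf.get(type) + ":" + (time - previousTime));
    previousTime = time;
  }
  // Format: "<type>,<type>|<index>:<delta>;<index>:<delta>"
  return types.join(",") + "|" + rows.join(";");
}

// Inverse operation, as a server module could apply it.
function expandLog(compacted) {
  const [dictionary, body] = compacted.split("|");
  const types = dictionary.split(",");
  let time = 0;
  if (body === "") return [];
  return body.split(";").map((row) => {
    const [index, delta] = row.split(":").map(Number);
    time += delta; // rebuild the absolute timestamp
    return { type: types[index], time };
  });
}
```

Because UI events such as mouse movements repeat the same type many times at short intervals, a scheme of this kind shortens each entry to a few characters; the actual gain depends on the event mix.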
To manipulate data and perform record, read,
and remove functions, we used a DataAccessObject,
which is based on the Data Access Object (DAO)
J2EE design pattern (Alur et al., 2003). In addition,
we needed a way to interact with the user and show
the status of the logger, which is the responsibility of the Facade
component, inspired by the GoF Facade Pattern
(Gamma et al., 1995). To address privacy policies
we added the PrivacyFilter component, responsible
for checking whether the captured data can be sent to
the server based on previously defined policies (e.g.,
not recording which key is pressed when a keypress
event occurs). Finally, we defined a Factory to create
and assemble all the components. The
Factory is the model creator class. It contains the
information needed to instantiate and build all
components. First, the Factory instantiates itself,
following the GoF Singleton pattern (Gamma et al.,
1995); then it uses two other GoF creational patterns
to instantiate model classes: Factory Method
(Gamma et al., 1995) and Builder (Gamma et al.,
1995). The model follows the MVC (Model View
Controller) pattern (Figure 1). Its main
characteristics are:
• It is lightweight and depends on few resources
of the users' devices, achieving its goal in
different configurations;
• It processes and transmits logs without
interfering with the use of the evaluated UI;
• It uses all available event data that do not
raise security problems;
• It logs usage data without depending on
specific task models, grammars, or events;
• It provides the tool’s status and controls, allowing
users to interrupt the capture.
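As an illustration of the PrivacyFilter role described above, the following JavaScript sketch removes blocked fields from a log entry before it is handed on for transmission. The policy table, its field names, and the function name applyPrivacyFilter are hypothetical; the only behaviour taken from the text is the example policy of not recording which key is pressed on a keypress event.

```javascript
// Hypothetical policy table: for each event type, the fields that must
// not leave the client. Names are illustrative assumptions.
const defaultPolicies = {
  keypress: ["key", "keyCode", "charCode"],
  keydown: ["key", "keyCode"],
};

// Returns a copy of the entry without the fields blocked by the policy
// for its event type; entries of other types are copied unchanged.
function applyPrivacyFilter(entry, policies = defaultPolicies) {
  const blocked = policies[entry.type] || [];
  const filtered = {};
  for (const [field, value] of Object.entries(entry)) {
    if (!blocked.includes(field)) {
      filtered[field] = value;
    }
  }
  return filtered;
}
```

Filtering on a copy, rather than on the original entry, keeps the policy check separate from capture, so policies could be changed without touching the DataLogger.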
4 RESULTS
The implementation of the presented model used
JavaScript, an object-based scripting language
(Netscape, 1999), and the server module used
Java-related technologies following the MVC pattern and
the structure proposed in Basham et al. (2004).
In the Web context, the environment configuration
of the tool requires that the website administrator
register him/herself and the website to be evaluated,
and insert a reference to the JavaScript client
module in the registered website’s pages. From that
point on, if the reference to the client module comes
from a registered website, then each time a user
accesses the page, the server module fills in the
Factory component script with information
unavailable from JavaScript (e.g., the client's IP, a
global identifier for that session, etc.) before serving
it. Then, as soon as the script is loaded, the tool
starts to work.
The JavaScript implementation of the model
proved to be efficient and effective. However, there
were some issues related to the space available to
record information on the client's device and to
exchanging data between different domains. The space
available to record information on the client's device
is restricted. One solution is to use cookies, but they
are limited to a size of 4 kilobytes (kB) and each
domain can specify only 20 cookies (Netscape,
1999). When using cookies, the problem is the time
required to record, retrieve, and delete the logs
without interfering with the use of the website. The
alternative found was to use the Web page structure
in memory, also known as the Document Object Model
(DOM) tree. Application cookies were then used
only to deal with error recovery and to maintain
information valid for more than one session (e.g., the
acceptance of the user).
The biggest issue in implementing the proposed
model in JavaScript was the asynchronous
cross-domain transmission. The problem was to deal with
the security restrictions of XMLHttpRequest, a
JavaScript object widely used in AJAX
(Asynchronous JavaScript And XML) applications,
which only allows connections between Web
pages/applications hosted at the same domain.
The security restriction is called Same Origin
Policy and “prevents document or script loaded from
one origin from getting or setting properties of a
document from a different origin” (Ruderman,
2001). The Same Origin Policy may seem too
restrictive, since it blocks the direct use of Web
services via XMLHttpRequest. However,
allowing scripts to access any domain opens up
users to potential exploitation (Levitt, 2005a).
Some solutions to deal with this restriction are:
Signed Scripts (Ruderman, 2007), Server-side
proxy, Iframe Proxy (Dojo Toolkit, 2006), and Flash
Proxy (Levitt, 2006). These solutions are effective,
but they conflict with the requirements we are
following, since they are not browser independent,
depend on some plug-in, or require a more complex
environment configuration than the one presented here.
Some proposals that would give JavaScript
programmers the power to perform cross-domain
requests are JSONRequest and the <module> tag.
JSONRequest is proposed as a new browser
service that allows data exchange without exposing
users or organizations to harm (Crockford, 2006a).
The <module> tag proposes to divide a Web page
into a collection of modules that are secure from
each other, providing safe communication; it also
proposes how to reach a consensus on a new Web
browser security model, since Web applications are
significantly ahead of Web browser technologies
(Crockford, 2006b).
The solution used is based on an approach
presented in Levitt (2005b). The approach simulates
a JSONRequest using Dynamic Script Tag, which
manipulates the DOM tree to perform requests
through the creation of script tags, allowing cross-
domain asynchronous communication.
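A minimal JavaScript sketch of the Dynamic Script Tag technique follows. It is written under stated assumptions rather than copied from the tool: the function name sendLog, the query parameter names (log, callback), and the callback naming scheme are hypothetical, and the server is assumed to answer with a script that calls the named callback. The doc and scope parameters default to the browser's document and global object and exist only so that the function can be exercised outside a browser.

```javascript
// Sketch of the Dynamic Script Tag technique. The query parameter
// names ("log", "callback") and the callback naming are assumptions.
let requestCounter = 0;

function sendLog(endpoint, compactedLog, onResponse, doc = document, scope = globalThis) {
  const callbackName = "loggerCallback" + (requestCounter++);
  const script = doc.createElement("script");

  // The server is expected to reply with JavaScript that calls the
  // named callback, e.g. loggerCallback0({"status":"ok"}).
  scope[callbackName] = (response) => {
    delete scope[callbackName];
    script.parentNode.removeChild(script); // keep the DOM tree clean
    onResponse(response);
  };

  script.src = endpoint +
    "?log=" + encodeURIComponent(compactedLog) +
    "&callback=" + encodeURIComponent(callbackName);
  script.async = true;

  // Appending the element makes the browser issue the GET request,
  // which is not subject to the Same Origin Policy.
  doc.getElementsByTagName("head")[0].appendChild(script);
  return callbackName;
}
```

Since the payload travels in the URL of a GET request, each call can carry only a limited amount of data, which is one more reason to compact the log and send it in small pieces.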
Logging tests performed in pages generated by
Content Management Systems like Plone and Drupal
showed that each second of interaction results in
approximately 1 kB of compacted logs. Therefore,
any connection that supports the
transmission of 1 kB per second, in addition to the average
bandwidth the participant uses to surf
the Web, allows the model to behave according
to the design without interfering with the use of
the website. Otherwise, the accumulated
amount of log data will reach a configured limit
and the tool will become inactive. The validation of
the model was performed by capturing events during
real use of the website of a research group called
Todos Nós (www.todosnos.unicamp.br), since part
of its audience uses assistive technology. The data
captured over 60 days resulted in 85 recorded
sessions, 6 of them from assistive
technology users, and comprised
more than 270 thousand events.
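The bandwidth argument above can be made concrete with a small JavaScript simulation. The log rate and the cookie limits are the figures given in the text; the LogBuffer class, the simulate function, and the 50 kB limit used in the usage note are hypothetical, since the paper does not state the configured limit.

```javascript
// Figures taken from the text.
const LOG_RATE_KB_PER_S = 1;   // approx. compacted log produced per second
const COOKIE_SIZE_KB = 4;      // size limit of one cookie
const COOKIES_PER_DOMAIN = 20; // cookies allowed per domain

// Seconds of interaction that cookie storage alone could hold:
// (4 kB * 20) / (1 kB/s) = 80 s.
const cookieSeconds = (COOKIE_SIZE_KB * COOKIES_PER_DOMAIN) / LOG_RATE_KB_PER_S;

// Hypothetical buffer: the limit value and the names are assumptions.
class LogBuffer {
  constructor(limitKb) {
    this.limitKb = limitKb;
    this.pendingKb = 0;
    this.active = true;
  }

  // Called when new compacted log data is produced.
  produce(kb) {
    if (!this.active) return;
    this.pendingKb += kb;
    if (this.pendingKb >= this.limitKb) {
      this.active = false; // limit reached: the tool stops logging
    }
  }

  // Called when the server acknowledges transmitted data.
  acknowledge(kb) {
    this.pendingKb = Math.max(0, this.pendingKb - kb);
  }
}

// Simulates a session and returns the second at which the logger
// became inactive, or null if the connection kept up.
function simulate(seconds, uplinkKbPerS, limitKb) {
  const buffer = new LogBuffer(limitKb);
  for (let t = 1; t <= seconds; t++) {
    buffer.produce(LOG_RATE_KB_PER_S);
    if (!buffer.active) return t;
    buffer.acknowledge(Math.min(uplinkKbPerS, buffer.pendingKb));
  }
  return null;
}
```

With an assumed limit of 50 kB, simulate(60, 1, 50) returns null because the connection drains the buffer as fast as it fills, whereas simulate(600, 0.5, 50) returns 99: at half the required rate, the backlog grows by 0.5 kB per second and the limit is reached in the 99th second.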
5 CONCLUSIONS
The proposed model proved to be lightweight and
addressed the requirements stated in Santana and
Baranauskas (2008). Also, it can supply data to other
applications to discover usage patterns.
During implementation and use of this model,
maintainers and developers must always keep
security and privacy in mind, since the information
being captured and transmitted can be critical if it is
not filtered and/or handled in a safe way. Accordingly,
users must always be aware of what is happening on
their device and agree to participate before the
logger starts to record events, since freely recording
this kind of information without warning the user
would characterize the tool as spyware.
The main advantages of the presented model in
comparison with the approaches presented in section
2 are: modularization and configurability of all
components, browser and plug-in independence, and
compaction, which was not mentioned in the cited works.
Improvements may be obtained through different
data compression techniques and plug-in-independent
transmission approaches that allow the inclusion of
security mechanisms.
ACKNOWLEDGEMENTS
Proesp/CAPES and FAPESP.
REFERENCES
Alur, D., Crupi, J. and Malks, D., 2003. Core J2EE
Patterns: Best Practices and Design Strategies. 2nd
Edition. Prentice Hall PTR.
Arroyo, E., Selker, T. and Wei, W., 2006. Usability
Tool for Analysis of Web Designs Using Mouse
Tracks. Work-in-Progress. In: Proc. of CHI 2006.
Basham, B., Sierra, K. and Bates, B., 2004. Head First
Servlets and JSP: Passing the Sun Certified Web
Component Developer Exam (SCWCD). O’Reilly.
Crockford, D., 2006a. JSONRequest. Available at:
http://json.org/JSONRequest.html
Crockford, D., 2006b. The <module> Tag. Available at:
http://www.json.org/module.html
Dojo Toolkit, 2006. Cross Domain XMLHttpRequest
using an IFrame Proxy. Available at:
http://dojotoolkit.org/node/87
Etgen, M. and Cantor, J., 1999. What does getting wet
(web event-logging tool) mean for web usability? In:
Proc. of 5th Conf. on Human Factors & the Web.
Gamma, E., Helm, R., Johnson, R. and Vlissides, J., 1995.
Design Patterns: Elements of Reusable
Object-Oriented Software. Reading: Addison Wesley.
Hand, D., Mannila, H. and Smyth, P., 2001. Principles of
Data Mining. MIT Press.
Hilbert, D.M. and Redmiles, D.F., 2000. Extracting
usability information from user interface events. ACM
Comput. Surv. 32(4). pp. 384–421.
Levitt, J., 2005a. Fixing AJAX: XMLHttpRequest
Considered Harmful. Available at:
http://www.xml.com/pub/a/2005/11/09/fixing-ajax-xmlhttprequest-considered-harmful.html
Levitt, J., 2005b. JSON and the Dynamic Script Tag:
Easy, XML-less Web Services for JavaScript.
Available at:
http://www.xml.com/pub/a/2005/12/21/json-dynamic-script-tag.html
Levitt, J., 2006. Flash to the Rescue. Available at:
http://www.xml.com/pub/a/2006/06/28/flashxmlhttprequest-proxy-to-the-rescue.html
Netscape Communications Corporation, 1999. Client-Side
JavaScript Reference.
Paganelli, L. and Paternò, F., 2002. Intelligent analysis of
user interactions with web applications. In: IUI ’02:
Proc. of the 7th Int. Conf. on Intelligent User
Interfaces, ACM. pp. 111–118.
Ruderman, J., 2001. The Same Origin Policy. Available at:
http://www.mozilla.org/projects/security/components/same-origin.html
Ruderman, J., 2007. Signed Scripts in Mozilla. Available at:
http://www.mozilla.org/projects/security/components/signed-scripts.html
Santana, V.F. de and Baranauskas, M.C.C., 2008. A
Prospect of Websites Evaluation Tools Based on
Event Logs. In: IFIP, Volume 272; Human-Computer
Interaction Symposium; Springer. pp. 99–104.
Skeen, D., 2007. Eye-Tracking Device Lets Billboards
Know When You Look at Them. Available at:
http://www.wired.com/gadgets/miscellaneous/news/2007/06/eyetracking.
Woo, D. and Mori, J., 2004. Accessibility: A tool for
usability evaluation. In: APCHI. Volume 3101 of
LNCS, Springer. pp. 531–539.
ICEIS 2009 - International Conference on Enterprise Information Systems