research. An even more recent tool is photographic surveillance of the task
performer’s environment through a wearable camera carried by the study subjects. In
addition to analysing traditional means of data collection, the present study analyses a
modern logging system, the PLogger, and a wearable camera, the SenseCam, as tools
for data collection for studies on task-based information access in molecular
medicine.
The PLogger, developed at the University of Tampere, Finland [5], is a tool that
logs a Web browser's search history on an external database server. It was developed
as a research tool to collect history logs as a time-ordered list of URLs. It
allows the subjects to edit their logs before submitting them to the researcher.
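As an illustration only, a history log of the kind described above can be sketched as follows. The record layout, the names LogEntry, time_ordered and remove_entries, and the example URLs are assumptions made for this sketch and are not taken from the PLogger itself. Editing is modelled here simply as the subject deleting entries before submission.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogEntry:
    """One visited page (hypothetical layout): visit time and URL."""
    timestamp: datetime
    url: str


def time_ordered(entries):
    """Return the entries sorted from earliest to latest visit."""
    return sorted(entries, key=lambda e: e.timestamp)


def remove_entries(entries, unwanted_urls):
    """Drop entries the subject does not want to share before submission."""
    unwanted = set(unwanted_urls)
    return [e for e in entries if e.url not in unwanted]


# Example with made-up visits, recorded out of order.
log = time_ordered([
    LogEntry(datetime(2007, 6, 1, 9, 15), "https://example.org/search?q=gene"),
    LogEntry(datetime(2007, 6, 1, 9, 5), "https://example.org/"),
    LogEntry(datetime(2007, 6, 1, 9, 30), "https://example.net/private"),
])

# The subject removes one entry before the log goes to the researcher.
submitted = remove_entries(log, ["https://example.net/private"])
```

The sketch keeps the full log and the submitted log separate, which mirrors the point that the researcher only sees what the subject chooses to hand over.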
The SenseCam, developed by Microsoft Research in Cambridge, U.K. [3], is a small
wearable camera that is worn around the neck. This camera passively captures images
from the perspective of the user. Images are taken frequently (approximately 3
per minute), so an extensive visual diary of one’s day is recorded. In a typical day a
user will capture 2,000 images.
Employed together, these data collection methods yield a large volume of data for
analysis and also support triangulation to achieve a comprehensive understanding of
information access in molecular medicine. The present paper examines the strengths
and weaknesses of the traditional and the more novel methods of data collection and
addresses the challenges in their application in the study of task-based information
access. In particular, we focus on how observation, logging and the SenseCam affect
the subjects’ behavior and support triangulation for greater reliability. We draw our
data collection experiences mainly from an empirical study on information access in
molecular medicine [6]. A part of this study was based on Web questionnaires,
interviews, shadowing, logging and use of the SenseCam at an anonymous research
institute in Finland. The following treatise reflects the lessons learned in applying the
data collection methods and seeks to contribute to an understanding of the need for
multiple methods and of the challenges involved in applying them.
The paper is structured as follows. Sections 2-6 discuss the data collection methods
individually and Section 7 discusses triangulation. Conclusions are given in Section 8.
2 Web Questionnaires
To start the empirical study, we distributed a Web questionnaire in summer 2007 to
the study subjects to find out about their information environment,
the kinds of tasks they conducted, their publication plans, and the difficulties they
experienced in information access. The questionnaire was designed in co-operation with
a senior group leader working at the target institute. An invitation and motivation
letter was sent by e-mail to the researchers of the institute, followed by three
reminders and then by personal e-mails to some researchers. As is often the case with
Web questionnaires, it was challenging to raise the response rate above 50%.
In general, we may echo the known benefits of questionnaires, which are (a)
reaching large groups of people quickly and simultaneously, (b) ease and economy of
collecting the data, (c) obtaining data that is structured and easy to handle, and (d)
support for statistical analysis. However, we also experienced the known challenges
of questionnaires, including (a) low response rates due to lack of motivation, (b)