“don’t know” responses more often than young
people. Older people are said to be very cautious and
do not draw on contextual information (preceding
questions), unlike young and middle-aged people. In
spite of these efforts, very little research has been
done on Likert scales. Nevertheless, they are
frequently used in standard usability questionnaires
(e.g., QUIS), and exploring whether they need to be
adapted is therefore worthwhile.
2 OVERVIEW OF THE STUDY
This study was carried out during an ICT course at
La Verneda, an adult school in Barcelona (Spain).
Elderly people running ICT courses asked us to
organize an MS Word session. They often use MS
Word in their teaching activities at the school.
Through contextual interviews carried out at the
school, we found out that they use MS Word mostly
to deal with lists of participants (students) and create
user manuals. Hence, we aimed to focus the MS
Word session on these aspects. The following word-
processing functionalities were evaluated after a
two-hour hands-on session where the participants
were asked to create a user manual about “how to
copy a text from a web page to a Word document”, a
frequent task carried out in the courses at La
Verneda: (i) creating a table of contents; (ii) saving a
Word document as an HTML file; (iii) adding
headings and footnotes; (iv) inserting a cross-
reference and (v) sorting a list of students in
alphabetical order.
Five elderly men, aged 64 to 75 and
experienced with computers, participated in
the study. Although this number of users is too
small to draw significant conclusions, five users has
been suggested as a baseline (Nielsen, 1993) for
identifying potential usability problems, which
will need to be validated with more users. The
participants had been organizing and running ICT
courses for the older population at La Verneda for
more than 5 years. In addition, they used a wide
range of computer applications such as e-mail and
MS Word on a daily basis.
An evaluation questionnaire was designed to
elicit feedback from the participants on the usability
of the word-processing functionalities. The
questionnaire was divided into five sections (one
section per scenario or task). The qualitative analysis is
based on observation notes taken during the session.
3 ADAPTING STANDARD
LIKERT SCALES TO THE
YOUNG ELDERLY
In order to identify errors in the formulation of
questions and responses (i.e. standard Likert scales)
in the questionnaire, a pilot evaluation inspired by
cognitive interviewing laboratory techniques was
carried out. Seven users participated in this test: five elderly
adults (2 women; 3 men) and two middle-aged people
(1 woman; 1 man). None of them took part in the
usability evaluation. All the participants were asked
to fill in the pilot questionnaire individually and
paraphrase (i.e., to put in other words) those
questions and responses which were difficult for
them to understand. Afterwards, they were
interviewed individually in order to gather their
feedback (e.g., was there any question you found
difficult to understand?) and to probe their
understanding of both questions and responses (e.g.,
what do you think this question means?).
The results showed that the elderly
participants had difficulties understanding the
meaning of standard Likert scales. Although we cannot
yet fully explain why, they associated 1
with “the best” and 5 with “the worst” regardless
of the adjectives used in the Likert scales. In the
interviews they explained to us that they rely heavily
on life experience in order to answer questionnaires.
However, everyday scales and Likert scales differ in
several ways. In everyday scales, such as the
‘Spanish National football league’ or ‘the top ten
richest women’, the elements are usually arranged
vertically. In addition, the top elements are usually
regarded as the best or the most. These two factors
turned out to be key differences compared with
standard Likert scales.
In order to test the impact of these differences on
the validity of the responses elicited, a second pilot
questionnaire was designed and evaluated with the
same users. The questions and responses were the
same in both questionnaires, but the responses were
arranged vertically in the second. This had a strong
effect on the validity of the results: different
responses were elicited from the elderly users. For
the same question, they selected the number “1” in
the first questionnaire (horizontal scale) and the
number “5” in the second (vertical scale).
Nevertheless, our post-interviews revealed that they
intended to give the same answer in both
questionnaires.
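The reversal described above can be illustrated with a minimal Python sketch that compares a participant's answer to the same question on the horizontal and the vertical layout and flags pairs that are mirror images of each other (1 and 5, 2 and 4). The function names and the example values are hypothetical and are not data or tooling from the study; a five-point scale is assumed.

```python
# Illustrative sketch only: names and values are hypothetical,
# not taken from the study's questionnaires.
SCALE_MAX = 5  # assumed five-point Likert scale (1..5)


def mirrored(response: int, scale_max: int = SCALE_MAX) -> int:
    """Reflect a response about the scale midpoint (1 <-> 5, 2 <-> 4)."""
    return scale_max + 1 - response


def classify(horizontal: int, vertical: int, scale_max: int = SCALE_MAX) -> str:
    """Compare answers to one question given on the two layouts."""
    if horizontal == vertical:
        # Same number on both layouts (includes the midpoint, which is
        # its own mirror image).
        return "consistent"
    if vertical == mirrored(horizontal, scale_max):
        # Mirror-image answers, e.g. "1" horizontally and "5" vertically.
        return "reversed"
    return "different"


if __name__ == "__main__":
    # Hypothetical (horizontal, vertical) answer pairs for one participant.
    for pair in [(1, 5), (4, 4), (2, 5)]:
        print(pair, classify(*pair))
```

A pair flagged as "reversed" only suggests that the layout may have changed how the numbers were read; as in the pilot, an interview is still needed to confirm what the participant intended.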
To overcome these validity concerns,
we had two alternatives: (i) to replace horizontal
with vertical scales; (ii) to replace numbers with
adjectives. Although both alternatives were brought
up by our users during the post-interviews, all of