
 
“don’t know” responses more often than young 
people. Older people are said to be very cautious and 
not to draw on contextual information (preceding 
questions), unlike young and middle-aged people. In 
spite of these efforts, very little research has been 
done on Likert scales. Nevertheless, they are 
frequently used in standard usability questionnaires 
(e.g., QUIS), and exploring whether they need to be 
adapted is therefore worthwhile. 
2  OVERVIEW OF THE STUDY 
This study was carried out during an ICT course at 
La Verneda, an adult school in Barcelona (Spain). 
Elderly people running ICT courses asked us to 
organize an MS Word session. They often use MS 
Word in their teaching activities at the school. 
Through contextual interviews carried out at the 
school, we found out that they use MS Word mostly 
to deal with lists of participants (students) and create 
user manuals. Hence, we aimed to focus the MS 
Word session on these aspects. The following word-
processing functionalities were evaluated after a 
two-hour hands-on session where the participants 
were asked to create a user manual about “how to 
copy a text from a web page to a Word document”, a 
frequent task carried out in the courses at La 
Verneda: (i) creating a table of contents; (ii) saving a 
word document as an HTML file; (iii) adding 
headings and footnotes; (iv) inserting a cross-
reference and (v) sorting a list of students in 
alphabetical order. 
Five elderly men, aged 64 to 75 and experienced 
with computers, participated in the study. Although 
this number of users is too small to draw 
significant conclusions, five users has been 
suggested as a baseline (Nielsen, 1993) for 
identifying potential usability problems, which 
will need to be validated with more users. The 
participants had been organizing and running ICT 
courses for the older population at La Verneda for 
more than 5 years. In addition, they used a wide 
range of computer applications such as e-mail and 
MS Word on a daily basis. 
An evaluation questionnaire was designed to 
elicit feedback from the participants on the usability 
of the word-processing functionalities. The 
questionnaire was divided into five sections (one 
section per scenario or task). The qualitative analysis is 
based on observation notes taken during the session. 
3  ADAPTING STANDARD LIKERT SCALES TO THE YOUNG ELDERLY 
In order to identify errors in the formulation of 
questions and responses (i.e. standard Likert scales) 
in the questionnaire, a pilot evaluation inspired by 
cognitive interviewing laboratory techniques was 
carried out. Seven users participated in this test: five elderly 
adults (2 women; 3 men) and two middle-aged people 
(1 woman; 1 man). None of them took part in the 
usability evaluation. All the participants were asked 
to fill in the pilot questionnaire individually and 
paraphrase (i.e., put in their own words) those 
questions and responses which were difficult for 
them to understand. Afterwards, they were 
interviewed individually in order to gather their 
feedback (e.g., “Was there any question you found 
difficult to understand?”) and to probe their 
understanding of both questions and responses (e.g., 
“What do you think this question means?”). 
The results indicated that the elderly 
participants had difficulties understanding the 
meaning of standard Likert scales. Although we cannot 
yet fully explain why, they associated 1 
with “the best” and 5 with “the worst” independently 
of the adjectives used in the Likert scales. In the 
interviews they explained to us that they rely heavily 
on life experience in order to answer questionnaires. 
Nevertheless, everyday scales and Likert scales differ in 
several ways. In everyday scales, such as the 
‘Spanish National football league’ or ‘the top ten 
richest women’, the elements are usually arranged 
vertically. In addition, the top elements are usually 
regarded as the best or the most. These two factors 
turned out to be key differences as compared to 
standard Likert scales. 
In order to test the impact of these differences on 
the validity of the responses elicited, a second pilot 
questionnaire was designed and evaluated with the 
same users. The questions and responses were the 
same in both questionnaires, but the responses were 
arranged vertically in the second. This had a strong 
effect on the validity of the results: different 
responses were elicited from the elderly users. For 
the same question, they selected the number “1” in 
the first questionnaire (horizontal scale) and the 
number “5” in the second (vertical scale). 
However, our post-interviews revealed that they 
intended to give the same answer in both 
questionnaires. 
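The reversal described above (a “1” on the horizontal scale becoming a “5” on the vertical one) corresponds to mirroring a response around the midpoint of a five-point scale. The following is a minimal sketch of how paired responses from two questionnaires could be screened for such inversions. It is not the procedure used in this study: the function names, the labels returned, and the values in the usage note are hypothetical and chosen for illustration only.

```python
def reverse_code(response, scale_min=1, scale_max=5):
    """Mirror a response around the scale midpoint (1 <-> 5, 2 <-> 4, 3 <-> 3)."""
    if not scale_min <= response <= scale_max:
        raise ValueError(f"response {response} is outside {scale_min}..{scale_max}")
    return scale_min + scale_max - response


def classify_pair(horizontal, vertical, scale_min=1, scale_max=5):
    """Compare one participant's answers to the same question in both layouts.

    Returns a hypothetical label:
      "consistent" - same number in both layouts (includes the midpoint,
                     where an inversion cannot be detected)
      "inverted"   - the vertical answer is the mirror image of the horizontal one
      "different"  - the answers differ, but not by a simple inversion
    """
    mirrored = reverse_code(horizontal, scale_min, scale_max)
    # Validate the second answer as well; reverse_code raises if it is out of range.
    reverse_code(vertical, scale_min, scale_max)
    if horizontal == vertical:
        return "consistent"
    if mirrored == vertical:
        return "inverted"
    return "different"
```

For example, `classify_pair(1, 5)` flags the pattern reported above as an inversion, whereas `classify_pair(4, 4)` is treated as consistent. A flagged pair only suggests that the participant may have read the two layouts with opposite polarity; as in the study, follow-up interviews would still be needed to confirm the intended answer.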
To overcome these validity concerns, 
we had two alternatives: (i) replacing horizontal 
with vertical scales; (ii) replacing numbers with 
adjectives. Although both alternatives were brought 
up by our users during the post-interviews, all of 
AN INITIAL USABILITY EVALUATION OF SOME WORD-PROCESSING FUNCTIONALITIES WITH THE
ELDERLY