between the category itself and the
characteristics associated with that category.
This is to reinforce the idea that these
characteristics are correctly associated
with their category.
o Third: The set of characteristics is used to
create homogeneous groups of
characteristics (called factors).
Each group is formed of those
characteristics that are strongly
correlated with each other. We
also attempt to ensure that each group is
independent of the others.
o Finally: User groups (called
clusters) are created and related to the
factors. Each user group is formed of
different types of users, according to the
demographic aspects. The DQ characteristics
which are most important to different types
of users are therefore grouped together.
• Guidelines containing recommendations for Web
portal designers and developers will be created,
which will allow them to discover user
preferences.
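The third step of the Static approach (grouping strongly correlated characteristics) can be illustrated with a minimal sketch. This is not the factor analysis used in the study: it is a simplified, greedy stand-in that places a characteristic in a group only if its Pearson correlation with every member of that group reaches a threshold. The function names, the threshold of 0.5 and the placeholder characteristics "A", "B" and "C" with their scores are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no correlation information
    return cov / (spread_x * spread_y)


def group_correlated(scores, threshold=0.5):
    """Greedily group characteristics whose pairwise |r| >= threshold.

    scores maps a characteristic name to its list of Likert answers.
    The threshold is a hypothetical value, not one taken from the study.
    """
    groups = []
    for name in scores:
        for group in groups:
            if all(abs(pearson(scores[name], scores[m])) >= threshold
                   for m in group):
                group.append(name)
                break
        else:
            groups.append([name])  # no compatible group found: start a new one
    return groups


# Invented answers for three placeholder characteristics.
example = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 3, 4, 5, 5],
    "C": [5, 1, 4, 2, 3],
}
print(group_correlated(example))
```

In this invented data set "A" and "B" move together while "C" does not, so the sketch returns two groups. A real factor analysis would additionally estimate loadings and check that the resulting factors are independent of one another.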
The process used in the case of the Dynamic
approach will be as follows:
• A set of measures will be defined for each
characteristic.
• The level of quality desired for each characteristic
will be determined.
• Recommendations by which to improve the level
of data quality in the Web portal will be
generated for each DQ characteristic.
• The measures will be automated through the
creation of a computerised tool.
• The tool will be made available for direct use by
any interested parties in specific Web portals.
In this paper we shall show the application of the
Static approach to the Intrinsic DQ category.
4 THE INTRINSIC DQ IN WEB
PORTALS
In this section we shall apply the Static approach to the
Intrinsic DQ category (see Figure 1). The questionnaire
used in this survey comprised 20 questions:
16 related to the DQ characteristics in the
category (including a question concerning the
definition of the term ‘Intrinsic’) and 4 related to
demographic aspects. We used closed questions, for
which only one response was possible.
The 16 questions were created using a 5-point
Likert scale, where 1 is “Not at all important” and 5
is “Very important”.
The questions related to demographic aspects, on
the other hand, allowed us to define the different
types of users. These types of users were defined
according to their gender, age range, level of studies
and knowledge of computing.
The questionnaire was of the unsupervised type,
and was distributed to a heterogeneous group of 150
Web portal users. The results of the questionnaires
are analysed in the following section.
5 ANALYSIS OF THE RESULTS
This section presents the analysis of the results obtained
from the questionnaires. Of the 137
questionnaires received, the data from 136 were
analysed, since the questions relating to the
demographic aspects had not been answered in one
of them. This study was carried out using the
SPSS statistical analysis tool.
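The screening step described above can be sketched as follows. The record layout, the field names and the example answers are assumptions made for illustration. The check on the Likert answers is also an added assumption: the only exclusion reported here concerns unanswered demographic questions.

```python
from dataclasses import dataclass, field

# The four demographic aspects named in Section 4 (field names are hypothetical).
DEMOGRAPHIC_FIELDS = ("gender", "age_range", "level_of_studies",
                      "computing_knowledge")
LIKERT_ITEMS = 16  # questions answered on the 5-point scale


@dataclass
class Questionnaire:
    likert: list                                   # 16 answers, each 1..5
    demographics: dict = field(default_factory=dict)


def is_usable(q):
    """A questionnaire is kept only if every question has a valid answer."""
    likert_ok = (len(q.likert) == LIKERT_ITEMS
                 and all(a in (1, 2, 3, 4, 5) for a in q.likert))
    demo_ok = all(q.demographics.get(name) for name in DEMOGRAPHIC_FIELDS)
    return likert_ok and demo_ok


def usable(questionnaires):
    """Return the questionnaires that can enter the analysis."""
    return [q for q in questionnaires if is_usable(q)]


# Invented example: one complete questionnaire, one without demographics.
complete = Questionnaire([4] * 16, {"gender": "F", "age_range": "18-25",
                                    "level_of_studies": "university",
                                    "computing_knowledge": "high"})
incomplete = Questionnaire([4] * 16, {})
print(len(usable([complete, incomplete])))
```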
The objective of our study is, on the one hand, to
determine whether all the DQ characteristics in the
Intrinsic DQ category are important for Web portal
users and, on the other, to analyse whether some
characteristics are more relevant than others
according to the different types of users.
As a starting point, it was necessary to estimate
the reliability of the results. This was done by
calculating Cronbach’s alpha. The result
obtained was a value of 0.856, which
indicates that the results have good internal
consistency. The information is therefore reliable.
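For reference, Cronbach's alpha for k items is k/(k-1) multiplied by one minus the ratio between the sum of the item variances and the variance of the respondents' total scores. The sketch below computes it with the Python standard library. The function name and the small example data set are assumptions for illustration only, and the study itself used SPSS.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents' answers.

    responses: one row per respondent, each row holding k item scores.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*responses))                 # one column per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Invented answers from four respondents to two items.
example = [[4, 5], [3, 3], [5, 4], [2, 3]]
print(round(cronbach_alpha(example), 3))
```

Values close to 1 indicate that the items vary together, which is the sense in which the value of 0.856 reported above is read as good internal consistency.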
A descriptive statistical analysis was then carried
out in order to determine whether all the DQ
characteristics are important to Web portal users.
This analysis allowed us to obtain the central tendency
(mean) and the dispersion (standard deviation) for all the
variables in the study. The variables correspond to
the characteristics that are included in the Intrinsic
category, along with the Intrinsic DQ category itself.
As a result, we observed that the mean value
of all the characteristics is approximately four. In
addition, we found that the characteristics
Credibility, Accessibility, Reputation, Consistency
and Currentness have higher mean values. The only
characteristic that was below the mean value of four
was that of Traceability. Although the value for this
characteristic was below four, it had a mean of 3.96,
which is very close to four, and it is therefore also
considered to be important.
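The descriptive analysis described above amounts to computing a mean and a standard deviation per characteristic and comparing each mean with the value of four. A minimal sketch follows. The function names and the scores are invented for illustration and do not reproduce the survey data. The sample standard deviation (n - 1 denominator) is assumed.

```python
from statistics import mean, stdev


def describe(scores):
    """Map each characteristic to (mean, sample standard deviation)."""
    return {name: (mean(values), stdev(values))
            for name, values in scores.items()}


def below(summary, threshold=4.0):
    """Names of the characteristics whose mean falls below the threshold."""
    return [name for name, (avg, _) in summary.items() if avg < threshold]


# Invented Likert answers; NOT the values obtained in the survey.
sample = {
    "Credibility": [5, 4, 5, 4],
    "Traceability": [4, 4, 3, 4],
}
summary = describe(sample)
for name, (avg, sd) in summary.items():
    print(f"{name}: mean={avg:.2f}, sd={sd:.2f}")
print(below(summary))
```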