To keep the effort for the study within a feasible frame, we had to limit several aspects of the evaluation that would otherwise have been addressed more comprehensively: (1) The study used only two websites as evaluation samples. (2) Only one web page was taken as a sample for each website; for an evaluation that leads to official certification, the evaluator has to select a representative sample of the website. (3) Both websites were evaluated by only one evaluator. (4) Furthermore, both websites were information-oriented, so the results could differ for evaluations of complex web applications.
It should also be noted that the evaluator was an employee of T-Systems, the company that owns the BITV-Audit, which could have led to a bias towards this method. On the other hand, the person acting as QA for the BIK BITV-Test was an employee of HdM.
This study primarily took the perspective of the evaluators and of the organizations conducting the evaluations. It could be argued that incorporating the client's perspective to a greater extent would make the results more comprehensive.
7 CONCLUSION
In this paper, a systematic comparison of two exist-
ing manual evaluation methods was conducted us-
ing real-world data involving accessibility experts and
two exemplary websites. For this purpose, we cre-
ated a generic catalog of comparison criteria based
on the expertise of various accessibility experts. On
the basis of this catalog, we compared two evaluation
methods in terms of their suitability and effectiveness: The BIK BITV-Test, as one of the best-known conformance-based evaluation methods in Germany, and the BITV-Audit of T-Systems MMS, as an example of an empiric-based evaluation method.
In this comparison, the BITV-Audit performs better than the BIK BITV-Test based on the defined catalog and the specific weightings determined by the accessibility experts involved in the study. However, it should be noted that no universally valid weighting can be defined for all possible situations. Therefore, if necessary, our weights can be replaced by individual weights to recalculate the comparison of both methods on a case-by-case basis, as sketched below. Thus, this paper can assist in deciding which evaluation method is more appropriate in a particular situation. Additionally, the discussed strengths and weaknesses of each method can inform that decision.
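To make this case-by-case recalculation concrete, the following minimal Python sketch recomputes the comparison as a weight-normalized sum over the catalog criteria; the criterion names, ratings, and weights are illustrative placeholders, not the values used in our study.

    # Minimal sketch (not study data): recompute the method comparison
    # with individual, case-specific weights over the catalog criteria.
    def weighted_score(ratings, weights):
        """Return the weight-normalized score of one evaluation method."""
        total_weight = sum(weights[criterion] for criterion in ratings)
        weighted_sum = sum(weights[criterion] * ratings[criterion] for criterion in ratings)
        return weighted_sum / total_weight

    # Hypothetical per-criterion ratings (e.g., on a 0-4 scale) for both methods.
    ratings_bik_bitv_test = {"tool_support": 3, "simplicity_of_procedure": 3, "licensing_conditions": 4}
    ratings_bitv_audit    = {"tool_support": 3, "simplicity_of_procedure": 3, "licensing_conditions": 2}

    # Individual weights chosen by the reader instead of the study's weights.
    individual_weights = {"tool_support": 2.0, "simplicity_of_procedure": 1.0, "licensing_conditions": 1.5}

    print("BIK BITV-Test:", weighted_score(ratings_bik_bitv_test, individual_weights))
    print("BITV-Audit:   ", weighted_score(ratings_bitv_audit, individual_weights))

Normalizing by the total weight keeps the resulting scores comparable across different weight choices.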
Also, the results show the following major sim-
ilarities between the two evaluation methods: Both
fully cover the WCAG 2.1 success criteria of confor-
mance level A and AA. Furthermore, they have sim-
ilar values in the criteria of tool support and simplic-
ity of the evaluation procedure. Strong differences
are found in the following areas: Coverage of addi-
tional criteria from various standards (EN 301 549,
WCAG 2.1 AAA, BITV 2.0; in favor of BITV-Audit),
the scope of optional input formats (in favor of BITV-
Audit), publicity of the evaluation procedure (in favor
of BIK BITV-Test), licensing conditions (in favor of
BIK BITV-Test), and organizational requirements in
order to gain permission to use the evaluation method
(in favor of BIK BITV-Test).
Moreover, we observed that WCAG 2.1 does not address all usability problems that were identified through the expert assessments.
In this work, we carefully defined a catalog of 22 criteria, based on common standards and developed together with experts. Nevertheless, it is conceivable that further criteria and metrics may become highly relevant in the future. The results of this work can serve as a basis for possible future extensions in this area.