blind to anomalies in data, or wary of intuition, which nevertheless may have human decision makers consider clues that the algorithm could not include (Cukier and Mayer-Schönberger, 2013).
Our experiment is based on the conjecture that visual metaphors make the interpretation of the message more intuitive, if not more accurate, and that intuition can be more important than accuracy in many instances of naturalistic decision making (Klein, 2008).
In particular, we focused on three metaphors, or ef-
fects, with which to create a VV: blur, noise and trans-
parency, defined as follows:
1. Blur: the effect that makes contours less defined, creating an out-of-focus appearance;
2. Noise: the random substitution of image pixels with blank pixels;
3. Transparency: the overlapping of the image with the background color.
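As a rough illustration, the noise and transparency effects can be sketched on a flat list of RGB pixels. This is a minimal Python sketch under our own assumptions (function names, a white blank/background color, a fixed seed); it is not the authors' implementation, and the blur effect, which would additionally require a neighborhood convolution, is omitted:

```python
import random

def add_noise(pixels, purity, blank=(255, 255, 255), seed=0):
    """Randomly replace a fraction (1 - purity/100) of pixels with a blank pixel."""
    rng = random.Random(seed)
    return [blank if rng.random() >= purity / 100 else p for p in pixels]

def add_transparency(pixels, purity, background=(255, 255, 255)):
    """Blend each pixel toward the background color by (1 - purity/100)."""
    a = purity / 100
    return [tuple(round(a * c + (1 - a) * b) for c, b in zip(p, background))
            for p in pixels]
```

In both cases a purity of 100 returns the original pixels unchanged, while a purity of 0 yields a fully blank (respectively, fully background-colored) image.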
3.2 User Test Design
To test our research questions, we developed a simple Web-based tool to create VVs.[3] This tool accepts any raster picture as input, together with a probability value (in percent terms) as a parameter; as output, it yields the same picture affected by one of the above image effects in proportion to the percentage indicated: 100% was associated with the original image (no effect: 100% purity[4]), while 0% was associated with a highly distorting effect (see Figure 2) and maximum uncertainty. In doing so, we generated a set of six VVs, corresponding to different effect percentages, namely 10, 25, 40, 60, 75 and 90, that is, respectively: almost nil, first quartile, close to 50% (from below and from above), third quartile, and almost 100%.
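The purity-to-distortion mapping described above can be made explicit as follows (a minimal sketch; the function and variable names are our own, not part of the tool):

```python
PURITY_LEVELS = [10, 25, 40, 60, 75, 90]  # the six levels used in the study

def effect_strength(purity):
    """Map purity in [0, 100] to distortion strength in [0, 1]:
    100% purity -> no effect; 0% purity -> maximal distortion."""
    assert 0 <= purity <= 100
    return 1 - purity / 100

# Distortion strength applied for each of the six generated VVs
strengths = {p: effect_strength(p) for p in PURITY_LEVELS}
```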
We then developed an online questionnaire that could display the above VVs to a number of respondents, mostly bachelor students and acquaintances whom we invited during class and by email. Respondents participated voluntarily, with no incentives. In this questionnaire, participants were invited to associate each of six different VVs (for each image effect, 2 VVs randomly chosen from those generated above) with a probability value, in two tasks of increasing difficulty. For both tasks, we showed a three-VV reference set that indicated a 0%, 50% and 100% value, respectively.

In the first task, the respondents were invited to select whether the VV, with respect to the reference set, represented a value either clearly higher, perhaps higher, perhaps lower or clearly lower than 50% (i.e., the threshold for purely random decisions). We called this the relative accuracy (RA) task (in that it regards accuracy with respect to the random threshold).

In the second task, the respondents were invited to indicate the exact underlying probability value that the VV was expressing, by means of a slider ranging from 0 to 100. We denoted this second task as the absolute accuracy (AA) task.

[3] This simple tool, and its code, are available at https://github.com/PinkLaura/pixel-e-percentuali and https://pinklaura.github.io/pixel-e-percentuali/, respectively.

[4] In the pilot test we observed that respondents found it more natural to cope with the concept of image purity than with the (complementary) concept of image fuzziness (as a proxy for uncertainty). Thus, we decided that the purer an image is perceived to be, the lower the associated level of uncertainty, and the higher the probability or risk score that the user should try to guess by looking at the image.
Thus, each respondent had to perform two RA tasks and two AA tasks for each effect, for a total of 12 tasks. In particular, for the RA tasks, we defined two measures of accuracy (or VV effectiveness): the rate of adequately accurate responses (adequate accuracy) and the rate of approximately accurate responses (approximate accuracy). The former was defined differently for the different percent values: the ratio between the number of respondents who answered clearly higher, perhaps higher, perhaps lower and clearly lower for, respectively, the 90%-, 60%-, 40%- and 10%-VV, and the total number of responses; for the 75%-VV (and 25%-VV), any higher-than-50% (respectively, lower-than-50%) answer counted as adequately accurate. For these two latter VVs we did not define the approximate accuracy, which was defined for the 90%- and 60%-VV (and 40%- and 10%-VV) in terms of the number of higher-than-50% (respectively, lower-than-50%) responses.
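Under our reading of these definitions, the two RA accuracy rates could be computed as follows (a hypothetical sketch; the names, response labels, and data layout are our own, not the study's actual analysis code):

```python
# Adequate accuracy: the responses counted as adequately accurate per VV level
# (for 75% and 25%, any response on the correct side of 50% counts).
ADEQUATE = {
    90: {"clearly higher"},
    60: {"perhaps higher"},
    40: {"perhaps lower"},
    10: {"clearly lower"},
    75: {"clearly higher", "perhaps higher"},  # any higher-than-50% answer
    25: {"clearly lower", "perhaps lower"},    # any lower-than-50% answer
}
# Approximate accuracy: any response on the correct side of 50%;
# undefined for the 75%- and 25%-VVs.
APPROXIMATE = {
    90: {"clearly higher", "perhaps higher"},
    60: {"clearly higher", "perhaps higher"},
    40: {"clearly lower", "perhaps lower"},
    10: {"clearly lower", "perhaps lower"},
}

def accuracy_rate(level, responses, table):
    """Fraction of responses counted as accurate for a given VV level."""
    correct = table.get(level)
    if correct is None:
        return None  # e.g. approximate accuracy for the 75%-/25%-VV
    return sum(r in correct for r in responses) / len(responses)
```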
We expected two possible sources of bias that could affect our analysis: the order of the questions (i.e., order bias) and the percentage values shown (i.e., sampling bias). To mitigate the former kind of bias, the online questionnaire was implemented to present the 3 different effects to the respondents in random order.
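This per-respondent randomization can be sketched as follows (hypothetical; the actual implementation was part of the online questionnaire and is not shown in the paper):

```python
import random

EFFECTS = ["blur", "noise", "transparency"]

def effect_order(respondent_seed):
    """Return a per-respondent random ordering of the three effects."""
    order = EFFECTS.copy()
    random.Random(respondent_seed).shuffle(order)
    return order
```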
4 RESULTS
More than 100 respondents participated in the user
study, in the age range 19-30. Since the sample en-
compassed only bachelor or master students aged be-
tween 19 and 30, we did not stratify the respondents
on the basis of age or education level. We decided to remove from the sample the respondents who did not complete the questionnaire, considering this as evidence of low commitment in task execution. The size
of the final sample, after cleaning it from partial an-
IVAPP 2020 - 11th International Conference on Information Visualization Theory and Applications